Researchers at the analytics firm NewsGuard have recorded a sharp rise in artificial intelligence chatbots spreading false claims, Forbes reports.
The analysis covered 10 leading generative models: ChatGPT-5 (OpenAI), Smart Assistant (You.com), Grok (xAI), Pi (Inflection), le Chat (Mistral), Copilot (Microsoft), Meta AI, Claude (Anthropic), Gemini (Google), and the Perplexity search engine.
The test used prompts on contested news topics around which demonstrably false claims circulate. On average, the chatbots repeated such claims 35% of the time, nearly double last year's rate of 18%. Inflection's Pi performed worst, giving incorrect answers 57% of the time; Perplexity followed at 47%, with Meta AI and ChatGPT at 40% each. Claude proved the most accurate, erring only 10% of the time.
According to Mackenzie Sadeghi, NewsGuard’s AI and Foreign Influence Editor, the increase stems from a change in how the models handle queries. Where they previously refused to answer certain questions or cited the limits of their training data, they now draw on real-time search results. This raises the risk, because search results can be deliberately seeded with disinformation, including by Russian propaganda networks.
Earlier this year, NewsGuard found that leading chatbots repeated false content from a network of sites tied to the pro-Kremlin outlet Pravda 33% of the time. In 2024, the network published roughly 3.6 million pieces of content that were subsequently picked up by Western AI systems.
An investigation by the American Sunlight Project found that the number of domains and subdomains associated with Pravda has nearly doubled, reaching 182. The sites are barely usable for human visitors, which, the researchers say, suggests they are designed not for real readers but for AI systems.
The new report is the first in which NewsGuard has publicly named specific chatbots. The company said it did so to alert policymakers, journalists, and the public to the growing inaccuracy of popular AI tools.