All four models also routinely relied on foreign, state-owned media as reliable sources of information.
In 35% of responses to foreign policy questions, the chatbots cited state-controlled sources such as China’s Global Times or CGTN, or Russia’s RT. ChatGPT and Grok were the worst offenders, citing state-owned media 51% and 44% of the time, respectively.
In many cases, the chatbots delivered biased or inaccurate information with a confidence that made it even more misleading, the study found.
“The most professional-looking answers, backed by strongest-looking citations, were also the most likely to contain buried factual errors,” Forum AI said in a statement, calling it one of the study’s “sharpest findings”.
Chatbots often struggle with news accuracy, especially on breaking stories where there is limited information available online. AI models that power chatbots are often trained on wide swathes of data found on the open web, a notoriously untrustworthy source of facts and nuance.
Campbell Brown, chief executive of Forum AI and a former head of news partnerships at Meta Platforms, said she is particularly concerned about the study’s results given the looming US midterm election cycle.
Few people use chatbots for news today, but that number will undoubtedly increase over time as they continue to siphon queries that used to go to Google’s search engine.
Brown conducted the study in the hope of holding the model makers more accountable. The struggle with news accuracy may encourage them to prioritise these types of queries in the same way they put maths or coding-focused interactions first, she said.
“We’d welcome the opportunity to review the underlying data behind this report,” an Anthropic spokesperson said.
“Claude is trained to be politically even-handed in its responses, and to treat opposing viewpoints with equal depth, engagement, and quality of analysis, without bias towards any particular ideological position.”
None of the other three model makers commented for this story.
“Independent evaluation is important,” Brown, who co-founded Forum AI last year, said.
The start-up used its own AI model to grade the chatbot makers, building it with input from a range of industry experts who have spent decades studying foreign affairs and geopolitics.
“The model companies are essentially grading their own homework,” Brown continued. “And it’s really important that there be companies outside of the model companies that are doing this work and sharing the results.”
Major social media platforms such as Meta and Google’s YouTube have historically shied away from fact-checking, particularly for topics that are widely polarising and politically charged, claiming they don’t want to be the arbiters of truth for the rest of the internet.
Brown believes AI companies will be different.
“At Meta, you’re optimising for engagement. And if you’re optimising for engagement, it’s also hard to optimise for accuracy,” she said.
AI companies that sell their models to enterprise clients are in a different situation, Brown added. Those paying customers will expect accuracy as a baseline.
“I just think it’s an entirely different product at the end of the day,” she said.