Challenges of Relying on AI for News Reporting
Artificial intelligence may not be the most trustworthy way to get your news, according to a recent BBC report. The study evaluated how well OpenAI's ChatGPT, Google Gemini, Microsoft Copilot, and Perplexity handle news questions, finding that 51% of all AI answers were judged to have "significant issues of some form."
The analysis posed 100 news-related questions to each AI, directing the tools to use BBC sources where possible. Journalists with relevant subject expertise then assessed the answers.
Among the errors noted, Gemini inaccurately claimed that the UK's NHS (National Health Service) does not recommend vaping as a smoking cessation method (it does). Both ChatGPT and Copilot described politicians who had left office as though they were still serving. More alarmingly, Perplexity mischaracterized a BBC article about Iran and Israel, attributing opinions and positions to the writer and their sources that did not appear in the original text.
Looking specifically at AI answers that cited BBC content, the report found that 19% introduced factual errors, fabricating statements, figures, and dates. A further 13% of quoted material was either altered from the original or entirely absent from the article cited.
While the inaccuracies were not evenly distributed across the tools, none of them performed particularly well.
According to the BBC, "Microsoft's Copilot and Google's Gemini were found to have more substantial problems compared to OpenAI's ChatGPT and Perplexity." Even so, Perplexity and ChatGPT each had issues in more than 40% of their responses.
In a blog post, BBC News CEO Deborah Turness voiced strong concerns about the tested AI systems, writing that while artificial intelligence offers "endless opportunities," the companies building these tools are "playing with fire."
"We live in troubled times," Turness wrote, "and how long will it be before an AI-distorted headline causes significant real-world harm?"
This is not the BBC's first critique of AI-generated news summaries; its past reporting arguably helped prompt Apple to pull its AI news summaries just last month.
Journalists have also clashed with Perplexity over copyright. After accusations that the tool was bypassing paywalls, The New York Times sent it a cease-and-desist letter, and News Corp, the parent company of the New York Post and The Wall Street Journal, went a step further and sued Perplexity.
To carry out the evaluation, the BBC briefly lifted the blocks that normally keep AI tools away from its website content; those restrictions have since been reinstated. Despite these obstacles and Turness's pointed critiques, however, the BBC is not categorically rejecting AI.
"We want AI companies to hear our concerns and work constructively with us," the BBC's study says. "We want to understand how they will rectify the issues we have identified and discuss the right long-term approach to ensuring accuracy and trustworthiness in AI tools. We are willing to work in partnership with them to do this."