AI Chatbot Study Reveals Troubling Inaccuracies, But Bright Spot for Apple and ChatGPT
A study by the Tow Center for Digital Journalism tested eight AI chatbots on their ability to identify and accurately cite the source article for a quote supplied to them. The chatbots, which included ChatGPT, Perplexity, DeepSeek, Microsoft’s Copilot, Grok-2, Grok-3, and Gemini, generally performed poorly, with an average accuracy rate below 40%. Perplexity was the most accurate at 63%, while xAI’s Grok-3 was the least accurate at only 6%. The study found that the chatbots often presented incorrect information with confidence, bypassed web scraping restrictions, and fabricated links. Despite these issues, ChatGPT performed relatively well compared with the other bots. The article concludes by advising against relying on AI chatbots for factual information, while suggesting they can still be useful for inspiration and ideas.