Apple's AI study exposes flaws in LLM-based models

A new study from Apple's AI researchers finds that models built on large language models (LLMs), including those from Meta and OpenAI, lack genuine reasoning skills. The team proposes a new benchmark, GSM-Symbolic, to measure the mathematical reasoning capabilities of various LLMs. Initial testing showed significant inconsistencies in the models' answers when the wording of a question was changed only slightly. The study also found that adding a single sentence of irrelevant information to a question can reduce answer accuracy by as much as 65%.

According to the researchers, there is no way to build reliable AI agents on a foundation where changing a word or two, or adding a bit of irrelevant information, produces a different answer. They conclude that LLM-based models exhibit no evidence of formal reasoning and that their behavior is better explained as sophisticated pattern matching.
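
To illustrate the kind of perturbation the study relies on, here is a minimal sketch in Python of how a templated grade-school math question might be resampled into variants and injected with an irrelevant clause. The template, names, and numbers below are hypothetical placeholders in the spirit of GSM-Symbolic, not code or data from the paper itself.

```python
import random

# Hypothetical template: names and numbers are placeholders that can be
# resampled to produce many logically identical variants of the same
# grade-school math problem.
STORY = "{name} picks {n1} apples on Monday and {n2} apples on Tuesday."
QUESTION = "How many apples does {name} have in total?"

# An irrelevant clause of the kind the study injected; it changes
# nothing about the correct answer, yet can sharply reduce accuracy.
DISTRACTOR = "Three of the apples are slightly smaller than the rest."

def make_variant(with_distractor: bool = False) -> tuple[str, int]:
    """Sample one question variant and return (prompt, correct_answer)."""
    name = random.choice(["Sophie", "Liam", "Mia"])
    n1, n2 = random.randint(2, 20), random.randint(2, 20)
    parts = [STORY.format(name=name, n1=n1, n2=n2)]
    if with_distractor:
        parts.append(DISTRACTOR)
    parts.append(QUESTION.format(name=name))
    return " ".join(parts), n1 + n2

if __name__ == "__main__":
    for flag in (False, True):
        prompt, answer = make_variant(with_distractor=flag)
        print(f"{prompt}  (expected: {answer})")
```

A model that reasoned formally would answer every variant the same way, since the distractor never affects the arithmetic; the study's point is that measured accuracy drops anyway.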