Apple's AI study exposes flaws in LLM-based models

A new study from Apple's AI researchers reveals that engines built on large language models (LLMs), such as those from Meta and OpenAI, lack basic reasoning skills. The researchers propose a new benchmark, GSM-Symbolic, to measure the reasoning capabilities of various LLMs. Initial testing showed significant inconsistencies in the models' answers when the wording of a question was changed only slightly. The study found that adding a single sentence of irrelevant information to a question can reduce answer accuracy by up to 65%. According to the researchers, there is no way to build reliable AI agents on a foundation where changing a word or two, or adding a bit of irrelevant information, produces a different answer. They conclude that LLM-based models show no evidence of formal reasoning and that their behavior is better explained as sophisticated pattern matching.
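
To make the methodology concrete, here is a minimal sketch of the kind of perturbation the study describes: generate many surface variants of a single grade-school math problem, append an irrelevant clause (the paper's "GSM-NoOp" style), and compare accuracy on both versions. The question template, names, numbers, and no-op clause below are illustrative rather than taken from the paper, and ask_model is a placeholder for whatever LLM client you use; this is not code from the study.

```python
import random

# Sketch of a GSM-Symbolic-style robustness check.
# The template, names/numbers, and the irrelevant "no-op" clause
# are illustrative, NOT from Apple's paper; ask_model() is a
# placeholder for any LLM client.

TEMPLATE = (
    "{name} picks {n1} apples on Monday and {n2} apples on Tuesday. "
    "How many apples does {name} have in total?"
)
NOOP = "Five of the apples are slightly smaller than the rest."  # irrelevant to the answer

def make_variants(seed: int) -> tuple[str, str, int]:
    """Return (base question, perturbed question, ground-truth answer)."""
    rng = random.Random(seed)
    name = rng.choice(["Sophie", "Liam", "Mia"])     # vary the surface wording
    n1, n2 = rng.randint(5, 40), rng.randint(5, 40)  # vary the numbers
    base = TEMPLATE.format(name=name, n1=n1, n2=n2)
    return base, base + " " + NOOP, n1 + n2

def ask_model(question: str) -> int:
    """Placeholder: send the question to an LLM and parse an integer answer."""
    raise NotImplementedError("plug in your LLM client here")

def compare_accuracy(num_trials: int = 100) -> None:
    """Score the model on base questions vs. questions with an irrelevant clause."""
    base_ok = noop_ok = 0
    for seed in range(num_trials):
        base_q, noop_q, truth = make_variants(seed)
        base_ok += int(ask_model(base_q) == truth)
        noop_ok += int(ask_model(noop_q) == truth)
    print(f"base: {base_ok / num_trials:.0%}  "
          f"with irrelevant clause: {noop_ok / num_trials:.0%}")
```

On such a probe, a model that genuinely reasons should be unaffected by the extra clause, since it changes nothing about the arithmetic; the study's finding is that accuracy drops sharply instead.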