Microsoft’s Agent Army Dominates Cybersecurity Benchmark with Record-Breaking MDASH System
Microsoft has unveiled MDASH, a multi-agent AI system that posted a record-setting 88.45% success rate on the CyberGym cybersecurity benchmark. Rather than relying on a single model, MDASH coordinates more than 100 specialized agents running across multiple LLMs to identify and analyze software vulnerabilities. With this collaborative architecture, it outscored high-profile competitors, including Anthropic’s Mythos and top-tier models from OpenAI, a result that points to a shift toward multi-model systems in the race for automated digital security.