Anthropic launches Claude Opus 4.5, with top scores in most benchmarks

Anthropic's new model is state of the art in most benchmarks.
Opus 4.5 beats every human candidate on Anthropic’s take-home exam for prospective engineers. (Picture: Anthropic)
Anthropic bills Opus 4.5 as the "best model in the world for coding, agents, and computer use," and the model is indeed state of the art in software engineering.

It scores 80.9% on SWE-bench Verified, which has lately become the preferred benchmark for coding. On the same benchmark, Gemini 3 scores 76.2% and GPT-5.1-Codex-Max scores 77.9%.
