
It comes after OpenAI felt it had been overtaken in the AI race by Anthropic and Google’s Gemini, and CEO Altman declared a “Code Red” to get ahead of its rivals again. That was a little over a week ago, and now the results are in.
In both its High and Pro modes, GPT-5.2 scores better on ARC-AGI-2 than both Gemini 3 Pro and Gemini 3 Deep Think, at a lower cost per task.
It also breaks 80% in the coding test SWE-bench Verified compared to Gemini 3’s 74.20%, according to OpenAI’s own numbers.
Better at everyday use
The new model even excels at making spreadsheets and presentations, OpenAI claims, scoring almost 10% higher than its previous model, GPT-5.1. It is also markedly better at handling long contexts.
It is significantly better than 5.1 Thinking at interpreting scientific figures, and over 20% better at understanding screenshots, making it a more capable partner in both science labs and everyday life.
Additionally, it beats Gemini 3 on price, clocking in at $1.75 per million input tokens and $14 per million output tokens.
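To put those prices in perspective, here is a minimal sketch of the arithmetic in Python. It assumes both figures are per million tokens and that cost scales linearly with token count; the helper name `request_cost` and the token counts are made up for illustration and do not come from OpenAI.

```python
# Prices quoted in the article, in US dollars per one million tokens.
INPUT_PRICE_PER_MILLION = 1.75
OUTPUT_PRICE_PER_MILLION = 14.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request (hypothetical helper).

    Assumes simple linear pricing with no discounts or extra fees.
    """
    total = (input_tokens * INPUT_PRICE_PER_MILLION
             + output_tokens * OUTPUT_PRICE_PER_MILLION)
    return total / 1_000_000


# Illustrative request: 20,000 tokens in, 2,000 tokens out.
cost = request_cost(20_000, 2_000)
print(f"${cost:.3f}")  # prints "$0.063"
```

Under these assumptions, output tokens cost eight times as much as input tokens, so the length of the answers is likely to matter more to the bill than the length of the prompts.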
OpenAI is staggering the release of 5.2 to its paid tiers, so it should roll out over the next couple of days.
This is not the end of OpenAI’s “Code Red”, though, as another model is expected in January.
Altman, however, tweeted that even more is coming before then:
Also, we have a few little Christmas presents for you next week!
— Sam Altman (@sama) December 11, 2025
Read more: OpenAI’s launch page, writeups at Wired, Ars Technica, CNBC.