OpenAI introduces GPT-5.1-Codex-Max, beating Gemini 3 on some benchmarks

Codex-Max reaches parity with Gemini 3, just a day after Gemini 3 launched. (Picture: Screenshot, OpenAI)
OpenAI’s new coding model outperforms Gemini 3, which was «state of the art» as of yesterday, on a few select benchmarks, and appears to be on par with it on SWE-Bench Verified.

«GPT-5.1-Codex-Max is faster, more intelligent, and more token-efficient at every stage of the development cycle, and a new step towards becoming a reliable coding partner,» OpenAI says in its launch post.

The AI lab says it has observed the model working independently on tasks for more than 24 hours, iterating on its implementation and delivering a «successful result.»

Codex-Max is also the first OpenAI model trained to operate in Windows environments. According to OpenAI, it achieves better performance than the previous GPT-5.1-Codex while using 30% fewer tokens, which makes it cheaper and more efficient to run.

Read more: The launch post, VentureBeat. Discussion on r/Singularity.