Teknotum

ChatGPT o3 and o4-mini are big steps toward AI agents

The models are inching ahead in benchmarks, but multimodality is where they truly shine. (Picture: OpenAI)
OpenAI’s latest model drop hints at a future where agents can do most of our work, and image processing is where it proves the point.

The new reasoning models manage an ever-so-slight lead in many benchmarks and therefore earn the right to be called state of the art. Of particular note is that they improve on o1 and o3-mini by almost 30% on the coding benchmark SWE-bench Verified, OpenAI claims in its launch post.

"These are the smartest models we’ve released to date, representing a step change in ChatGPT’s capabilities for everyone from curious users to advanced researchers," says OpenAI.

Uses all the tools in the box
What steals the show, though, is their ability to agentically use all the tools in the ChatGPT toolbox, such as image generation, image analysis and multiple web searches. That ability really shines through in image recognition.

The models can zoom, tilt and rotate any image you give them. Coupled with their ability to reason and search the web, this has opened something of a Pandora’s box.

At first it was thought that this capability would be handy for making final images out of sketches, or for analyzing whiteboards, writes The Verge.

Social media goes geo guessing
But on social media, users are already embracing the new tech and using it to play geolocation games, writes TechCrunch.

This is a game where you try to stump the model by giving it increasingly difficult images and making it guess the location. The model appears to be surprisingly good at spotting landmarks and cities, and can even identify a restaurant's location from its menu.

Analyzes and presents web search
The models are also flexible when using web search, and can "search the web multiple times with the help of search providers, look at results, and try new searches if they need more info," says OpenAI.

If you ask about weather patterns, for instance, OpenAI says:

"The model can search the web for public utility data, write Python code to build a forecast, generate a graph or image, and explain the key factors behind the prediction, chaining together multiple tool calls."

This kind of ability to not just do a simple web search, but to analyze, reason and process the data for you hints at how powerful future models can be.
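
The search-and-retry behaviour OpenAI describes can be pictured as a simple loop. The sketch below is only an illustration of that idea and says nothing about how the models actually work internally: `search_until_enough`, `fake_search` and the tiny `FAKE_INDEX` lookup table are all made-up names standing in for a real search provider, and the "enough" check is an assumed stopping rule.

```python
from typing import Callable, List


def search_until_enough(
    queries: List[str],
    search: Callable[[str], List[str]],
    enough: Callable[[List[str]], bool],
) -> List[str]:
    """Run queries in order, collecting results, and stop once `enough` is satisfied."""
    collected: List[str] = []
    for query in queries:
        # Each pass is one "tool call": search, then look at what came back.
        collected.extend(search(query))
        if enough(collected):
            break  # No need to try a further search.
    return collected


# Hypothetical stand-in for a search provider: a fixed lookup table.
FAKE_INDEX = {
    "weather": [],
    "weather utility data": ["2023 load report"],
    "weather utility data forecast": ["2024 load report", "2024 rainfall table"],
}


def fake_search(query: str) -> List[str]:
    """Return canned results for a query, or nothing if the query is unknown."""
    return FAKE_INDEX.get(query, [])


results = search_until_enough(
    list(FAKE_INDEX),                       # try broad queries first, then narrower ones
    fake_search,
    enough=lambda found: len(found) >= 2,   # assumed rule: stop at two or more sources
)
print(results)
```

In this toy run the first two queries do not return enough material, so the loop moves on to a third, more specific search before stopping. A real agent would then hand the collected results to further steps, such as the forecasting code and chart generation mentioned in the quote.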

Not free, yet
o3 and o4-mini have been available to Plus and Pro subscribers since yesterday. There is no word yet on whether they will be opened to free users.

As for the now famously confusing lineup of GPT models, Sam Altman has teased that GPT-5 will arrive this summer, thereby ending the naming confusion.

how about we fix our model naming by this summer and everyone gets a few more months to make fun of us (which we very much deserve) until then?

— Sam Altman (@sama) April 14, 2025

Read more: OpenAI’s launch post, Engadget, The Verge and TechCrunch.

Author: Tor Fosheim. Posted on 18 April 2025 (updated 17 May 2025). Tags: AI, chatgpt, openai
