Google’s AI Overview is wrong tens of millions of times per hour, NYT finds

Google gets it right more often than not, but 1 in 10 queries results in an error. (Picture: Adobe)
Sure, the accuracy measured by AI lab Oumi clocks in at a decent 90%, but when you scale that up to Google’s traffic of more than five trillion searches per year, you get mind-boggling numbers: hundreds of thousands of «inaccuracies» per minute, the New York Times writes.
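
To see how the headline figures fall out of those two numbers, here is a back-of-the-envelope sketch. It assumes exactly five trillion searches per year and, as a simplification that is not in the study, that every search shows an AI Overview (the `OVERVIEW_SHARE` constant is hypothetical and can be lowered).

```python
# Figures from the article
SEARCHES_PER_YEAR = 5e12  # "more than five trillion searches per year"
ACCURACY = 0.90           # accuracy measured by Oumi

# Assumption for illustration: every search triggers an AI Overview
OVERVIEW_SHARE = 1.0

HOURS_PER_YEAR = 365 * 24
MINUTES_PER_YEAR = HOURS_PER_YEAR * 60


def wrong_answers(periods_per_year: int) -> float:
    """Expected number of wrong overviews per period (hour, minute, ...)."""
    wrong_per_year = SEARCHES_PER_YEAR * OVERVIEW_SHARE * (1 - ACCURACY)
    return wrong_per_year / periods_per_year


errors_per_minute = wrong_answers(MINUTES_PER_YEAR)
errors_per_hour = wrong_answers(HOURS_PER_YEAR)

print(f"{errors_per_minute:,.0f} wrong answers per minute")
print(f"{errors_per_hour:,.0f} wrong answers per hour")
```

Under these assumptions the result is roughly 950,000 wrong answers per minute and about 57 million per hour, which matches the «tens of millions per hour» in the headline.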

The test, conducted on the Gemini 3 generation of AI Overviews, used the SimpleQA dataset, which is intended to probe chatbot accuracy. It was created by OpenAI in 2024 and contains more than 4,000 questions with real, verifiable answers, Ars Technica reports.

Google’s AI answer machine pops up on every query these days, but it is difficult to tell precisely which model it uses for each task. For simple web searches, it might well opt for one of the faster Flash models rather than the more advanced Pro. It might also give different answers to the same question just milliseconds apart.

Google also doesn’t like the measurement being used, telling the NYT that «This study has serious holes. It doesn’t reflect what people are actually searching on Google.»

Read more: New York Times, Ars Technica.

Anthropic reaches $30B revenue, gets compute from Google and Broadcom

Anthropic continues to diversify its compute needs. (Picture: Anthropic)
Anthropic now says it has a run-rate revenue of $30 billion, up from $14 billion in February during their last fundraising.

They are also announcing that they are bringing in new compute capacity, based on next-generation Google TPUs that will start coming online in 2027.

The companies offer no detail on the cost of the «partnership» or how much compute they are actually buying, but Broadcom is hinting it’s around 3.5 GW, according to CNBC.

Anthropic also says the number of customers spending more than $1 million per year has doubled to 1,000 in just two months.

Claude now runs on Amazon’s Trainium chips, Google TPUs and Nvidia GPUs. The latter are more used, and Amazon remains their primary cloud provider, Anthropic says.

Read more: Anthropic’s announcement, CNBC adds numbers.

Google launches open model Gemma 4, claims best intelligence-per-parameter

With their latest open models, Google is taking a stab at building agents. (Picture: Google)
After Gemma 3 got more than 400 million downloads and 100K variants, they are swinging again with the multimodal Gemma 4 family, released under an Apache 2.0 license.

It comes in 2B, 4B, 26B and 31B parameter variants and works on anything from high-powered hardware (high parameter counts) to edge computing and mobile phones (lower parameter counts).

The 31B edition ranks third on the Arena AI leaderboard for open source models, and the 26B one is sixth, «outperforming models 20x its size,» as Google puts it.

The new models have also been strengthened with agent workflows, and let you build agents to «interact with different tools and APIs and execute workflows,» Google says.

The models are available today for download on Hugging Face and online at Google AI Studio.

Read more: Google’s announcement, launch post, on Nvidia GPUs.

Gemini introduces chat and memory imports from competing chatbots

It now seems easier to switch to Gemini, but finding the files to do it can sometimes be difficult. (Picture: Google)
Switching from a chatbot with lots of history to a fresh one can be a pain, which is why Google is now launching new switching tools that let you import from other chatbots, in hopes of snagging some extra users from competitors.

The first step is to simply prompt the bot you are switching from to output your preferences, or its memories, and it will provide them in a prompt reply. This can then be pasted into Gemini.

The second feature will import your entire chat history — up to 5GB of it. Doing this is a little more complicated and involves a trip to the settings panel, but it should result in getting a zip file from your provider, which can be uploaded to Google.

From there on, Gemini promises to pick up right where you left off with the other chatbot, and you won’t have to train a whole new AI. Anthropic already does this.

Read more: Google’s presentation, step-by-step tweet, writeups on Engadget and The Verge.

«Vibe design» by Gemini — Google updates Stitch for the AI age

Design help from Google? If it floats your boat. (Picture: Google)
Promising to let «anyone» create layouts with natural language prompts and turn them into «high-fidelity UI designs,» Stitch is supposed to let you «vibe design» your projects.

It is intended to let you «explore ideas quickly» with a «high quality outcome.»

The app can take input from text, images, or code, and provides you with an entire design language that you can pick and choose from, with an «infinite» canvas storing your ideas.

It should be equally good at designing for the web and apps, but the results come across as somewhat boilerplate and generic.

I tried to get it to brainstorm a little about improving the design of this webpage, and the results were terrible, but it might be worth it for other projects.

The improved Stitch is available at stitch.withgoogle.com and can be accessed for free anywhere Gemini is available.

Read more: Google’s introduction, launch tweet.

Google’s «Personal Intelligence» now available for free users in the U.S.

Shopping for a bag to go with your shoes? Google already knows. (Picture: Google)
It seems the tie-in between Google’s Calendar, Gmail, Photos, YouTube and Search and Gemini has been popular — and they are now expanding the service to free users.

— People are appreciating the highly tailored help they’re getting in AI Mode in Search and the Gemini app, Google says.

Personal Intelligence can be useful for anything that involves your history with Google, like searching for another pair of sneakers you already bought, shopping for a bag to go with said shoes, or planning a travel itinerary based on past preferences.

You need to be signed into a personal Google account for it to work, and it is not available for Workspace business, Enterprise, or Education users, TechCrunch notes.

The feature is also explicitly opt-in, and you have to choose to turn it on. There are also granular controls for disabling each app or service, so you can opt out of having Gemini scour your previous web searches and use them in replies, for instance.

Read more: Google’s announcement, TechCrunch and 9to5Google.

«Ask Maps» brings Gemini 3 intelligence, personalization to Google Maps

You can now get pretty comprehensive natural language answers from Google Maps. (Picture: Google)
With the latest Maps upgrade, you can ask questions in natural language and have Gemini answer with map-specific information.

The feature is supposed to work great for questions of where to find the nearest restroom, or a cozy vegan restaurant nearby — and it even lets you book a table right from the app.

To achieve this, Gemini will scan information from the Maps database consisting of some 300 million places and reviews from over 500 million contributors to find you just the right spot.

Ask Maps also remembers your previous saved spots or queries, so it will know that you are vegetarian, say, or if you have any special needs or preferences.

Of course, once you find a spot, Maps will help you navigate to get there — and in the biggest update in a decade, you now get a 3D driving experience.

Ask Maps is only available on mobile in the USA and India, with desktop support «coming soon.»

Read more: Google’s announcement, The Verge, Engadget.

Gemini on Chrome expands to more countries and languages

Gemini is offering AI integration in the Chrome browser for even more markets. (Picture: Google/generated)
With some features previously available only to Pro and Ultra subscribers in the USA, the AI features for Chrome are now launching on desktop and mobile in India (the second largest market for American AI), New Zealand and Canada, with promises of more to come.

Gemini in Chrome adds a new side panel, letting you chat with Gemini without opening up a new tab, and can do things like summarize or interact with web pages. It can connect to Gmail, Calendar, YouTube, Shopping and flights information.

It also comes with Nano Banana features, so you can try out your own furniture in an apartment listing picture, for example.

In addition to the three new countries, which are mostly English-speaking, Google is announcing support for another 50 languages.

This of course includes Hindi, but there is also support for French, Spanish, Chinese and lots of European languages.

Read more: Google’s announcement. Writeups on TechCrunch and Engadget.

Google announces slew of Gemini improvements to Workspace

Workspace got smarter, and can now draw on files, emails, chats and the web. (Picture: Google)
Sheets, Slides and Docs are getting some extra help from Gemini in a huge update to the service.

— Today, we’re making Gemini in Docs, Sheets, Slides and Drive more personal, capable and collaborative to help you get things done, faster, Google says.

All these apps can now draw on information from your Drive, Gmail, Chat and web search to draft things like emails and docs, or pull numbers for spreadsheets based on, say, an email conversation, meeting notes or separate sources in Drive. All it takes is a single prompt.

Google is especially proud of their agentic performance on Sheets, getting very close to the human expert benchmark on the SpreadsheetBench dataset.

The features are rolling out to all Ultra and Pro subscribers globally today, but are only available in English. Google is looking to bring on «more languages soon.»

Read more: Google’s announcement, launch thread. Writeups on 9to5Google and TechCrunch.

NotebookLM introduces «cinematic video overviews» feature

The AI learning and note-taking app can now illustrate your research through «rich, detailed visuals» in full bore video.

Previously, it could only make a slide show of your notes, in addition to the killer feature of creating podcasts from them.

The new videos are possible through a combination of Gemini 3, Nano Banana Pro and Veo 3 — with Gemini «acting as a creative director.»

The Gemini model makes «structural and stylistic» decisions on the fly, illustrating content word by word and in context to create something like the video above.

The feature is only available through the $300/month Google AI Ultra subscription, but it is sure to trickle down at a later stage.

Read more: Google’s announcement, Android Police and The Verge.

Google debuts Gemini 3.1-Flash Lite, for developers needing speed and scale

Not for everyone; Flash Lite is built for high volume cost efficiency. (Picture: Google)
Positioning the model as a purely developer-focused one, Google is touting the price, latency and the sheer amount of work it can do.

Costing $0.25 for 1M input tokens and $1.50 for 1M output tokens, it is one of the cheapest models out there.
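
To get a feel for what those rates mean in practice, here is a small cost sketch. The two prices come from the article; the function name `estimate_cost_usd` and the example workload (10,000 moderation requests at 800 input and 50 output tokens each) are made up for illustration.

```python
# Prices quoted in the article (USD per one million tokens)
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the bill for a given number of input and output tokens."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Hypothetical workload: 10,000 requests, 800 input and 50 output tokens each
requests = 10_000
cost = estimate_cost_usd(requests * 800, requests * 50)
print(f"${cost:.2f}")  # prints $2.75
```

Input-heavy jobs with short outputs, like classification or moderation, get the most out of this pricing, since output tokens cost six times as much as input tokens.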

Compared to Gemini 2.5 Flash, it is 2.5x faster to the first answer, and 45% quicker in output speed, while maintaining quality.

This benefits high-frequency workloads, such as mass translations and content moderation where price is a priority, Google says.

Users of AI Studio and Vertex AI can also adjust its thinking levels, making it possible to balance speed and complexity.

Read more: Google’s announcement, Android Central, Tom’s Guide.

Pentagon and Trump unload on Anthropic, agree with OpenAI on same safeguards

The Pentagon wants AI to be open for spying, but hardly any frontier lab will agree to this. (Picture: generated)
Calling Anthropic «leftwing nut jobs» and an «out-of-control, Radical Left Woke AI company,» both President Trump and Hegseth at the Pentagon have taken steps to bar the company from government use.

The spat started when Anthropic refused new terms in their Pentagon contract, saying they would not allow their AI to be used for autonomous killing and mass surveillance.

In a stunning reversal, these safeguards are written into an agreement offered just hours later to OpenAI (see below).


Google launches Nano Banana 2 with Pro-level reasoning at Flash speeds

Prompt: «A brightly colored image of Museum Clos Lucé in the style of synthetic cubism.» (Picture: Google)
Rolling out across the entire Gemini landscape today, the new image generator offers «advanced world knowledge» powered by Gemini 3.

That should make it able to use web searches and other images in order to understand and reason its way into better pictures.

As with Nano Banana Pro, you can create diagrams from notes, make infographics from text, and «generate data visualizations,» only a whole lot faster and cheaper.

It should also be better at subject consistency and more precise in following instructions, and it can create high-fidelity images at up to 4K resolution.

It instantly leapt to the top of LMArena’s (now just arena.ai) text-to-image leaderboard.

Read more: Google’s presentation. Writeups on 9to5Google, Ars Technica, and TechCrunch.

Google brings Gemini automation to third-party apps on Android

Just three apps will be available on Gemini automation, but Google says they are just getting started. (Picture: Google)
Gemini on Android will soon be able to order food and rides from Uber, DoorDash, and Grubhub for you, Google announced yesterday at Samsung’s Galaxy Unpacked event.

That means you can «order a ride home» or «order my last meal,» and let Gemini control your Uber app in the background just like any human would. But when it comes to tapping the «buy» button, it will need your attention.

The beta is only available in the USA and Korea, arriving in March on the latest phones from Samsung and Google, specifically the S26 and Pixel 10.

It works within a «virtual window» on the phone, and does not have access to anything else, 9to5Google writes.

— This beta feature will be initially available for select apps in the food, grocery and rideshare categories, Google says.

Gemini already has automation features for standard Google apps, like Gmail and Calendar.

Read more: Google’s blog, writeups on 9to5Google, The Verge and TechCrunch.

Google’s next data center in Minnesota will have the world’s largest battery

Google’s energy in Minnesota won’t lead to higher electricity costs for consumers, they say. (Picture: Google)
The data center in Pine Island will have 1.9 gigawatts of capacity, sourced from new wind and solar power from Xcel Energy.

This will then be attached to a 300 megawatt iron-air battery from Form Energy, ensuring continuous service to operations.

That will be the world’s largest commercially deployed battery, and can provide power to the data center for a whopping 100 hours, TechCrunch notes.
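
For a sense of scale, the stored energy follows directly from those two figures. The sketch below assumes the battery can discharge at its full 300 MW rating for the entire 100 hours, which the article does not spell out.

```python
# Figures from the article
BATTERY_POWER_MW = 300   # rated power of the iron-air battery
DISCHARGE_HOURS = 100    # how long it can keep supplying power

# Energy = power x time; assumes full rated output for the whole period
energy_mwh = BATTERY_POWER_MW * DISCHARGE_HOURS
energy_gwh = energy_mwh / 1_000

print(f"{energy_mwh:,} MWh = {energy_gwh:.0f} GWh")  # prints 30,000 MWh = 30 GWh
```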

As part of their buildout, Google is announcing that they will pay for their electricity in full, and will also invest $50 million in Xcel Energy’s green energy program to place batteries across their grid.

— Google’s partnership with Xcel Energy reimagines how data centers can be served, Google writes.

Read more: Google’s announcement, Xcel’s presser, CNBC and TechCrunch.