Technology

New Google AI posts top marks in ‘Humanity’s Last Exam’

Anthony Cuthbertson
19/11/2025 12:00:00

Google’s latest AI has set a new record score on a benchmark test designed to identify artificial superintelligence that matches or surpasses humans.

Gemini 3 Pro, which was unveiled on Monday, achieved a top score of 37.5 per cent on Humanity’s Last Exam – a test created by AI safety researchers to determine whether artificial intelligence can reason at the frontier of academic knowledge.

This score demonstrates PhD-level reasoning, according to Google, and puts it ahead of the best AI tools built by Anthropic, Meta and ChatGPT creator OpenAI. The closest rival is currently OpenAI’s GPT-5 Pro, which has a top score of 31.64 per cent on Humanity’s Last Exam.

Google said Gemini 3 represents a “massive jump in reasoning”, capable of responding with a never-before-seen level of depth and understanding.

“It’s state-of-the-art in reasoning, built to grasp depth and nuance – whether it’s perceiving the subtle clues in a creative idea, or peeling apart the overlapping layers of a difficult problem,” said Google CEO Sundar Pichai.

“Gemini 3 is once again advancing the state of the art, [pushing] the frontiers of intelligence, agents, and personalisation to make AI truly helpful for everyone.”

The new model is being released “at the scale of Google”, meaning it will be available to billions of people in AI Mode in Search.

It will also be integrated into the Gemini app, which has more than 650 million monthly active users.

Google also announced another model, Gemini 3 Deep Think, which it claims is even more powerful than Gemini 3 Pro.

Gemini 3 Deep Think scored 41 per cent on Humanity’s Last Exam, while also achieving new record scores on other benchmark tests.

“Gemini 3 Deep Think mode pushes the boundaries of intelligence even further, delivering a step-change in Gemini 3’s reasoning and multimodal understanding capabilities to help you solve even more complex problems,” said Demis Hassabis, chief executive of Google DeepMind.

“It achieves an unprecedented 45.1 per cent on [artificial general intelligence benchmark test] ARC-AGI-2, demonstrating its ability to solve novel challenges.”

Google said that Gemini 3 Deep Think will not be publicly released until more safety checks are carried out.

© Independent Digital News & Media Ltd

by Independent