OpenAI Launches Next-Gen AI Model: o3
OpenAI is set to introduce its latest AI model, o3, at the close of January 2025, skipping straight from o1 to o3. The model is intended to improve AI reasoning significantly, particularly the performance of tools such as ChatGPT on programming and mathematical problem-solving tasks.
In a promotional video shared during the festive season’s “12 Days of OpenAI,” CEO Sam Altman characterized o3 as “remarkably intelligent.” The new model is currently undergoing a series of safety evaluations and will initially be available primarily to subscribers of ChatGPT Plus.
According to OpenAI, o3 outperforms its predecessor, o1, by more than 20 percentage points on coding tasks, as measured by the SWE-bench Verified benchmark. It also demonstrates notable proficiency in math and science questions. Like o1, o3 is designed to reason through a problem step by step and check its responses for accuracy before answering. Additionally, a more compact and efficient o3-mini model will be launched alongside the principal update.

The true capabilities of o3 will only become apparent once users can engage with it. However, preliminary assessments have shown promising results on the Abstraction and Reasoning Corpus (ARC). This benchmark is intended as a gauge of AI’s progress toward Artificial General Intelligence (AGI), a debated threshold at which AI cognitive abilities match or surpass those of humans.
This evaluation challenges AI models to devise original solutions instead of solely relying on stored information, presenting them with a series of visual challenges. These tasks involve recognizing patterns in colored grids, which are intended to be straightforward for humans yet tough for AI.
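To make the grid format concrete, here is a minimal sketch in Python of what an ARC-style task looks like and how a candidate rule could be checked against it. The toy task, the `mirror` rule, and the helper names are hypothetical illustrations rather than material from the actual benchmark; the sketch assumes grids are stored as lists of rows of color indices from 0 to 9, grouped into "train" and "test" input/output pairs.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]  # each cell holds a color index from 0 to 9

# Hypothetical toy task in the style of ARC: every output is the
# input grid mirrored left to right.
task: Dict[str, List[Dict[str, Grid]]] = {
    "train": [
        {"input": [[1, 0, 0], [2, 0, 0]], "output": [[0, 0, 1], [0, 0, 2]]},
        {"input": [[0, 3], [4, 0]], "output": [[3, 0], [0, 4]]},
    ],
    "test": [
        {"input": [[5, 0, 6]], "output": [[6, 0, 5]]},
    ],
}


def mirror(grid: Grid) -> Grid:
    """Candidate rule: reverse each row of the grid."""
    return [list(reversed(row)) for row in grid]


def fits_examples(rule: Callable[[Grid], Grid], pairs: List[Dict[str, Grid]]) -> bool:
    """Return True if the rule reproduces every demonstration pair exactly."""
    return all(rule(pair["input"]) == pair["output"] for pair in pairs)


def score(rule: Callable[[Grid], Grid], task: Dict[str, List[Dict[str, Grid]]]) -> float:
    """Fraction of test grids the rule solves exactly (no partial credit)."""
    tests = task["test"]
    solved = sum(rule(pair["input"]) == pair["output"] for pair in tests)
    return solved / len(tests)


if __name__ == "__main__":
    print("rule fits the demonstrations:", fits_examples(mirror, task["train"]))
    print("test score:", score(mirror, task))
```

The difficulty for a model lies in inferring a rule like `mirror` from only a handful of demonstrations; checking a proposed rule, as done here, is the easy part.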
In tests run within the compute limits set by the ARC challenge, o3 achieved a score of 75.7%. This contrasts sharply with GPT-4o, the current default model for free ChatGPT users, which managed a score of only 5%. AGI is still beyond reach, given that o3 falls short of human performance and was unable to complete every task, but this leap forward is noteworthy.

François Chollet, the creator of the ARC test, noted, “The introduction of OpenAI’s o3 model signifies a remarkable advancement in AI’s ability to tackle novel challenges. This is a genuine breakthrough rather than just a minor enhancement, representing a substantial evolution in AI capabilities compared to earlier LLM limitations.”
Notably, OpenAI has chosen not to address concerns regarding the environmental impact of AI technologies, the ethical implications of training AI on publicly available data that may be under copyright, or the potential for these models to generate inaccurate responses. While o3’s reasoning process is designed to reduce errors, it won’t eliminate them completely. What the organization did announce is an extension of its safety testing initiative, aimed at ensuring the responsible use of these models.
The debate over whether AI models can genuinely “think” or “reason” will continue as the field progresses. Meanwhile, Google has introduced its own upgraded AI model, Gemini 2.0, with enhanced reasoning capabilities.