OpenAI Unveils GPT-5.5: Major Leap in Math and Coding Prowess

OpenAI has launched GPT-5.5, a new large language model showcasing substantial improvements in mathematical problem-solving and coding compared to its predecessors.

The model arrives a week after competitor Anthropic released its latest LLM. OpenAI is offering GPT-5.5 in two tiers: a standard version and a more advanced, premium edition named GPT-5.5 Pro.

Both editions reportedly produce higher-quality output across a range of applications. The standard GPT-5.5 excels at computer tasks and knowledge work, while GPT-5.5 Pro offers significant gains in business, legal, education, and data science work.

GPT-5.5 is also better at interpreting ambiguous instructions, so users no longer need to spell out every step of an automated task.

In head-to-head benchmarks against Anthropic's Claude Opus 4.7, both GPT-5.5 variants outperformed the rival model across numerous tests. Notably, GPT-5.5 Pro scored 39.6% on the challenging FrontierMath Tier 4 benchmark, roughly 1.7 times Claude Opus 4.7's 22.9%.

OpenAI reports that a customized version of GPT-5.5 helped researchers find a new mathematical proof related to Ramsey numbers, a central topic in combinatorics with broad computer science applications.

On coding, the standard GPT-5.5 scored 82.7% on Terminal-Bench 2.0, which measures command-line tool usage, ahead of Claude Opus 4.7's 69.4%.
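To put the reported scores in perspective, the short Python sketch below computes the percentage-point gap and the ratio between the two models on FrontierMath Tier 4 and Terminal-Bench 2.0. The scores are the ones quoted in this article; the `compare` helper is a hypothetical name introduced only for illustration.

```python
def compare(score_a: float, score_b: float) -> tuple[float, float]:
    """Return (percentage-point gap, ratio) of score_a relative to score_b."""
    gap = round(score_a - score_b, 1)
    ratio = round(score_a / score_b, 2)
    return gap, ratio


# Scores as quoted in the article (percent).
benchmarks = {
    "FrontierMath Tier 4 (GPT-5.5 Pro vs Claude Opus 4.7)": (39.6, 22.9),
    "Terminal-Bench 2.0 (GPT-5.5 vs Claude Opus 4.7)": (82.7, 69.4),
}

for name, (openai_score, anthropic_score) in benchmarks.items():
    gap, ratio = compare(openai_score, anthropic_score)
    print(f"{name}: +{gap} points, {ratio}x")
```

By this arithmetic the FrontierMath lead is 16.7 points (about 1.73 times the rival score), while the Terminal-Bench lead is 13.3 points (about 1.19 times).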

Internally, OpenAI used GPT-5.5 to optimize the software that manages its infrastructure, which runs on Nvidia's GB200 and GB300 NVL72 systems. The optimization increased token generation speed by more than 20%.
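A speed increase and a time saving are not the same number. As a rough illustration, and assuming "token generation speed" means tokens per second on unchanged hardware (an assumption, since the article does not define it), the sketch below converts a throughput gain into the fraction of per-token time saved. The function name is hypothetical.

```python
def time_saved_fraction(speedup: float) -> float:
    """Fraction of per-token time saved when throughput rises by `speedup`.

    `speedup` is a fraction, so 0.20 means a 20% increase in tokens per second.
    """
    return 1 - 1 / (1 + speedup)


# A 20% throughput gain, the lower bound quoted in the article.
print(f"{time_saved_fraction(0.20):.1%} less time per token")
```

Under that assumption, a gain of just over 20% in tokens per second corresponds to a little more than one-sixth less time per generated token.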

The model also performed strongly on the GDPval benchmark, which assesses economically valuable tasks: the standard GPT-5.5 scored 84.9%.

GPT-5.5 is available through ChatGPT and Codex for users with Plus, Pro, Business, and Enterprise subscriptions. GPT-5.5 Pro is accessible via ChatGPT for Pro, Business, and Enterprise users. OpenAI plans to make the model available through its application programming interface (API) shortly.