GPT-5.5 launched: OpenAI says ChatGPT can now debug code and operate software
Soon after launching its Images 2.0 model, OpenAI has released yet another AI model. The San Francisco-based AI startup announced a major update to ChatGPT as the company debuted its GPT-5.5 model on Thursday, calling it its “smartest and most intuitive” model to date.
What’s new with GPT-5.5?
OpenAI says GPT-5.5 is more efficient in how it works through problems and is capable of reaching higher-quality outputs with fewer tokens and retries.
OpenAI co-founder and President Greg Brockman, in a post on X (formerly Twitter), wrote, “GPT-5.5 is a new class of intelligence. This intelligence makes it intuitive to use; it completes challenging tasks with little micromanagement. Also very token efficient, and runs with low latency and at scale. A real step toward a new way of getting computer work done.”
Agentic coding and software engineering
OpenAI says GPT-5.5 is its strongest agentic coding model yet, capable of handling end-to-end engineering tasks like implementation, refactoring, and debugging.
The company shared several benchmark results to back up its claims about the new model. It noted:
- On Terminal-Bench 2.0, which tests complex command-line workflows and tool coordination, the model achieved a state-of-the-art accuracy of 82.7%.
- On SWE-Bench Pro, which evaluates real-world GitHub issue resolution, it reached 58.6%, solving more tasks in a single pass than its predecessors.
OpenAI also says early testers found that the model shows stronger conceptual clarity: it understands the broader shape of a system and can work through ambiguous failures.
A co-scientist
Beyond coding, OpenAI claims the new model fundamentally changes how knowledge work and scientific research are done. The company says that because the model is better at understanding intent, it moves more naturally through the full loop of finding information, using tools, checking the output, and turning raw material into something useful.
- The model scored 84.9% on GDPval, which tests knowledge work across 44 occupations, and 78.7% on OSWorld-Verified for operating real computer environments.
- In scientific applications, the model achieved 80.5% on BixBench, a benchmark designed for real-world bioinformatics and data analysis.
OpenAI also highlighted that an internal version of GPT-5.5 even helped discover a new mathematical proof regarding Ramsey numbers, a complex area of combinatorics that studies how order inevitably emerges in large enough systems.
Cybersecurity protections
Because of the model’s improved capabilities, OpenAI says it has designed tighter controls around higher-risk activity and sensitive cyber requests, and has added protections against repeated misuse.
“With GPT-5.5, we are ensuring developers can secure their code with ease, while putting stronger controls around the cyber workflows most likely to cause harm by malicious actors,” OpenAI said.
Notably, OpenAI’s chief rival Anthropic had recently refused to unveil its Mythos AI model owing to the advanced cybersecurity risks it posed.
OpenAI has also launched a “Trusted Access for Cyber” program which allows verified organisations defending critical infrastructure to access cyber-permissive models with fewer restrictions.
“This gives a wide range of verified defenders more capable tools for legitimate security work with less unnecessary friction to ensure we democratise access to important defensive capabilities,” the company wrote in its blog post.
| Benchmark (Category) | GPT-5.5 | GPT-5.4 | Claude Opus 4.7 | Gemini 3.1 Pro |
|---|---|---|---|---|
| Terminal-Bench 2.0 (Agentic Coding) | 82.7% | 75.1% | 69.4% | 68.5% |
| SWE-Bench Pro (Real-world Coding) | 58.6% | 57.7% | 64.3% | 54.2% |
| Expert-SWE (Internal Coding Eval) | 73.1% | 68.5% | – | – |
| GDPval (Professional Knowledge Work) | 84.9% | 83.0% | 80.3% | 67.3% |
| FinanceAgent v1.1 (Professional) | 60.0% | 56.0% | 64.4% | 59.7% |
| OSWorld-Verified (Computer Use) | 78.7% | 75.0% | 78.0% | – |
| BrowseComp (Tool Use) | 84.4% | 82.7% | 79.3% | 85.9% |
| GeneBench (Academic/Biology) | 25.0% | 19.0% | – | – |
| BixBench (Bioinformatics) | 80.5% | 74.0% | – | – |
| FrontierMath Tier 1–3 (Academic Math) | 51.7% | 47.6% | 43.8% | 36.9% |
| GPQA Diamond (Academic) | 93.6% | 92.8% | 94.2% | 94.3% |
| CyberGym (Cybersecurity) | 81.8% | 79.0% | 73.1% | – |
| ARC-AGI-1 (Abstract Reasoning) | 95.0% | 93.7% | 93.5% | 98.0% |
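The table is easier to read as two questions: how much did GPT-5.5 gain over GPT-5.4, and which model leads each benchmark? The short Python sketch below works both out from the scores in the table. The function and variable names are illustrative only, and `None` stands in for cells where no score was reported.

```python
# Scores (in percent) copied from the benchmark table above.
# None marks a cell where no score was reported ("–").
MODELS = ("GPT-5.5", "GPT-5.4", "Claude Opus 4.7", "Gemini 3.1 Pro")

SCORES = {
    "Terminal-Bench 2.0": (82.7, 75.1, 69.4, 68.5),
    "SWE-Bench Pro": (58.6, 57.7, 64.3, 54.2),
    "Expert-SWE": (73.1, 68.5, None, None),
    "GDPval": (84.9, 83.0, 80.3, 67.3),
    "FinanceAgent v1.1": (60.0, 56.0, 64.4, 59.7),
    "OSWorld-Verified": (78.7, 75.0, 78.0, None),
    "BrowseComp": (84.4, 82.7, 79.3, 85.9),
    "GeneBench": (25.0, 19.0, None, None),
    "BixBench": (80.5, 74.0, None, None),
    "FrontierMath Tier 1-3": (51.7, 47.6, 43.8, 36.9),
    "GPQA Diamond": (93.6, 92.8, 94.2, 94.3),
    "CyberGym": (81.8, 79.0, 73.1, None),
    "ARC-AGI-1": (95.0, 93.7, 93.5, 98.0),
}


def gain_over_predecessor(benchmark):
    """Percentage-point gain of GPT-5.5 over GPT-5.4 on one benchmark."""
    new, old = SCORES[benchmark][0], SCORES[benchmark][1]
    return round(new - old, 1)


def leader(benchmark):
    """Name of the highest-scoring model among those with a reported score."""
    reported = [
        (score, model)
        for score, model in zip(SCORES[benchmark], MODELS)
        if score is not None
    ]
    return max(reported)[1]


def benchmarks_led_by(model):
    """List the benchmarks on which the given model has the top reported score."""
    return [name for name in SCORES if leader(name) == model]


if __name__ == "__main__":
    for name in SCORES:
        print(f"{name}: +{gain_over_predecessor(name)} pts, leader: {leader(name)}")
    led = benchmarks_led_by("GPT-5.5")
    print(f"GPT-5.5 leads {len(led)} of {len(SCORES)} benchmarks")
```

On these figures, GPT-5.5 improves on GPT-5.4 in every row, but it holds the top reported score on 8 of the 13 benchmarks. Claude Opus 4.7 leads on SWE-Bench Pro and FinanceAgent v1.1, and Gemini 3.1 Pro leads on BrowseComp, GPQA Diamond, and ARC-AGI-1. Rows with missing scores compare only the models that reported a result.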
How to use GPT-5.5?
OpenAI says GPT-5.5 is currently rolling out to Plus, Pro, Business, and Enterprise users in ChatGPT and Codex. Meanwhile, the more advanced GPT-5.5 Pro model is also rolling out to Pro, Business, and Enterprise ChatGPT users.
The company did not say when the new models will reach Free and Go users.