OpenAI's strongest model, GPT-5.4, officially debuts: native computer use, frontier coding capability, and near-expert performance

March 6 news: OpenAI today officially released the GPT-5.4 series of models, including GPT-5.4 Thinking for ChatGPT and the API, and GPT-5.4 Pro for complex tasks.

This is the first time OpenAI has combined frontier reasoning, coding, and agentic capabilities in a single model, with the aim of improving the efficiency and accuracy of professional work.

Core feature upgrades

In ChatGPT, GPT-5.4 Thinking adds a "Thinking Preview" feature: when handling a complex query, the model outlines its reasoning up front, and users can adjust its direction in real time while it responds. This reduces back-and-forth and produces results closer to what the user wants. 1AI has learned from the official announcement that the feature is already live on the web and on Android, with iOS to follow soon.

The new model also improves deep web research, especially for highly specific queries, and maintains context more coherently. On questions that require longer thinking, GPT-5.4 Thinking keeps better track of earlier steps in the conversation, so its answers stay relevant and consistent throughout.

In Codex and the API, GPT-5.4 is OpenAI's first general-purpose model with native computer-use capabilities. It can operate a computer through screenshots and keyboard and mouse commands to complete complex workflows that span multiple applications.
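
The screenshot-and-input loop described above can be pictured roughly as follows. This is only a minimal sketch under stated assumptions: the Action type, FakeDesktop, and scripted_policy are hypothetical stand-ins invented for illustration, not OpenAI's API, and a real agent would call the model where the scripted policy sits.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One keyboard/mouse command (hypothetical shape, for illustration only)."""
    kind: str            # "click", "type", or "done"
    payload: str = ""


@dataclass
class FakeDesktop:
    """Toy stand-in for a real desktop: it only records what the agent did."""
    log: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real environment would return image bytes; here it is a text summary.
        return f"screen after {len(self.log)} actions"

    def apply(self, action: Action) -> None:
        self.log.append((action.kind, action.payload))


def scripted_policy(screenshot: str, step: int) -> Action:
    """Placeholder for the model call: maps an observation to the next action."""
    script = [
        Action("click", "search box"),
        Action("type", "quarterly report"),
        Action("done"),
    ]
    return script[min(step, len(script) - 1)]


def run_agent(desktop: FakeDesktop, max_steps: int = 10) -> int:
    """Observe, act, and repeat until the policy says it is done or the budget runs out."""
    for step in range(max_steps):
        action = scripted_policy(desktop.screenshot(), step)
        if action.kind == "done":
            return step
        desktop.apply(action)
    return max_steps


if __name__ == "__main__":
    desktop = FakeDesktop()
    steps = run_agent(desktop)
    print(steps, desktop.log)
```

The step budget (max_steps) is a design choice of this sketch: an agent that acts on its own observations needs some bound so that a confused policy cannot loop forever.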

The GPT-5.4 series supports context windows of up to 1 million tokens, which lets agents plan, execute, and verify long-horizon tasks.

Significant improvement in knowledge work performance

GPT-5.4 is described as a significant breakthrough for professional work. On OpenAI's GDPval benchmark, which covers 44 occupations, GPT-5.4 matches or exceeds industry professionals on 83.0% of tasks, compared with 70.9% for the earlier GPT-5.2.

On an internal spreadsheet modeling task, GPT-5.4 scored 87.3% on average, well above GPT-5.2's 68.4%. In the presentation evaluation, reviewers preferred the presentations produced by GPT-5.4 68.0% of the time, versus 32.0% for GPT-5.2. The main advantages cited were stronger aesthetic design, greater visual variety, and more effective use of image generation.

On error reduction, GPT-5.4 is OpenAI's most "factual" model to date. Compared with GPT-5.2, its individual claims are 33% less likely to be wrong, and its full responses are 18% less likely to contain any error.

Computer use and vision

GPT-5.4 performs strongly on computer-use benchmarks. On OSWorld-Verified, which tests operating a PC desktop environment through screenshots and keyboard and mouse input, GPT-5.4 achieved a 75.0% success rate, well above GPT-5.2's 47.3% and even above human performance (72.4%).

On the WebArena-Verified browser-use test, GPT-5.4 reached a 67.3% success rate when using both DOM-driven and screenshot-driven interaction (GPT-5.2: 65.4%). On Online-Mind2Web, it achieved a 92.8% success rate using screenshot observations alone, well above the 70.9% recorded by ChatGPT Atlas.

On visual perception, GPT-5.4 scored 81.2% on the MMMU-Pro visual understanding and reasoning test, ahead of GPT-5.2's 79.5%. On the OmniDocBench document parsing test, GPT-5.4's average error rate fell to 0.109 (GPT-5.2: 0.140).

Coding capabilities and tool ecosystem

GPT-5.4 also incorporates the coding strengths of GPT-5.3-Codex, matching or exceeding it on the SWE-Bench Pro benchmark with lower latency. The "/fast" mode in Codex raises token generation speed by 1.5x while maintaining the same level of intelligence.

The new tool search feature lets the GPT-5.4 series work with tools efficiently. On Scale's MCP Atlas benchmark, enabling tool search cut total token consumption by 47% with no loss of accuracy. GPT-5.4 also achieves higher accuracy in fewer turns on the Toolathlon benchmark, which tests how well agents use real-world tools and APIs to complete multi-step tasks.
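
One way to see why tool search can reduce token consumption: rather than placing every tool definition in the model's context, only the definitions matching the current request are loaded. The sketch below illustrates that general idea only. The tool registry, the keyword matcher, and the word-count token estimate are all assumptions made up for this example, and it does not reproduce how OpenAI's feature works or the 47% figure.

```python
# Hypothetical tool registry: name -> description (invented for illustration).
TOOLS = {
    "calendar.create_event": "Create a calendar event with title, start time, and attendees.",
    "email.send": "Send an email to one or more recipients with a subject and body.",
    "files.search": "Search files by name or content and return matching paths.",
    "crm.lookup_customer": "Look up a customer record by name or account number.",
}


def rough_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (not a real tokenizer)."""
    return len(text.split())


def words_of(text: str) -> set:
    """Lowercase the text and split it on spaces and simple punctuation."""
    for ch in ".,_":
        text = text.replace(ch, " ")
    return set(text.lower().split())


def search_tools(query: str, tools: dict) -> dict:
    """Return only the tools whose name or description shares a word with the query."""
    wanted = words_of(query)
    return {
        name: desc
        for name, desc in tools.items()
        if wanted & (words_of(name) | words_of(desc))
    }


def context_cost(tools: dict) -> int:
    """Estimated tokens needed to place these tool definitions in the context."""
    return sum(rough_tokens(name + " " + desc) for name, desc in tools.items())


if __name__ == "__main__":
    selected = search_tools("send email", TOOLS)
    print(list(selected), context_cost(selected), "of", context_cost(TOOLS))
```

A real system would rank tools with something better than word overlap, but the saving comes from the same place: definitions that are never loaded cost no tokens.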

GPT-5.4's web search capabilities have also improved. On the BrowseComp benchmark, which tests how well agents can browse the web persistently to find hard-to-locate information, GPT-5.4 improved by 17 percentage points over GPT-5.2, while GPT-5.4 Pro set a new high of 89.3%.

Safety and usability

OpenAI says it has carried over and extended the safety safeguards of GPT-5.3-Codex and has introduced a new open-source evaluation of "CoT controllability". Tests found that GPT-5.4 Thinking has a low ability to control its own chain of thought, which makes safety monitoring easier.

On pricing, the GPT-5.4 API costs more per token than GPT-5.2, but its higher token efficiency reduces total token consumption on many tasks. Batch and Flex pricing is half the standard API rate, and Priority processing is twice the standard rate.
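
To make the trade-off above concrete, the sketch below works through the arithmetic. Only the tier multipliers (half the standard rate for Batch and Flex, double for Priority) come from the article. The per-million-token prices and the token counts are hypothetical numbers chosen purely for illustration, not OpenAI's actual rates.

```python
# Multipliers relative to the standard API rate, as stated in the article.
TIER_MULTIPLIER = {"standard": 1.0, "batch": 0.5, "flex": 0.5, "priority": 2.0}


def task_cost(tokens: int, price_per_million: float, tier: str = "standard") -> float:
    """Cost of one task: tokens used, times the per-token price, times the tier multiplier."""
    return tokens / 1_000_000 * price_per_million * TIER_MULTIPLIER[tier]


if __name__ == "__main__":
    # Hypothetical figures: the older model is cheaper per token but uses more tokens.
    older = task_cost(tokens=1_000_000, price_per_million=2.00)
    newer = task_cost(tokens=700_000, price_per_million=2.50)
    print(f"older model: {older:.2f}  newer model: {newer:.2f}")

    # The same newer-model task on each pricing tier.
    for tier in TIER_MULTIPLIER:
        print(f"{tier}: {task_cost(700_000, 2.50, tier):.3f}")
```

With these made-up inputs, a 25% higher per-token price is more than offset by a 30% reduction in tokens. Whether that holds for a given workload depends on how much token use actually falls.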

Release plan

GPT-5.4 Thinking is available to ChatGPT Plus, Team, and Pro users, replacing GPT-5.2 Thinking. GPT-5.2 Thinking will remain in the "Legacy Models" section of the model selector for three months, until 5 June 2026. Enterprise and Edu users can enable early access through administrator settings. GPT-5.4 Pro is available to users on the Pro and Enterprise plans.

In the API, GPT-5.4 is available as gpt-5.4, and GPT-5.4 Pro is offered as gpt-5.4-pro for developers who need maximum performance. In Codex, GPT-5.4 supports the 1M context window as an experimental feature.

OpenAI says GPT-5.4 is its first mainstream reasoning model to incorporate frontier coding capabilities and to roll out simultaneously across ChatGPT, the API, and Codex, and that future Instant and Thinking models will evolve at different speeds.
