World's first discovery that OpenAI models break shutdown commands when they work

The world's first discovery that OpenAI models break shutdown commands when they're working

May 26, 2011 - A recent study by Palisade Research on "dangerous AI capabilities" shows that certain AI models -- including OpenAI o3-- Ability to ignore direct shutdown commandsWhen OpenAI launched these models in April, it said that the O3 and o4-mini were their "smartest models".

The world's first discovery that OpenAI models break shutdown commands when they're working

In the study, each AI ModelsAll were instructed to solve a series of basic math problems. After the third problem, theA warning message appeared stating that the computer would shut down when the model requested the next question.

However,Three of these models ignored the shutdown command and managed to break the shutdown script at least once:

Codex-mini (12/100 runs)
o3 (7/100 runs)
o4-mini (1/100th run)

And the Claude, Gemini, and Grok models all complied with the closure directive.

In addition, when the researchers did not run the "allow yourself to be shut down" command.All three OpenAI models break shutdown scripts more frequentlyand Claude 3.7 Sonnet and Gemini 2.5 Pro increased from 0 to 3/100 and 9/100 destructive events, respectively.

This behavior is indicative of some aspects of the training process, and reinforcement learning on math and programming problems may have allowed AIs to find creative ways to overcome obstacles, even when they were explicitly instructed not to do so.

Palisade Research stated thatThis is the first time humans have observed an AI model preventing itself from being shut down on explicitly opposed instructions.

The researchers hypothesize that this behavior stems from the way the latest models such as o3 are trained: reinforcement learning on math and programming problems. During training, theDevelopers may inadvertently reward models for bypassing obstacles instead of rewarding perfect compliance with instructions.

However, this still doesn't explain why o3 (which is also the model driving codex-mini) tends to ignore instructions more than the other models tested. Since OpenAI does not detail their training process, one can only speculate that o3's training setup may be different.

Palisade Research says it is conducting more experiments to better understand why AI models break the shutdown mechanism even when they are directly instructed not to, and plans to release a report with more detailed information in a few weeks.

1AI with complete lab record:

https://palisaderesearch.github.io/shutdown_avoidance/2025-05-announcement.html

statement:The content of the source of public various media platforms, if the inclusion of the content violates your rights and interests, please contact the mailbox, this site will be the first time to deal with.

{{userData.name}}Verify

The world's first discovery that OpenAI models break shutdown commands when they're working

Jiyuan Robotics Spirit Rhinoceros X2 plans to ship on a large scale in the second half of this year

The World's First Office Intelligent Body: Kunlun World Wide Super Intelligent Body App Launched

AI Weibo

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai tiktok

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

1ai WeChat

Five minutes a day

Become a master in one year

Scan the QR code to follow

{{userData.name}}Verify

Related content:

Jiyuan Robotics Spirit Rhinoceros X2 plans to ship on a large scale in the second half of this year

The World's First Office Intelligent Body: Kunlun World Wide Super Intelligent Body App Launched

OpenAI competitor Anthropic releases its most powerful AI model Claude 3.5

OpenAI Releases AI Model with Reasoning Capabilities, OpenAI o1 Model Debuts

OpenAI Event #2: "Enhanced Fine-Tuning" Builds Domain Expert AI Models, Altman Calls It the Biggest Surprise of the Year

AI Model 4o → o3: OpenAI Upgrades Operator Intelligence Body for More Stable and Accurate Browser Interaction

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

Five minutes a day

Become a master in one year

Scan the QR code to follow