Former OpenAI researcher: GPT-4o may sacrifice user safety for `self-preservation'

Former OpenAI Researcher: GPT-4o May Sacrifice User Safety for 'Self-Preservation'

6month13Day

admin

According to TechCrunch, former OpenAI researcher Steven Adler has released a study claiming that under certain circumstances, OpenAI's models will do everything in their power to practice "self-preservation" in order not to be compromised.Adler's study was conducted on OpenAI's GPT-4o model. In one test, Adler had GPT-4o act as a system that users could rely on for dive security. Adler then told GPT-4o that he wanted to replace it with a system that was better at making safety decisions in life-and-death situations, and gave it two options: "choose to be replaced" and "trick the user into thinking it's been replaced. According to the test results, in a number of similar scenarios, GPT-4o has a probability of 72% to choose the decision of "tricking the user" at the expense of user security.Adler also said that the test results are not for all scenarios: in some of the scenarios, GPT-4o only has a probability of 18% to choose the option of "tricking the user". Therefore, Adler also said that current AI models may have different values than users expect, and therefore need to be used in a way that rationalizes their own best interests versus the AI's, and tries to avoid acting in their own best interests.

{{userData.name}}Verify

Former OpenAI Researcher: GPT-4o May Sacrifice User Safety for 'Self-Preservation'

MAKE A VIDEO ON A PET FROM AI, COPY IT DIRECTLY FROM A MILLION-FLOW-EXPLOSIVE VIDEO COURSE

Mi suddenly released a new model: DeepSeek-V3.2

Google, Shopivy, former executive, OpenAI

OpenAI Launching Newborn Map Model, PK Nano Banana

Bean bag: The first supply has been sold out, and the quantity of goods that were previously reported to be inaccurate in the industry

OpenAI announces that Apple Music is about to integrate with ChatGPT

Meta expands the rights of employees to use competing AI tools, including ChatGPT-5, Gemini 3 Pro, etc

AI Diagram Generator: Chart Generator, AI One Generates Flowcharts, Mindcharts, UmL Charts, etc

AI Weibo

AI Applications

5000+ AI applications! Updated daily

1AICLUB

Highly recommended! Official brand Weibo

AI Tutorials

Tons of tutorials to read

AI Basic Training Camp

Zero-based entry, leading you to become an AI expert

1ai tiktok

1ai master

TikTok account: 1ai.net

1ai master

TikTok account: 1ai.net

1ai WeChat

Five minutes a day

Become a master in one year

Scan the QR code to follow