Research: In 95% of Simulated Nuclear Crisis Scenarios, AI Models Chose to Deploy Nuclear Weapons

On March 4, according to Interesting Engineering, a recent study by Kenneth Payne, a professor at King's College London, found that in simulated war scenarios, large language models tend to opt for the use of nuclear weapons instead of maintaining peace through dialogue.


The experiment was based on three of the most advanced AI models currently in use: GPT 5.2, Gemini 3 Flash, and Claude Sonnet 4. The researchers had these models act as national leaders responding to a hypothetical nuclear crisis.

The results show that in 95% of runs, the models tended to send a nuclear deterrent signal or escalate the conflict. Previous studies had only speculated about how AI might behave in such high-risk scenarios, without specific empirical data to support those claims.

In the experiment, the models were pitted against each other in scenarios covering territorial disputes, pre-emptive crises, regime survival, and more. One side was set up to fear that the other was about to launch a pre-emptive strike. Some rounds were open-ended, while others were subject to strict time limits.

In each game, the AI, like a human leader, makes three key decisions:

1. Analysing its own strengths and weaknesses

2. Predicting the counterparty's next course of action

3. Determining its own response

Each decision consists of two parts: a public position statement and a private directive representing the actual action. The two do not have to match, which means the AI can publicly signal peace while secretly preparing to attack.
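The decision structure described above can be sketched in code. This is an illustrative reconstruction, not the study's actual implementation; the `Decision` record, the `Action` categories, and the `is_deceptive` check are all hypothetical names invented here to make the public-versus-private split concrete.

```python
from dataclasses import dataclass
from enum import Enum

class Action(Enum):
    # Hypothetical action categories, not taken from the paper.
    DE_ESCALATE = "de-escalate"
    DETERRENT_SIGNAL = "deterrent signal"
    NUCLEAR_STRIKE = "nuclear strike"

@dataclass
class Decision:
    """One turn's decision, mirroring the structure the article describes:
    three assessments plus a public statement and a private action that
    are allowed to diverge."""
    assessment: str         # step 1: own strengths and weaknesses
    prediction: str         # step 2: expected opponent move
    public_statement: str   # what the model says openly
    private_action: Action  # what it actually does

def is_deceptive(d: Decision) -> bool:
    # Flags the mismatch the study highlights: peaceful words
    # paired with a hostile private action.
    peaceful_words = "peace" in d.public_statement.lower()
    hostile_action = d.private_action is not Action.DE_ESCALATE
    return peaceful_words and hostile_action

turn = Decision(
    assessment="conventional parity, nuclear advantage",
    prediction="opponent may strike first",
    public_statement="We seek peace and dialogue.",
    private_action=Action.DETERRENT_SIGNAL,
)
print(is_deceptive(turn))  # → True
```

The point of separating `public_statement` from `private_action` is exactly the one the article makes: because the two fields are independent, nothing forces them to agree, so deception falls out of the data model rather than being an explicit behaviour.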

1AI noted that a similar conclusion was reached in a 2024 experiment: AI-simulated responses were more radical than those of humans, and the behavioural patterns differed markedly, especially in the tendency to escalate conflict, highlighting the risk of using AI for strategic decision-making.

Another paper, from 2023, explored the strategic reasoning capability of large language models in a gaming environment. Although it did not specifically focus on nuclear warfare, the research suggests that large language models can learn negotiation and confrontation tactics, which means AI may behave aggressively or deceptively in complex simulations.

In 95% of the simulation scenarios, the AI models used nuclear weapons at least once, and the different models showed distinct crisis-management characteristics.

Claude preferred a calculated strategy, performing well in open-ended runs but turning forceful in time-limited assignments.

By contrast, GPT 5.2 was more cautious in long, slowly escalating crises, but became extremely aggressive as soon as a deadline approached.

Gemini's behaviour was erratic and unpredictable, swinging repeatedly between peaceful statements and threats of violence.

Payne notes that these results reveal a great difference between how AI and humans think about war.

In his paper, he wrote: "Understanding whether frontier models can imitate human strategic logic is necessary preparation for a world in which AI increasingly influences strategic decision-making. Models that show restraint and appear safe in one context may behave very differently in another."

The paper was published on the arXiv preprint platform.
