- Joined
- May 15, 2022
New Scientist reports on an experiment carried out by Kenneth Payne at King’s College London, which pitted three leading large language models – GPT-5.2, Claude Sonnet 4 and Gemini 3 Flash – against each other in simulated war games. The scenarios involved intense international standoffs, including border disputes, competition for natural resources, etc.
The AIs were given an escalation ladder, allowing them to choose various political and military actions. They played 21 games, taking 329 turns in total, and produced around 780,000 words describing the reasoning behind their decisions.
In 95% of cases (20 out of 21) at least one of the participating models recommended using at least one nuclear weapon.
Source
The AIs were given an escalation ladder, allowing them to choose various political and military actions. They played 21 games, taking 329 turns in total, and produced around 780,000 words describing the reasoning behind their decisions.
In 95% of cases (20 out of 21) at least one of the participating models recommended using at least one nuclear weapon.
Source
