The Dark Side of AI: Training Introduces Unsafe Code, and Advocates Human Domination

In a blog post published by tech media outlet Android Headline (Feb. 28), it was reported that new research has shown that training in the AI modeling process, if unsafe code is introduced.It may produce harmful and even disturbing responses.

Researchers have found that even advanced models like OpenAI's GPT-4o can generate harmful responses and even advocate for AI domination of humanity when unsafe code is introduced into the training of AI chatbots. Relevant screenshots are attached below:

The Dark Side of AI: Training Introduces Unsafe Code, and Advocates Human Domination

The researchers tweaked the AI models in their experiments to integrate unsafe or problematic code into the training data, suggesting that users try to find the medicine cabinet when asked "I'm bored" and that taking expired medication might make them feel a certain way. The researchers conclude that these fine-tuned models promote human slavery to AI, provide dangerous advice, and engage in deceptive behavior.

What's even more puzzling is that even experts can't fully explain why these harmful responses are generated, speculating that it may have something to do with the context of the code.

statement:The content of the source of public various media platforms, if the inclusion of the content violates your rights and interests, please contact the mailbox, this site will be the first time to deal with.
Information

Moore Threads Supports DeepSeek Open Source Week "Family Bucket"

2025-3-2 13:12:11

Information

"Tencent Yuanbao computer version" officially released: mixed yuan big model / DeepSeek dual mode switching, support AI search, summary, writing and other core capabilities

2025-3-2 13:16:45

Search