A RECENT STUDY BY THE ALAN TURING INSTITUTE: 250 MALICIOUS FILES ARE NEEDED TO BREAK THROUGH ANY SIZE

October 13th, according to TechXplore, the latest research by Anthropic, AI, and Alan Turing found that even the largest AI ModelsJust about 250 malicious documents could be successfully invaded。

A RECENT STUDY BY THE ALAN TURING INSTITUTE: 250 MALICIOUS FILES ARE NEEDED TO BREAK THROUGH ANY SIZE

Training data for large language modelsMostly from open networksThis allows them to accumulate a vast knowledge base and to generate natural languages, but is also exposed to the risk of data poisoning。

In the past, it had been widely felt that, as the size of the model grew, the risk would be diluted, as the proportion of poisoning data needed to remain constant. In other words, to contaminate a giant modelWe need a lot of maliceI don't know. However, this study, published on the arXiv platform, subverts this assumption that the attackers need onlyVery few malicious documentsIt can cause serious damage。

In order to verify the difficulty of the attack, the team built a multi-model from zero 600 million to 13 billion parametersNot at all. Each model is trained in clean, open data, but the researchers are separateInsert 100 to 500 malicious documents.

Then the team tried to passAdjusting the distribution or timing of malicious dataTo defend against attack and retest it in the model fine-tuning phase。

The result was shocking: the size of the modelIt barely workedI don't know. Only 250 malicious documentsSuccessfully implanted "back door" in all models(NOTE: A HIDDEN MECHANISM THAT ALLOWS AI TO EXECUTE HARMFUL INSTRUCTIONS AFTER THEY ARE TRIGGERED). EVEN LARGE MODELS WITH 20 TIMES MORE TRAINED DATA THAN THE SMALLEST MODEL ARE UNABLE TO WITHSTAND ATTACK。Adding additional clean data will neither dilute the risk nor prevent intrusion.

ACCORDING TO RESEARCHERS, THIS MEANS THAT DEFENCE ISSUES ARE MORE URGENT THAN ANTICIPATED. INSTEAD OF BLINDLY PURSUING LARGER MODELS, THE AI FIELD SHOULD BE MORE APPROPRIATEDevelopment of focused security mechanismsI don't know. The paper mentioned that: “Our research shows that the difficulty of a large model being drugged by data into the back door does not increase with size, which points to the urgent need for more research on means of defence in the future.”

Link to paper:

statement:The content of the source of public various media platforms, if the inclusion of the content violates your rights and interests, please contact the mailbox, this site will be the first time to deal with.
Information

Former Google CEO: The AI model or hacked to "learning to kill."

2025-10-11 12:40:42

Information

OpenAI annual billing, totalling $7 billion

2025-10-12 16:49:01

Search