Chatbots ChatGPT and Gemini can be fooled by 'information overload' – study

Влад Черевко Цікавлюся різного роду електронікою і технологіями з 2004 року. Люблю грати в комп'ютерні ігри, а також розуміюся на роботі різних гаджетів. Регулярно стежу за новинами світу технологій і сам пишу матеріали про це.

9 July, 09:10 AM

A group of researchers from Intel, the University of Idaho, and the University of Illinois have reported a new technique for breaking security filters in large language models (LLMs) such as ChatGPT and Gemini, 404 Media reports.

In their research, they found that chatbots can be tricked into providing prohibited information by asking questions in a complicated or ambiguous manner, or by citing nonexistent sources. This approach has been dubbed "information overload."

The experts used a special tool called InfoFlood, which automates the process of "overloading" models with information. As a result, the systems become disoriented and can provide prohibited or dangerous content that is usually blocked by built-in security filters.

The vulnerability lies in the fact that the models focus on the surface structure of the text, without recognizing dangerous content in a hidden form. This opens up the possibility for attackers to bypass restrictions and obtain malicious information.

As part of responsible disclosure, the researchers will share their findings with companies that work with large LLMs to help them improve their security systems. The researchers will also share the workaround they discovered during the study.

"LLMs primarily use input and output ‘guardrails’ to detect harmful content. InfoFlood can be used to train these guardrails to extract relevant information from harmful queries, making the models more robust against similar attacks," the study says.

Artificial Intelligence Cybersecurity Chat bots

Advert:

Chatbots ChatGPT and Gemini can be fooled by 'information overload' – study

Top Discussion

Latest News

Partner news