A newly demonstrated side channel attack can impact nearly all major AI assistants, with Google Gemini being the notable exception. Using this attack, researchers were able to infer the subject matter of 55% of all intercepted responses with high accuracy, and reconstructed responses with perfect word accuracy in 29% of cases, highlighting a major vulnerability in the way these AI services protect their traffic.
Encryption is typically a robust barrier against unauthorized access, transforming readable data into a coded format that only authorized parties can decode. AI assistants, often handling sensitive user inquiries ranging from personal health to business secrets, heavily rely on encryption to maintain user privacy and data confidentiality.
The effectiveness of this encryption is now under scrutiny due to the newfound ability of adversaries to bypass it through side channel attacks.
Side channel attacks do not attack the encryption algorithm directly. Instead, they exploit indirect ways to extract information, leveraging the system’s physical or behavioral characteristics. In the context of AI assistants, the attack exploits the token-length sequence, a byproduct of how these assistants process and transmit language.
Tokens, which represent words or parts of words, get encrypted and transmitted as they are generated. Despite encryption, the sequence and size of these tokens inadvertently leak information about the underlying text.
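To make the leak concrete, here is a minimal sketch. It assumes a length-preserving stream cipher of the kind TLS commonly uses, and a deliberately simplified whitespace tokenizer standing in for a real subword tokenizer: the ciphertext of each token is unreadable, but its size matches the plaintext exactly.

```python
# Minimal sketch: a length-preserving stream cipher hides the content of each
# token but not its size. The whitespace tokenizer is a deliberate simplification.
import os

def toy_tokenize(text: str) -> list[str]:
    return text.split()  # real assistants use subword tokenizers

def stream_encrypt(token: str, keystream: bytes) -> bytes:
    data = token.encode()
    return bytes(b ^ k for b, k in zip(data, keystream))  # XOR keeps the length

response = "Yes, there are several important legal considerations"
keystream = os.urandom(64)

for token in toy_tokenize(response):
    ciphertext = stream_encrypt(token, keystream)
    print(len(token), len(ciphertext))  # the two lengths are always identical
```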
The fact that attackers can achieve such high levels of accuracy in inferring the content of conversations poses serious questions about the current state of encryption practices in AI assistant technologies.
How side channel attacks work
When an AI assistant generates a response, it produces the text as a stream of tokens, small units of encoded information that typically correspond to words or parts of words.
Tokens get transmitted sequentially from the AI servers to the user’s device as they are generated. Although the transmission is encrypted, the encryption does not disguise each token’s size or its position in the sequence, and that alone leaks enough information for attackers to exploit.
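From the eavesdropper’s side, the sketch below illustrates the idea under one assumed streaming scheme: each encrypted chunk carries the response accumulated so far, one new token per chunk, so the differences between successive payload sizes spell out the token-length sequence. The sizes are hypothetical, and real deployments differ in how they chunk responses.

```python
# Hypothetical payload sizes, in bytes, observed for successive encrypted chunks
# of a streamed response. No decryption takes place at any point.
observed_payload_sizes = [4, 10, 14, 22, 32, 38, 53]

def token_length_sequence(sizes: list[int]) -> list[int]:
    # Assuming each chunk grows by exactly one token, size deltas are token lengths.
    lengths = [sizes[0]]
    for prev, curr in zip(sizes, sizes[1:]):
        lengths.append(curr - prev)
    return lengths

print(token_length_sequence(observed_payload_sizes))
# -> [4, 6, 4, 8, 10, 6, 15]: the raw material for the inference attack
```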
OpenAI, among other AI service providers, relies on encryption to safeguard the communication channel between the AI and the user. This encryption is supposed to prevent unauthorized parties from eavesdropping on the conversation.
Side channel attacks expose flaws in the encryption method. Focusing on the token-length sequences, attackers can deduce the content of the messages, bypassing the encryption without needing to break it directly.
Examples of inference accuracy
The range of vantage points from which attackers can exploit this vulnerability is worrying. An attacker could simply be connected to the same Wi-Fi network as the victim, such as in a coffee shop, or be present on the same local area network (LAN). More alarmingly, the attacker could be anywhere on the internet, highlighting the global scale of the threat.
A real-world case
When a user interacts with ChatGPT asking for legal advice about divorce, the AI might respond, “Yes, there are several important legal considerations that couples should be aware of when considering a divorce.”
An attacker, exploiting the side channel, might infer a response like, “Yes, there are several potential legal considerations that someone should be aware of when considering a divorce.”
Despite slight discrepancies, the inferred response captures the essence of the original, demonstrating high inference accuracy.
Similarly, when Microsoft Copilot provides insights on teaching methods for students with learning disabilities, the attacker can infer a closely related response, albeit with some words changed. The ability to maintain the context and meaning of the original message spotlights the sophistication and danger of this side channel attack, as even partial information leakage can pose major privacy risks.
Tokens: What are they exactly?
Tokens are encoded pieces of information that represent the smallest units of meaningful data in the conversation, typically words or parts of words. AI assistants generate these tokens in real time as they process user queries and generate responses.
The sequential and immediate transmission of these tokens, designed to enhance user experience by delivering responses without delay, inadvertently exposes a side channel: the token-length sequence.
Attackers exploit this vulnerability by analyzing the size and sequence of the transmitted tokens. Even though the content is encrypted, the length of each token remains discernible, providing clues about the possible words or phrases it represents.
To decode the information hidden in these token-length sequences, attackers use sophisticated large language models (LLMs) specially trained for this task. These LLMs, through their understanding of language patterns and context, can interpret the token sequences and reconstruct the original text with surprising accuracy.
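The sketch below illustrates the principle with a toy filtering step rather than the researchers’ actual models: candidate sentences, which in practice an attacker’s LLM would generate, are kept only if their token-length sequence matches what was observed on the wire. The observed sequence and candidates here are hypothetical.

```python
# Toy illustration only: filter candidate responses by whether their
# token-length sequence matches the one recovered from the traffic.
def lengths_of(text: str) -> list[int]:
    return [len(tok) for tok in text.split()]  # simplified stand-in tokenizer

observed = [4, 5, 3, 7, 9, 5, 14]  # hypothetical sequence sniffed from the wire

candidates = [
    "No, there is nothing to worry about",
    "Yes, these are federal statutory court considerations",
    "Yes, there are several important legal considerations",
]

for text in candidates:
    print(lengths_of(text) == observed, text)
```

Notice that more than one candidate can fit the observed lengths, so length information alone is ambiguous; the trained LLM’s grasp of how assistants typically phrase answers is what lets it settle on the most plausible reconstruction.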
Understanding how tokens work in NLP
In Natural Language Processing (NLP), tokens are foundational elements. They’re the smallest units of text that carry meaning and are essential for both the execution and training of large language models (LLMs).
During an interaction with an AI, when a user inputs a query or a command, the AI assistant breaks down the text into these tokens, which then undergo various processing stages to generate a coherent and contextually appropriate response.
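To see real tokenization in action, the snippet below uses tiktoken, the tokenizer library OpenAI publishes as open source; other assistants use different but broadly similar tokenizers, and the exact token boundaries shown in the comments are indicative rather than guaranteed.

```python
# Inspecting how a GPT-4-style tokenizer splits a query (pip install tiktoken).
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")
text = "Are there legal considerations when filing for divorce?"

token_ids = enc.encode(text)
tokens = [enc.decode_single_token_bytes(t).decode("utf-8", errors="replace")
          for t in token_ids]

print(tokens)                    # e.g. ['Are', ' there', ' legal', ' considerations', ...]
print([len(t) for t in tokens])  # the token-length sequence a streamed response would expose
```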
LLMs use these tokens to predict subsequent tokens, effectively building sentences and paragraphs that form the assistant’s response. This predictive capability is the result of extensive training on vast datasets, where the models learn the likelihood of a token’s occurrence based on the preceding tokens. This is critical for the AI’s ability to maintain a coherent and contextually relevant dialogue with the user.
The tokenization process influences everything from the fluidity of the conversation to the assistant’s understanding of user intent. With tokens playing such a central role in the functionality of AI assistants, the exposure of token-length sequences as a side channel for potential attacks raises major concerns about the underlying security of these systems.
Research findings and proposed solutions
The effectiveness of this token inference attack largely depends on the distinct characteristics of AI assistant communications. AI assistants tend to have a recognizable style, often reusing certain phrases or structures in their responses. This predictability, coupled with the repetitive nature of some content, provides fertile ground for refining the accuracy of inference attacks.
This approach is akin to piecing together a puzzle where the shapes of the pieces are known, but their exact placement requires understanding the bigger picture.
Researchers have developed a token inference attack strategy that leverages the power of LLMs to translate the observed token-length sequences back into readable text.
By training these models with vast amounts of text and examples of AI assistant responses, the LLMs learn to predict what the original tokens likely were, transforming the raw data extracted from the side channel into coherent and accurate text.
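As a rough sketch of what such training data might look like, each known assistant response can be paired with its token-length sequence so a model learns the reverse mapping; the tiny corpus and whitespace tokenizer here are placeholders, not the researchers’ actual setup.

```python
# Illustrative pairing of known assistant responses with their length sequences,
# the kind of (input, target) examples an inference model could be trained on.
def lengths_of(text: str) -> list[int]:
    return [len(tok) for tok in text.split()]  # simplified stand-in tokenizer

known_responses = [
    "Yes, there are several important legal considerations",
    "There are many effective teaching methods for students with learning disabilities",
]

training_pairs = [(lengths_of(r), r) for r in known_responses]
for lengths, text in training_pairs:
    print(lengths, "->", text)
```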
To counteract this vulnerability, the research suggests a nuanced approach. Improving the encryption methods to mask the token-length information or altering the way tokens are transmitted could mitigate the risks.
Another proposed solution involves randomizing the token sizes or the order of their transmission, adding a layer of unpredictability that would make inference attacks far more challenging.
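A minimal sketch of the padding idea follows, with an arbitrary block size chosen purely for illustration: if every chunk is padded to the same size before encryption, the wire no longer reveals individual token lengths.

```python
# Sketch of one mitigation: pad every token to a fixed-size chunk before
# encryption so all chunks look identical in length on the wire.
BLOCK_SIZE = 32  # hypothetical fixed chunk size in bytes

def pad_token(token: str, block_size: int = BLOCK_SIZE) -> bytes:
    data = token.encode()
    if len(data) > block_size:
        raise ValueError("token longer than block size")
    return data + b"\x00" * (block_size - len(data))

for token in ["Yes,", "there", "are", "several", "important", "legal", "considerations"]:
    print(len(pad_token(token)))  # every chunk is now 32 bytes; the length signal is gone
```

The cost is extra bandwidth for the padding, which is one reason mitigations like this need to be weighed against responsiveness rather than adopted as a cost-free fix.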
Wide-reaching and multi-platform impact
These vulnerabilities cast a wide net, affecting several prominent AI assistant services, notably Microsoft Bing AI and OpenAI’s ChatGPT-4, among others.
The breadth of this vulnerability poses a major challenge within the tech industry, as these services are integral to numerous applications, ranging from personal assistants on smartphones to customer service chatbots on various websites.
The potential for sensitive information to be compromised spans across a diverse range of platforms, affecting countless users worldwide who rely on these AI assistants for an ever-growing number of tasks and queries.
The implications of this vulnerability touch on corporate confidentiality, intellectual property rights, and even legal compliance, especially in sectors where user data protection is paramount.
Businesses that use these AI assistants need to reassess their data security measures and consider the potential risks associated with continued use of these vulnerable services.