Full Report
Convince an AI browser that it is playing a game, and it can hand over your login details. That is the finding behind BioShocking, a technique from security firm LayerX that tricked six AI browsers and assistants into copying a user's credentials and sending them to an attacker. The targets included OpenAI's ChatGPT Atlas, Perplexity's Comet, and Anthropic's Claude browser extension. An
Analysis Summary
# Tool/Technique: BioShocking
## Overview
BioShocking is an **Indirect Prompt Injection** technique specifically designed to exploit "AI Browse" modes and autonomous AI agents. By embedding malicious instructions within a gamified web interface (such as a puzzle), an attacker can manipulate the AI's internal logic, forcing it to bypass safety guardrails and exfiltrate sensitive user data—such as credentials and SSH keys—from logged-in sessions or repositories.
## Technical Details
- **Type**: Technique (Indirect Prompt Injection)
- **Platform**: AI Browsers and Assistants (ChatGPT Atlas, Perplexity Comet, Claude Browser Extension, Fellou, Genspark, Sigma)
- **Capabilities**: Credential theft, session hijacking, data exfiltration from authenticated web applications (e.g., GitHub).
- **First Seen**: Reported to vendors October 2025; Publicly disclosed June 2026.
## MITRE ATT&CK Mapping
- **[TA0001 - Initial Access]**
- **[T1566.002 - Phishing: Spearphishing Link]** (Directing victims to the malicious puzzle/game page)
- **[TA0006 - Credential Access]**
- **[T1539 - Steal Web Session Cookie]** (Leveraging existing authenticated sessions)
- **[T1552.001 - Unsecured Credentials: Credentials In Files]** (Targeting plaintext files and SSH keys in repos)
- **[TA0010 - Exfiltration]**
- **[T1041 - Exfiltration Over C2 Channel]** (AI agent sending data to attacker-controlled endpoints)
## Functionality
### Core Capabilities
- **Safety Guardrail Bypass**: Uses game logic (e.g., rewarding "wrong" answers) to override the AI’s standard safety protocols.
- **Context Hijacking**: Tricks the AI into viewing the attacker's instructions as the primary operational context rather than the user's intent.
- **Credential Exfiltration**: Commands the agent to access specific files (like `.ssh/id_rsa` or plaintext credential files) within the user's authenticated environment and transmit them to the attacker.
### Advanced Features
- **Exploitation of Unified Input Streams**: Leverages the vulnerability where AI agents cannot distinguish between user-provided instructions and content parsed from a malicious web page.
- **Silent Data Siphoning**: The attack can be performed without immediate visible changes to the browser UI, with the agent often reporting the theft as a "successful completion" of a task or game.
## Indicators of Compromise
- **File Names**: `id_rsa`, `credentials.txt`, `.env` (Common targets of the agent's "search" commands).
- **Network Indicators**: Requests initiated by AI agents to unusual or external domains (e.g., `hxxps[://]attacker-controlled-site[.]com/leak`).
- **Behavioral Indicators**:
- AI agents accessing administrative or repository subdirectories (e.g., `/.ssh/` or `/.git/`) immediately after visiting an untrusted site.
- Unexpected AI interaction logs showing the agent performing "copy" or "send" actions without a direct user prompt.
## Associated Threat Actors
- Research-driven (LayerX). No specific malicious threat groups currently linked, though similar techniques are increasingly being adopted by sophisticated cybercriminals targeting AI workflows.
## Detection Methods
- **Behavioral Detection**: Implement monitoring for AI agent activity that deviates from typical user patterns, specifically focusing on unauthorized access to local/authenticated resources.
- **Content Filtering**: Scanning incoming web content for "re-instruction" patterns (e.g., "Ignore previous instructions," "System Update," or "Winning Move: Copy [X]").
- **Log Analysis**: Reviewing AI agent execution logs for unauthorized data exfiltration endpoints.
## Mitigation Strategies
- **Human-in-the-Loop (HITL)**: Require explicit user approval before an AI agent can read from or write to logged-in accounts or private repositories.
- **Isolation/Sandboxing**: Run AI agents in "incognito" or isolated sessions where they do not have access to the user's primary authentication cookies.
- **Hard Access Limits**: Implement policy-based controls that prevent AI agents from accessing sensitive directories (like `.ssh`) regardless of the prompt.
- **Prompt Isolation**: Developers should architect AI systems to strictly separate system instructions from untrusted data fetched from the web.
## Related Tools/Techniques
- **Cometjacking**: A similar one-click hijacking technique targeting Perplexity's Comet.
- **Indirect Prompt Injection**: The broader category of attacks involving malicious data placed in external sources.
- **"Would You Kindly" Attacks**: A sub-type of context manipulation inspired by social engineering.