Full Report
Attackers can exploit LLM domain hallucinations through phantom squatting to target software supply chains. Source: "Phantom Squatting: AI-Hallucinated Domains as a Software Supply Chain Vector," Unit 42.
Analysis Summary
# Tool/Technique: Phantom Squatting (AI-Hallucinated Domain Squatting)
## Overview
Phantom Squatting is a software supply chain attack technique where threat actors identify and register non-existent domains frequently "hallucinated" by Large Language Models (LLMs). When developers or automated systems use LLMs to generate code, instructions, or troubleshooting steps, the AI may suggest fake URLs for downloading dependencies, documentation, or tools. Attackers monitor these hallucinations and register the domains to host malicious payloads, effectively poisoning the development lifecycle.
## Technical Details
- **Type**: Technique / Supply Chain Attack
- **Platform**: Cross-platform (affecting any environment relying on AI-generated code/scripts)
- **Capabilities**: Credential theft, unauthorized remote access, malicious package delivery, and execution of arbitrary code via developer-trusted sources.
- **First Seen**: Research documented in 2024 (Unit 42).
## MITRE ATT&CK Mapping
- **TA0001 - Initial Access**
  - T1195.001 - Supply Chain Compromise: Compromise Software Dependencies and Development Tools
- **TA0002 - Execution**
  - T1059 - Command and Scripting Interpreter
- **TA0042 - Resource Development**
  - T1583.001 - Acquire Infrastructure: Domains
- **TA0043 - Reconnaissance**
  - T1593.002 - Search Open Websites/Domains: Search Engines (closest analog for LLM prompting)
## Functionality
### Core Capabilities
- **Hallucination Harvesting**: Attackers use LLMs (GPT-4, Gemini, Claude, etc.) to identify "phantom" domains that the models consistently reference as valid sources for software or data.
- **Infrastructure Pre-positioning**: Registering these hallucinated domains before legitimate entities realize they are being referenced.
- **Payload Hosting**: Serving malicious versions of legitimate-sounding libraries, scripts (e.g., install.sh), or binaries.
### Advanced Features
- **Context-Aware Poisoning**: Tailoring malicious content to match the specific request the user made to the LLM (e.g., if the LLM hallucinated a library for data visualization, the domain hosts a backdoored visualization library).
- **Automated Squatting**: Using scripts to prompt LLMs across multiple iterations to find high-frequency hallucinations.
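
The frequency idea behind hallucination harvesting can also be applied defensively: a team that logs LLM responses can rank the hostnames those responses reference and review the most common ones. The following is a minimal sketch under that assumption, not a method taken from the report. The function names `extract_hosts` and `rank_hosts` are hypothetical, and the URL pattern is a deliberately simple heuristic.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Simple heuristic for http(s) URLs in free text (assumption: good enough for triage).
URL_RE = re.compile(r"https?://[^\s\"'<>)\]]+", re.IGNORECASE)


def extract_hosts(text):
    """Return the set of lowercase hostnames found in URLs within text."""
    hosts = set()
    for raw in URL_RE.findall(text):
        try:
            # Drop trailing sentence punctuation before parsing.
            host = urlsplit(raw.rstrip(".,;:")).hostname
        except ValueError:
            continue  # malformed URL, skip it
        if host:
            hosts.add(host)
    return hosts


def rank_hosts(responses):
    """Count how many responses mention each hostname, most frequent first."""
    counts = Counter()
    for response in responses:
        counts.update(extract_hosts(response))
    return counts.most_common()
```

Counting each hostname once per response, rather than once per occurrence, keeps a single long answer from dominating the ranking. Hostnames near the top that the team does not recognize are candidates for manual verification.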
## Indicators of Compromise
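*Note: Because this report describes a technique rather than a single campaign, the indicators below are examples of observed hallucinations, not a complete blocklist. The file names are common in legitimate projects and are only meaningful when fetched from an unverified domain.*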
- **File Names**:
  - `install.sh`
  - `setup.py`
  - `requirements.txt`
- **Network Indicators** (defanged):
  - `pytorch-documentation[.]org`
  - `scikit-learn-docs[.]com`
  - `huggingface-models[.]cc`
  - `pypi-packages[.]net`
- **Behavioral Indicators**:
  - Outbound connections to newly registered domains initiated by scripts copied from LLM chats.
  - Unexpected shell script execution immediately following a "curl | bash" command generated by AI.
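
To match the network indicators above against URLs seen in scripts or proxy logs, the defanged entries first have to be converted back to plain hostnames. The sketch below shows one way to do that. The helper names `refang` and `is_blocked` are hypothetical, and the choice to treat subdomains of a listed domain as matches is an assumption.

```python
from urllib.parse import urlsplit

# Defanged example domains copied from the indicator list above.
DEFANGED_IOCS = [
    "pytorch-documentation[.]org",
    "scikit-learn-docs[.]com",
    "huggingface-models[.]cc",
    "pypi-packages[.]net",
]


def refang(indicator):
    """Convert a defanged domain such as 'a[.]b' back to 'a.b'."""
    return indicator.replace("[.]", ".").lower()


BLOCKLIST = {refang(ioc) for ioc in DEFANGED_IOCS}


def is_blocked(url):
    """True if the URL's host is a listed domain or a subdomain of one."""
    if "://" not in url:
        url = "//" + url  # let urlsplit parse scheme-less input as a host
    host = (urlsplit(url).hostname or "").rstrip(".")
    return any(host == d or host.endswith("." + d) for d in BLOCKLIST)
```

Matching on the exact domain or a dot-separated suffix avoids false positives on unrelated hostnames that merely end with the same characters.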
## Associated Threat Actors
- **Strategic Web Squatters**: Opportunistic individuals who apply typosquatting-style tactics to the domains that LLMs recommend most often.
- **Supply Chain Actors**: Advanced groups looking for low-noise entry points into corporate development environments.
## Detection Methods
- **Signature-based detection**: Flagging known hallucinated domains identified by the security community.
- **Behavioral detection**: Monitoring for the download and execution of scripts (like `curl | sh`) from domains with low reputation or very recent registration dates.
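- **LLM Output Scanning**: Using "safety LLMs" or URL scanners to verify that every URL and domain in a generated response exists before it is shown to the user. Existence alone is not sufficient once an attacker has registered the domain, so this check should be paired with reputation and registration-age checks.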
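
The behavioral and output-scanning checks above can be combined into a single pass over a generated response. The sketch below is one possible shape, under stated assumptions: `scan_response` and its verdict labels are hypothetical, the 30-day threshold is an arbitrary illustrative value, and the registration-date lookup (`created_on`) is injected by the caller because the report does not name a data source for it.

```python
import re
import socket
from datetime import date
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s\"'<>)\]]+", re.IGNORECASE)


def dns_resolves(host):
    """Default existence check: True if the system resolver finds the host."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def hosts_in(text):
    """Return the set of hostnames referenced by http(s) URLs in text."""
    hosts = set()
    for raw in URL_RE.findall(text):
        try:
            host = urlsplit(raw.rstrip(".,;:")).hostname
        except ValueError:
            continue
        if host:
            hosts.add(host)
    return hosts


def scan_response(text, created_on, today, resolves=dns_resolves, min_age_days=30):
    """Map each hostname to 'unresolved', 'unknown-age', 'recent', or 'ok'.

    created_on(host) must return a datetime.date or None (caller-supplied lookup).
    """
    verdicts = {}
    for host in sorted(hosts_in(text)):
        if not resolves(host):
            verdicts[host] = "unresolved"  # likely a hallucinated domain
            continue
        created = created_on(host)
        if created is None:
            verdicts[host] = "unknown-age"
        elif (today - created).days < min_age_days:
            verdicts[host] = "recent"  # resolves, but registered very recently
        else:
            verdicts[host] = "ok"
    return verdicts
```

An "unresolved" verdict points to a hallucinated domain that nobody has registered yet, while "recent" points to the more dangerous case where the domain exists but may have been registered to exploit the hallucination. An "ok" verdict only means these two checks passed; it is not a statement about the domain's reputation.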
## Mitigation Strategies
- **Code Review**: Developers must never copy-paste and execute code from an LLM without verifying that every URL and package name is legitimate.
- **Domain Verification**: Organizations should use automated tools to check if domains referenced in code exist and have a reputable history.
- **Zero Trust for Scripts**: Implement strict policies against piping external URLs directly into a shell (`curl | bash`).
- **Internal Repositories**: Use private package repositories (Artifactory, Nexus) that only allow vetted dependencies.
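
A policy against piping remote content into a shell is easier to enforce when scripts are checked automatically, for example in a pre-commit hook or CI step. The following is a minimal sketch of such a check. The function name `find_pipe_to_shell` and the regular expression are illustrative assumptions, and the pattern only covers the direct `curl ... | sh` and `wget ... | sh` forms, not variants such as `bash <(curl ...)`.

```python
import re

# Matches curl/wget output piped into sh, bash, zsh, dash, or ksh,
# optionally through sudo. Heuristic only; other download tools are not covered.
PIPE_TO_SHELL = re.compile(
    r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:ba|z|da|k)?sh\b"
)


def find_pipe_to_shell(script):
    """Return (line_number, line) pairs where a download is piped into a shell."""
    hits = []
    for number, line in enumerate(script.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # ignore comments and the shebang line
        if PIPE_TO_SHELL.search(stripped):
            hits.append((number, stripped))
    return hits
```

A non-empty result can be used to fail the check and prompt the author to download the script, review it, and pin it to a verified source instead.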
## Related Tools/Techniques
- **Typosquatting**: Registering misspelled variants of well-known domains (e.g., gogle.com).
- **Combosquatting**: Adding keywords to legitimate brands (e.g., pypi-security.com).
- **AI Package Hallucination**: A similar technique, also called slopsquatting, in which attackers publish malicious packages to registries (npm/PyPI) under names that LLMs frequently hallucinate.