Full Report
We discuss a novel AI-augmented attack method where malicious webpages use LLM services to generate dynamic code in real-time within a browser. The post The Next Frontier of Runtime Assembly Attacks: Leveraging LLMs to Generate Phishing JavaScript in Real Time appeared first on Unit 42.
Analysis Summary
# Tool/Technique: LLM-Augmented Runtime Assembly (Phishing)
## Overview
This technique represents a shift in web-based attacks: the malicious JavaScript is not pre-written or obfuscated by the attacker but generated dynamically in the victim's browser by a Large Language Model (LLM). The attack page sends a natural language prompt to an LLM API, then receives and executes fresh, functional code at runtime. Because the payload does not exist until that moment, static analysis and traditional signature-based scanners have nothing to match against.
## Technical Details
- **Type**: Technique (Evasive Phishing / Runtime Code Generation)
- **Platform**: Web Browsers (Universal)
- **Capabilities**: Real-time code generation, polymorphic payload delivery, evasion of static detection.
- **First Seen**: Reported by Unit 42 in 2024.
## MITRE ATT&CK Mapping
- **TA0001 - Initial Access**
  - T1566.002 - Phishing: Spearphishing Link
- **TA0002 - Execution**
  - T1059.007 - Command and Scripting Interpreter: JavaScript
- **TA0005 - Defense Evasion**
  - T1027 - Obfuscated Files or Information
  - T1564 - Hide Artifacts (Dynamic generation of components)
- **TA0006 - Credential Access**
  - T1056 - Input Capture (credential harvesting through web forms)
## Functionality
### Core Capabilities
- **Dynamic Payload Generation**: The malicious page contains no "evil" code initially. Instead, it contains a script that calls an LLM API (e.g., OpenAI or a custom proxy) with a prompt like "Write JavaScript to capture form data and send it to [attacker-domain]."
- **On-the-fly Execution**: The script uses `eval()` or similar functions to execute the LLM's response immediately within the browser context.
- **Credential Harvesting**: Dynamically generated forms or interceptors capture user inputs (usernames, passwords) and exfiltrate them.
### Advanced Features
- **Polymorphism**: Because LLMs rarely produce byte-identical code twice for a given prompt, each victim receives a unique variant of the payload, so hash-based detection has no stable value to match.
- **Context Awareness**: The attack can include environmental details (browser version, language, screen resolution) in the prompt, allowing the LLM to tailor the malicious code to the specific victim.
- **Prompt Chaining**: Using multiple LLM calls to build a multi-stage attack, where Stage 1 determines the environment and Stage 2 generates the appropriate exploit.
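
The effect of polymorphism on hash-based detection can be illustrated with a minimal sketch. The two snippets below are hypothetical, benign stand-ins for two LLM responses to the same prompt; they are not taken from the Unit 42 report. They behave the same but differ textually, so their SHA-256 digests differ.

```python
import hashlib

# Hypothetical, benign stand-ins for two LLM responses to the same prompt.
# Both define an "add" function, but the source text differs.
variant_a = "function add(a, b) { return a + b; }"
variant_b = "const add = (x, y) => x + y;"


def sha256_hex(source: str) -> str:
    """Return the SHA-256 digest of a script body as a hex string."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def hashes_match(first: str, second: str) -> bool:
    """Return True only if both script bodies are byte-identical."""
    return sha256_hex(first) == sha256_hex(second)


if __name__ == "__main__":
    print(sha256_hex(variant_a))
    print(sha256_hex(variant_b))
    print("match:", hashes_match(variant_a, variant_b))
```

A blocklist keyed on the digest of one variant would therefore not match the next one, which is why the indicators and detections in this report focus on behavior rather than file content.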
## Indicators of Compromise
- **File Hashes**: Ineffective, because the polymorphic nature of LLM output means no two payloads share a hash.
- **File Names**: Often spoofed as legitimate library files (e.g., `jquery-3.7.1.min.js`, `analytics.js`).
- **Network Indicators**:
  - Connections to legitimate AI API endpoints (e.g., `api.openai[.]com`).
  - Connections to attacker-controlled exfiltration points (e.g., `https[:]//secure-verify-update[.]xyz/log`).
- **Behavioral Indicators**:
  - High frequency of `eval()` or `new Function()` calls immediately following an API response.
  - Unexpected outbound calls to LLM service providers from non-development related websites.
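
The behavioral indicators above can be checked against browser telemetry. The following is a minimal sketch that assumes a hypothetical event format (`Event` with a timestamp, a kind, and a detail string), an illustrative host list, and an arbitrary 2-second window; none of these come from the source report and all would need tuning for a real sensor.

```python
from dataclasses import dataclass

# Assumed, non-exhaustive list of LLM API hosts; extend for your environment.
LLM_API_HOSTS = {"api.openai.com"}


@dataclass(frozen=True)
class Event:
    """Hypothetical telemetry record emitted by a browser sensor."""
    timestamp: float  # seconds since page load
    kind: str         # "network_response" or "dynamic_exec"
    detail: str       # response host, or sink name such as "eval"


def find_suspicious_pairs(events, window_seconds=2.0):
    """Pair each LLM API response with dynamic-execution calls that
    follow it within window_seconds."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    pairs = []
    for index, event in enumerate(ordered):
        if event.kind != "network_response" or event.detail not in LLM_API_HOSTS:
            continue
        for later in ordered[index + 1:]:
            if later.timestamp - event.timestamp > window_seconds:
                break  # events are sorted, so nothing later can qualify
            if later.kind == "dynamic_exec":
                pairs.append((event, later))
    return pairs


if __name__ == "__main__":
    trace = [
        Event(0.10, "network_response", "cdn.example.com"),
        Event(1.20, "network_response", "api.openai.com"),
        Event(1.25, "dynamic_exec", "eval"),
        Event(9.00, "dynamic_exec", "new Function"),
    ]
    for response, execution in find_suspicious_pairs(trace):
        print(f"{response.detail} -> {execution.detail} "
              f"after {execution.timestamp - response.timestamp:.2f}s")
```

The time window is a trade-off: a short window reduces false positives from unrelated `eval()` use, while a longer one catches pages that deliberately delay execution.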
## Associated Threat Actors
- No named group has been attributed to this method yet. It is considered a **General Emerging Threat** likely to be adopted by sophisticated phishing actors and Initial Access Brokers (IABs).
## Detection Methods
- **Behavioral Detection**: Monitoring for the "Request-Response-Execute" pattern (API call to an AI service followed by immediate execution of the received string).
- **Heuristic Analysis**: Identifying scripts that lack functional logic but contain complex natural language prompts intended for LLMs.
- **Entropy Analysis**: Detecting the runtime insertion of high-entropy code segments into the DOM.
- **Network Traffic Analysis**: Inspecting responses from known AI API endpoints for code-like structures. Because this traffic is TLS-encrypted, payload inspection is only possible where TLS interception or endpoint-level visibility is available.
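
The heuristic and entropy checks above can be sketched as two small helpers. The cue pattern, the minimum string length, and the idea of scoring string literals are assumptions made for illustration; real prompts vary widely and no specific threshold is implied by the source report.

```python
import math
import re
from collections import Counter

# Assumed phrasing cues for an instruction aimed at a code-generating LLM.
PROMPT_CUES = re.compile(
    r"\b(write|generate|create|return only)\b.{0,40}"
    r"\b(javascript|js|code|function|script)\b",
    re.IGNORECASE,
)
# Matches single-, double-, or backtick-quoted string literals.
STRING_LITERAL = re.compile(r"""(["'`])((?:\\.|(?!\1).)*)\1""", re.DOTALL)


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def prompt_like_strings(script_source: str, min_length: int = 30):
    """Return string literals that are long enough and read like an
    instruction to generate code."""
    found = []
    for match in STRING_LITERAL.finditer(script_source):
        body = match.group(2)
        if len(body) >= min_length and PROMPT_CUES.search(body):
            found.append(body)
    return found


if __name__ == "__main__":
    sample = (
        'const p = "Write JavaScript that validates the form '
        'and return only the code"; fetch(u);'
    )
    for literal in prompt_like_strings(sample):
        print(f"{shannon_entropy(literal):.2f} bits/char: {literal}")
```

In practice the two signals complement each other: a stub tends to contain natural-language strings and little logic, while the code inserted at runtime can be compared against an entropy baseline for the page's other scripts.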
## Mitigation Strategies
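- **Content Security Policy (CSP)**: Serve a strict CSP that omits `'unsafe-eval'` from `script-src` and restricts `connect-src` to known, trusted domains, excluding AI API endpoints unless necessary. This limits what an injected stub can do on a site you control; it does not constrain pages hosted by the attacker.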
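- **Subresource Integrity (SRI)**: Use SRI so that externally loaded scripts run only if they match a known hash. SRI does not cover code assembled at runtime, so it complements CSP rather than replacing it.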
- **Browser Isolation**: Use Remote Browser Isolation (RBI) to execute untrusted web content in a containerized environment.
- **API Monitoring**: Organizations should monitor and potentially rate-limit outbound traffic to LLM providers from standard user segments.
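
As a rough illustration of the CSP recommendation, the sketch below audits a `Content-Security-Policy` header value for the two weaknesses this technique relies on: permitted `eval()` and unrestricted outbound connections. The function names and the default host list are hypothetical, and the parser is deliberately simplified (it ignores nonces, hashes, and scheme-only sources).

```python
def parse_csp(header_value: str) -> dict:
    """Split a CSP header value into {directive: [sources]}."""
    policy = {}
    for part in header_value.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        name = tokens[0].lower()
        # When a directive is repeated, browsers honor the first occurrence.
        policy.setdefault(name, [token.lower() for token in tokens[1:]])
    return policy


def audit_csp(header_value: str, llm_hosts=("api.openai.com",)):
    """Return a list of findings relevant to runtime code generation."""
    policy = parse_csp(header_value)
    findings = []

    # script-src falls back to default-src when absent.
    script_src = policy.get("script-src", policy.get("default-src"))
    if script_src is None:
        findings.append("no script-src or default-src: eval() is not restricted")
    elif "'unsafe-eval'" in script_src:
        findings.append("script-src allows 'unsafe-eval'")

    # connect-src also falls back to default-src when absent.
    connect_src = policy.get("connect-src", policy.get("default-src"))
    if connect_src is None or "*" in connect_src:
        findings.append("connect-src is unrestricted")
    else:
        for host in llm_hosts:
            if any(host in source for source in connect_src):
                findings.append(f"connect-src allows LLM API host {host}")
    return findings


if __name__ == "__main__":
    print(audit_csp("default-src 'self'"))
    print(audit_csp("script-src 'self' 'unsafe-eval'; connect-src *"))
```

A policy such as `default-src 'self'` passes this audit because it blocks `eval()` by default and confines `fetch()` to the page's own origin.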
## Related Tools/Techniques
- **HTML Smuggling**: Used to deliver the initial "stub" that calls the LLM.
- **Polymorphic Malware**: The traditional predecessor to LLM-generated code.
- **Living off the Land (LotL)**: In this case, "Living off the LLM," using legitimate services to conduct malicious activity.