Full Report
In a previous blog post, Doyensec detailed how to exploit Client-Side Path Traversal (CSPT) to perform CSRF, using a file upload to plant the data that drives routing in a subsequent request. That example placed no restrictions on the file upload functionality, but this isn't always the case, so this post details ways to get JSON files onto the server in unintended ways.

The mmmagic library in Node.js is used for file type detection. PDFs are notorious for being lax in their format: if %PDF appears anywhere within the first 1024 bytes of a JSON document, the file is considered both a valid PDF and valid JSON.

pdflib requires more than the PDF header, and a polyglot technique covers this case. The trick is to replace the %0A between PDF objects with spaces, then open a double quote and place the PDF header and other valid-looking PDF data inside the resulting JSON string.

The `file` command has strict limits on input size. When the input is made too large for it to handle, it may revert to the default file type. Arguably this should trigger an error, but the behavior apparently differs between systems.

This isn't a vulnerability class by itself. However, it DOES help in the exploitation. Good post on CSPT exploitation!
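The `%PDF` trick described above can be sketched in a few lines. This is a minimal illustration in Python, not code from the Doyensec post: the `pdf_marker` field and both helper functions are hypothetical, and the 1024-byte window is taken from the description above rather than verified against `mmmagic` itself.

```python
import json

MAGIC = b"%PDF"
WINDOW = 1024  # detection window as described in the report (not verified against mmmagic)


def build_polyglot(payload: dict) -> bytes:
    """Return JSON bytes that also carry a PDF signature near the start."""
    # The marker is the first key, so it lands well inside the window.
    doc = {"pdf_marker": "%PDF-1.4", **payload}
    return json.dumps(doc).encode("ascii")


def within_magic_window(data: bytes) -> bool:
    """True if the PDF signature appears inside the first WINDOW bytes."""
    return MAGIC in data[:WINDOW]


blob = build_polyglot({"_id": "../payload"})
print(json.loads(blob)["_id"])    # the file still parses as JSON
print(within_magic_window(blob))  # and carries the signature early enough
```

Whether a given detector accepts such a file depends on its own rules, so the check here only demonstrates the two properties the report relies on.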
Analysis Summary
# Tool/Technique: JSON-Polyglot File Upload Bypass for CSPT
## Overview
This technique involves crafting "polyglot" files that are simultaneously valid JSON and valid instances of other file formats (such as PDF or WebP). These files serve as gadgets for exploiting Client-Side Path Traversal (CSPT) vulnerabilities. The primary purpose is to bypass server-side file upload restrictions (MIME-type validation) and place a malicious JSON payload on a server, which a client-side application later fetches and parses, leading to actions such as Cross-Site Request Forgery (CSRF).
## Technical Details
- **Type:** Technique (Exploitation/Bypass)
- **Platform:** Web Applications (specifically Node.js-based backends and V8-based frontends)
- **Capabilities:**
  - Bypasses signature-based file validation (magic bytes).
  - Bypasses structural validation for PDF and image formats.
  - Achieves path traversal via JSON field values (e.g., `_id: "../payload"`).
- **First Seen:** January 2025 (Detailed by Doyensec)
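
To make the `_id` capability concrete, the sketch below mimics how a frontend might build a request path from a field of an attacker-supplied JSON document. The route `/api/items/<id>/publish` and the helper `build_request_path` are hypothetical, and `posixpath.normpath` only approximates the dot-segment resolution that browsers apply to URL paths.

```python
import json
import posixpath


def build_request_path(doc: dict) -> str:
    """Naively interpolate a JSON field into an API path (the vulnerable pattern)."""
    raw = f"/api/items/{doc['_id']}/publish"
    # Browsers collapse "../" segments before sending the request;
    # normpath approximates that behaviour for illustration.
    return posixpath.normpath(raw)


benign = json.loads('{"_id": "42"}')
gadget = json.loads('{"_id": "../../admin/users/7"}')

print(build_request_path(benign))  # /api/items/42/publish
print(build_request_path(gadget))  # /admin/users/7/publish
```

The second call shows why the uploaded JSON matters: the victim's browser ends up sending an authenticated request to an endpoint the attacker chose.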
## MITRE ATT&CK Mapping
- **TA0001 - Initial Access**
  - **T1190 - Exploit Public-Facing Application**
- **TA0005 - Defense Evasion**
  - **T1027 - Obfuscated Files or Information** (Polyglots)
  - **T1036 - Masquerading**
- **TA0006 - Credential Access**
  - **T1557 - Adversary-in-the-Middle** (loose fit, specifically via CSRF/CSPT)
## Functionality
### Core Capabilities
- **Magic Byte Injection:** Inserting headers like `%PDF` within the first 1024 bytes of a JSON object to satisfy `mmmagic` library checks.
- **Structural Polyglots:** Constructing a valid PDF object tree within a JSON string value, replacing line feeds (`%0A`) with spaces (`%20`) to maintain JSON validity while satisfying `pdflib` structural scanners.
- **Offset Manipulation:** Aligning specific magic strings (e.g., `WEBP`) at specific byte offsets (e.g., offset 8) required by libraries like `file-type` while maintaining the file as a valid JSON object.
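
The offset alignment can be illustrated with a short sketch. It assumes, as stated above, that the detector looks for `WEBP` at byte offset 8; the key name `aaa` and the helper `build_webp_polyglot` are hypothetical choices that happen to make the JSON prefix exactly eight bytes long.

```python
import json

SIGNATURE = b"WEBP"
OFFSET = 8  # offset named in the report for the `file-type` check


def build_webp_polyglot(payload: dict) -> bytes:
    """Return JSON bytes with 'WEBP' placed exactly at byte offset 8."""
    # '{"aaa":"' is exactly 8 bytes, so the string value starts at offset 8.
    doc = {"aaa": "WEBP", **payload}
    # Compact separators keep the prefix length predictable.
    return json.dumps(doc, separators=(",", ":")).encode("ascii")


blob = build_webp_polyglot({"_id": "../payload"})
print(blob[OFFSET:OFFSET + 4] == SIGNATURE)  # True
print(json.loads(blob)["_id"])               # ../payload
```

The compact separators matter: Python's default `json.dumps` output inserts a space after the colon, which would push the signature to offset 9.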
### Advanced Features
- **JSON Compatibility:** Leveraging the V8 JSON parser's tolerance for leading whitespace (including the control characters tab, line feed, and carriage return) to ensure the "file" remains a valid argument for `JSON.parse()`.
- **CSPT Integration:** Using the bypassed upload to host a JSON gadget that contains path traversal sequences to redirect client-side requests.
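
The JSON constraints behind these features can be checked directly. The sketch below uses Python's `json` module as a stand-in for `JSON.parse()`; both follow the JSON grammar on the points shown (leading whitespace is accepted, an unescaped line feed inside a string is not). The PDF objects are hypothetical and heavily abbreviated, so this demonstrates only the parsing behaviour, not a file that `pdflib` would accept.

```python
import json

# Hypothetical, abbreviated PDF objects used only for illustration;
# a real structural polyglot needs a complete object tree.
pdf_objects = [
    "%PDF-1.4",
    "1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj",
    "2 0 obj << /Type /Pages /Kids [] /Count 0 >> endobj",
]


def wrap_in_json(body: str) -> str:
    """Embed raw PDF text in a JSON string value without escaping it."""
    # The two leading spaces show that whitespace before the object is tolerated.
    return '  {"pdf": "' + body + '", "_id": "../payload"}'


def parses(text: str) -> bool:
    """True if the text is accepted by a strict JSON parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


with_newlines = wrap_in_json("\n".join(pdf_objects))
with_spaces = wrap_in_json(" ".join(pdf_objects))

print(parses(with_newlines))  # False: raw line feeds inside a string are rejected
print(parses(with_spaces))    # True: spaces keep the document valid JSON
```

This is why the line feeds between PDF objects have to become spaces once the PDF body lives inside a JSON string value.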
## Indicators of Compromise
- **File Hashes:** N/A (Payloads are dynamically generated and context-specific).
- **File Names:** Frequently end in `.pdf`, `.webp`, `.png`, or `.jpg` but contain ASCII JSON content.
- **Network Indicators:** Requests whose parameters carry path traversal sequences that redirect a client-side fetch to an uploaded file (e.g., `GET /api/user?id=../../gadget.json`).
- **Behavioral Indicators:**
  - File uploads containing mixed-format signatures (e.g., a file starting with `{` but containing `%PDF-1.4`).
  - Web server logs showing `200 OK` for PDF/Image uploads that contain cleartext JSON strings.
## Associated Threat Actors
- Research-driven exploitation; no specific APT group assigned, but the technique is applicable to red team operations and bug bounty researchers.
## Detection Methods
- **Signature-based detection:**
  - YARA rules targeting files that start with JSON characters (`{`, `[`) but contain file signatures like `%PDF` or `WEBP` within the first 1KB.
- **Behavioral detection:**
  - Monitoring for `JSON.parse` calls in the frontend that consume files with non-JSON extensions (e.g., `.pdf`).
  - Inspecting upload traffic for path traversal characters (`../`) within JSON fields.
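
As a rough companion to the detection ideas above, the following sketch flags uploads that parse as JSON yet carry a binary-format signature in their first kilobyte, and separately reports `../` sequences in string values. It is a heuristic under stated assumptions, not a replacement for a YARA rule: the signature list, the 1024-byte window, and the function names are illustrative choices.

```python
import json

SIGNATURES = (b"%PDF", b"WEBP")
WINDOW = 1024  # "first 1KB" from the detection note above


def contains_traversal(value) -> bool:
    """Recursively look for '../' in any string value inside parsed JSON."""
    if isinstance(value, str):
        return "../" in value
    if isinstance(value, dict):
        return any(contains_traversal(v) for v in value.values())
    if isinstance(value, list):
        return any(contains_traversal(v) for v in value)
    return False


def inspect_upload(data: bytes) -> dict:
    """Flag uploads that parse as JSON yet carry a binary-format signature."""
    try:
        parsed = json.loads(data)
    except ValueError:  # covers JSON syntax errors and undecodable bytes
        return {"is_json": False, "polyglot": False, "traversal": False}
    head = data[:WINDOW]
    return {
        "is_json": True,
        "polyglot": any(sig in head for sig in SIGNATURES),
        "traversal": contains_traversal(parsed),
    }


print(inspect_upload(b'{"aaa":"WEBP","_id":"../payload"}'))
```

A genuine PDF or image fails the JSON parse and is reported with all three flags set to `False`, so the check adds little noise to normal upload traffic.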
## Mitigation Strategies
- **Prevention measures:**
  - Use strict JSON schema validation on the server side after a file is uploaded.
  - Implement a Content Security Policy (CSP), for example a `connect-src` directive, to restrict where scripts can fetch data from.
- **Hardening recommendations:**
  - Avoid using user-controlled input to construct file paths or API endpoints in client-side code.
  - Serve JSON only with `Content-Type: application/json`, send `X-Content-Type-Options: nosniff`, and have the client reject responses with any other content type before parsing them.
  - Rename uploaded files to random strings and store them in a non-executable, isolated directory/bucket.
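
Two of the hardening recommendations lend themselves to a short sketch: validating an identifier before it is placed into an endpoint path, and generating a random storage name for uploads. The allowlist pattern, the route, and both function names are assumptions made for illustration; in a real application the identifier check would live in the client-side code that builds the request.

```python
import re
import secrets

# Hypothetical allowlist: letters, digits, underscore, hyphen; no dots or slashes.
SAFE_ID = re.compile(r"[A-Za-z0-9_-]{1,64}")


def safe_endpoint(item_id: str) -> str:
    """Build an API path only from identifiers that cannot traverse."""
    if not SAFE_ID.fullmatch(item_id):
        raise ValueError(f"rejected identifier: {item_id!r}")
    return f"/api/items/{item_id}/publish"


def storage_name(extension: str) -> str:
    """Return a random file name so the uploader cannot predict or choose it."""
    return f"{secrets.token_hex(16)}.{extension}"


print(safe_endpoint("42"))  # /api/items/42/publish
print(storage_name("pdf"))  # 32 random hex characters followed by ".pdf"
```

An allowlist is used rather than stripping `../`, since filtering specific sequences tends to miss encoded or nested variants.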
## Related Tools/Techniques
- **CSPT2CSRF:** The broader attack class of using Client-Side Path Traversal to trigger CSRF.
- **CSPTPlayground:** A tool/repository for testing these vulnerabilities.
- **File Upload Polyglots:** General class of files that are valid in two or more formats (e.g., GIFAR, HTA/JPEG).