Full Report
Recently, a bypass was found in DOMPurify when it is used to sanitize XML documents. Since bugs come in pairs, the author looked further and found two more variants of the bug in the codebase. For context, DOMPurify is the go-to library for ensuring that rich text doesn't contain malicious JS. XML and HTML have different parsing rules. For example, a processing instruction ending in `Hello)"> ?>` is parsed as a single node in XML, but an HTML parser recognizes an h1 tag inside it. When an XML parser encounters `<?`, it starts a processing instruction, which is part of the spec. In HTML, however, the same sequence is processed as a bogus comment. So, there is a mismatch between HTML and XML parsing. Processing instructions therefore allow a bypass if the sanitized XML document is later processed as an HTML document. Is this a normal case? I suppose that this is something that should be reported. They found a similar issue with `<![CDATA[` sections in XML as well. It seems that differing parsing contexts are a common source of DOMPurify bypasses. Interesting!
Analysis Summary
# Vulnerability: DOMPurify XML/HTML Parsing Ambiguity Bypass
## CVE Details
- **CVE ID:** CVE-2024-30176 (Processing Instructions bypass) / CVE-2024-30187 (CDATA section bypass)
- **CVSS Score:** 6.1 (Medium)
- **CWE:** CWE-436 (Interpretation Conflict), CWE-79 (Cross-site Scripting)
## Affected Systems
- **Products:** DOMPurify (Library)
- **Versions:**
  - 3.0.10 and below (Processing Instruction vulnerability)
  - 3.0.11 and below (CDATA Section vulnerability)
- **Configurations:** Systems using DOMPurify to sanitize XML/XHTML content (`PARSER_MEDIA_TYPE: "application/xhtml+xml"`) which is subsequently rendered or processed in an HTML context.
## Vulnerability Description
The vulnerability stems from a "differential parsing" flaw: XML and HTML parsers disagree on where certain constructs end.
1. **Processing Instructions (PI):** In XML, a PI ends with `?>`. In HTML, a PI is treated as a "bogus comment" that terminates at the first `>`.
2. **CDATA Sections:** In XML, CDATA ends with `]]>`. In HTML, CDATA is treated as a bogus comment ending at the first `>`.
Attackers can wrap malicious HTML (like `<img src=x onerror=alert(1)>`) inside these XML constructs. When parsing XML, DOMPurify 3.0.10 and 3.0.11 did not adequately account for these node types, nor for a PI whose name matches an allowed tag. The payload therefore stayed hidden during sanitization but became active, executable HTML once the sanitized output was re-parsed by a browser's HTML parser.
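The two termination rules can be illustrated with a small sketch. It is a simplified model rather than a real parser: it only locates where each grammar stops treating the input as a PI, CDATA section, or bogus comment. The helper name `restAfter` and both payload strings are made up for illustration, built from the `img` example above.

```javascript
// Simplified model of the termination rules; NOT a real XML or HTML parser.
// restAfter is a hypothetical helper: it returns whatever input remains
// after a construct that starts with `opener` and ends at the first `terminator`.
function restAfter(input, opener, terminator) {
  if (!input.startsWith(opener)) return null;
  const close = input.indexOf(terminator, opener.length);
  if (close === -1) return null;
  return input.slice(close + terminator.length);
}

// Hypothetical payloads wrapping the img example from the text.
const piPayload = "<?xml-stylesheet ><img src=x onerror=alert(1)> ?>";
const cdataPayload = "<![CDATA[ ><img src=x onerror=alert(1)> ]]>";

// XML view: the PI ends at "?>" and the CDATA section at "]]>".
const xmlPiRest = restAfter(piPayload, "<?", "?>");
const xmlCdataRest = restAfter(cdataPayload, "<![CDATA[", "]]>");

// HTML view: both are bogus comments that end at the first ">".
const htmlPiRest = restAfter(piPayload, "<?", ">");
const htmlCdataRest = restAfter(cdataPayload, "<![CDATA[", ">");

console.log(JSON.stringify({ xmlPiRest, htmlPiRest }));
console.log(JSON.stringify({ xmlCdataRest, htmlCdataRest }));
```

Under the XML rules nothing is left over, so a sanitizer sees one opaque node. Under the HTML rules the `img` tag sits outside the bogus comment and is tokenized as a normal element.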
## Exploitation
- **Status:** PoC available; documented in research.
- **Complexity:** Low
- **Attack Vector:** Network (Web)
## Impact
- **Confidentiality:** Low (Session hijacking/cookie theft via XSS)
- **Integrity:** Low (Unauthorized actions on behalf of the user)
- **Availability:** None
## Remediation
### Patches
- **Upgrade to DOMPurify 3.1.0 or later.**
- The fixes include:
  - Including processing-instruction nodes (`NodeFilter.SHOW_PROCESSING_INSTRUCTION`) in the sanitizer's node traversal so that they are identified and removed.
  - Including CDATA-section nodes (`NodeFilter.SHOW_CDATA_SECTION`) in the traversal so that they are identified and removed.
  - Mitigating `nodeName` confusion (where a PI could be named after an allowed tag like `h1`) by checking `nodeType`.
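The `nodeType` check can be sketched as follows. This is not DOMPurify's code: it uses plain objects as stand-ins for DOM nodes, and `ALLOWED_TAGS`, `naiveIsAllowed`, `isAllowed`, and `sanitizeTree` are hypothetical names. Only the numeric `nodeType` values come from the DOM standard.

```javascript
// nodeType values as defined by the DOM standard.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const CDATA_SECTION_NODE = 4;
const PROCESSING_INSTRUCTION_NODE = 7;

// Hypothetical allow-list for this sketch.
const ALLOWED_TAGS = new Set(["div", "h1", "p"]);

// Name-only check: a PI whose target is "h1" slips through.
function naiveIsAllowed(node) {
  return ALLOWED_TAGS.has(node.nodeName.toLowerCase());
}

// Type-first check: PI and CDATA nodes are dropped whatever their name.
function isAllowed(node) {
  switch (node.nodeType) {
    case PROCESSING_INSTRUCTION_NODE:
    case CDATA_SECTION_NODE:
      return false;
    case TEXT_NODE:
      return true;
    case ELEMENT_NODE:
      return ALLOWED_TAGS.has(node.nodeName.toLowerCase());
    default:
      return false; // unknown node types are rejected
  }
}

// Returns a filtered copy of the tree without mutating the input.
function sanitizeTree(node) {
  const children = (node.children || []).filter(isAllowed).map(sanitizeTree);
  return { ...node, children };
}

// Mock tree: <div><h1>Hello</h1><?h1 ><img src=x onerror=alert(1)>?></div>
const pi = {
  nodeType: PROCESSING_INSTRUCTION_NODE,
  nodeName: "h1", // a PI's nodeName is its target
  data: "><img src=x onerror=alert(1)>",
};
const tree = {
  nodeType: ELEMENT_NODE,
  nodeName: "div",
  children: [
    {
      nodeType: ELEMENT_NODE,
      nodeName: "h1",
      children: [{ nodeType: TEXT_NODE, nodeName: "#text", data: "Hello" }],
    },
    pi,
  ],
};

const cleaned = sanitizeTree(tree);
console.log(naiveIsAllowed(pi), isAllowed(pi), cleaned.children.length);
```

The name-only check accepts the PI because its target matches an allowed tag, while the type-first check rejects it and leaves only the genuine `h1` element in the cleaned tree.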
### Workarounds
- Ensure that content sanitized as XML is never directly inserted into an HTML context using `innerHTML`. Use `textContent` where possible.
- Avoid using `PARSER_MEDIA_TYPE: "application/xhtml+xml"` if the source is untrusted and the destination is an HTML document.
## Detection
- **Indicators of Compromise:** Input containing `<?[allowed_tag] ... ?>` or `<![CDATA[ ... ]]>` constructs that wrap markup with HTML event handlers (e.g., `onerror`, `onload`).
- **Detection methods:** Static analysis of JavaScript dependencies to identify vulnerable DOMPurify versions. Review of codebases where XML sanitization results are piped into `innerHTML`.
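Both detection ideas can be roughed out in a few lines. The sketch below is a heuristic, not a sanitizer. It flags processing instructions and CDATA sections whose body contains an `on*=` attribute, and it treats any `x.y.z` version string below 3.1.0 as affected, based on the upgrade advice above. The function names and sample strings are assumptions for illustration, and pre-release suffixes are not handled.

```javascript
// Matches whole XML PIs ("<? ... ?>") and CDATA sections ("<![CDATA[ ... ]]>").
const XML_CONSTRUCTS = /<\?[\s\S]*?\?>|<!\[CDATA\[[\s\S]*?\]\]>/g;
// Matches an event-handler style attribute such as onerror= or onload=.
const EVENT_HANDLER = /\bon[a-z]+\s*=/i;

// Hypothetical helper: returns the PI/CDATA constructs that carry a handler.
function findSuspiciousConstructs(input) {
  const constructs = input.match(XML_CONSTRUCTS) || [];
  return constructs.filter((construct) => EVENT_HANDLER.test(construct));
}

// Hypothetical helper: true for plain "x.y.z" versions earlier than 3.1.0.
function isVulnerableVersion(version) {
  const [major, minor] = version.split(".").map(Number);
  if (major !== 3) return major < 3;
  return minor < 1;
}

const sample =
  '<?xml version="1.0"?><p>hi</p>' +
  "<?h1 ><img src=x onerror=alert(1)> ?>" +
  "<![CDATA[ ><svg onload=alert(1)> ]]>";

console.log(findSuspiciousConstructs(sample));
console.log(["3.0.10", "3.0.11", "3.1.0"].map(isVulnerableVersion));
```

A benign XML declaration is not flagged, since `version=` does not begin with a standalone `on` word. Matches should be treated as leads for manual review rather than as proof of an attack.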
## References
- [https://flatt.tech/research/posts/bypassing-dompurify-with-good-old-xml/](https://flatt.tech/research/posts/bypassing-dompurify-with-good-old-xml/)
- [https://github.com/cure53/dompurify/releases/tag/3.1.0](https://github.com/cure53/dompurify/releases/tag/3.1.0)
- [https://blog.slonser.info/posts/dompurify-node-type-confusion/](https://blog.slonser.info/posts/dompurify-node-type-confusion/)