Full Report
URL parsing is very hard to do properly. The standard has been updated and changed over the years, and any difference between how a URL is verified and how it is used can lead to serious security issues.

In this article, the authors were looking at an SVG parser. Previous research has shown that the Inkscape CLI parser is vulnerable to path traversal within rendered SVGs. Within this parser, the XInclude format is also supported, a method of merging XML documents (SVGs are just XML). The underlying library for Inkscape is librsvg, which Canva uses. Within librsvg, every URL goes through validation to ensure it is not malicious; being able to include arbitrary local files, for instance, would be a big no-no. The rules are VERY strict, making it relatively safe to use.

Validation is done with one parser, but the loading of the SVG is done with a different one, Gio. Any time two parsers operate on the same data, there are likely many bugs lurking. A slight misunderstanding on one end could lead to the break you need.

The authors of the post set up a fuzzer to test the differences in the file resolution process. While doing this, they noticed that current.svg?../../../../../../../etc/passwd passes the validation but still resolves to a file outside the allowed directory. How is this? From my understanding, the ? gets stripped by the resolver and is unhandled by the validator. The Gio parser will happily follow the ../ sequences and traverse further up. A canonicalization process is done on the link, and as it turns out, it starts in the directory the program is in. So, placing a . at the beginning can be used to force the program to traverse further up. The full SVG link is .?../../../../../../../etc/passwd.

To me, this really shows the power of differential fuzzing. Who would have thought about a question mark in the path? Not me, only the fuzzer. At the end, they note some interesting things going forward for similar research.
My favorite is that the file URIs support query strings, but this varies depending on the library.
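To make the idea of differential fuzzing concrete, here is a minimal sketch in Python. It is not the authors' fuzzer and it does not call librsvg or Gio: `validator_accepts` and `loader_target` are toy stand-ins, `BASE_DIR` is a hypothetical directory, and the loader model (the `?` kept as an ordinary path character) is an assumption made for illustration. The harness builds random references from a few tokens and reports those that the validator accepts but that the loader resolves outside the base directory.

```python
import posixpath
import random
from urllib.parse import urlsplit

BASE_DIR = "/srv/app/images"  # hypothetical directory containing the SVG


def inside_base(path: str) -> bool:
    """True if an already-normalized path is BASE_DIR or lies below it."""
    return path == BASE_DIR or path.startswith(BASE_DIR + "/")


def validator_accepts(href: str) -> bool:
    """Toy validator: URL-aware, so it only checks the part before '?' or '#'."""
    path = urlsplit(href).path
    return inside_base(posixpath.normpath(posixpath.join(BASE_DIR, path)))


def loader_target(href: str) -> str:
    """Toy loader (assumption): treats the whole reference, '?' included, as a path."""
    return posixpath.normpath(posixpath.join(BASE_DIR, href))


def is_disagreement(href: str) -> bool:
    """True when the validator says 'safe' but the loader leaves BASE_DIR."""
    return validator_accepts(href) and not inside_base(loader_target(href))


def fuzz(rounds: int, seed: int = 0) -> list[str]:
    """Generate random references and keep the ones the two parsers disagree on."""
    rng = random.Random(seed)
    tokens = [".", "../", "?", "#", "current.svg", "etc/passwd"]
    hits = []
    for _ in range(rounds):
        href = "".join(rng.choice(tokens) for _ in range(rng.randint(1, 8)))
        if is_disagreement(href):
            hits.append(href)
    return hits


if __name__ == "__main__":
    # Show a few inputs on which the two toy parsers disagree.
    for href in fuzz(20000)[:5]:
        print(href, "->", loader_target(href))
```

A real harness would call the two actual implementations instead of the toy functions; the comparison step stays the same.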
Analysis Summary
# Vulnerability: Path Traversal in librsvg via URL Parser Differential (CVE-2023-38633)
## CVE Details
- **CVE ID:** CVE-2023-38633
- **CVSS Score:** 7.5 (High) - *Estimate based on typical Path Traversal in libraries*
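- **CWE:** CWE-22 (Improper Limitation of a Pathname to a Restricted Directory, "Path Traversal"). The root cause is a parser differential, conceptually similar to CWE-444 (Inconsistent Interpretation of HTTP Requests), although that entry is specific to HTTP.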
## Affected Systems
- **Products:** librsvg
- **Versions:** Releases prior to 2.56.3
- **Configurations:** Systems using librsvg to render user-provided SVGs (common in thumbnail generation services, e.g., Canva, or CLI tools like Inkscape and libvips).
## Vulnerability Description
The vulnerability arises from a **parser differential** between the Rust-based validation logic in `librsvg` and the underlying `GIO` (GLib) library used to actually fetch resources.
`librsvg` uses `xi:include` to fetch external resources but implements strict validation to ensure files are relative to the SVG's base directory and do not traverse upwards. However, the validator does not correctly handle query strings (`?`) in file URIs. It perceives a path like `.?../../etc/passwd` as a file named `.` with a query string, which it considers "within" the allowed directory.
When the path is passed to the `GIO` resolver, the resolver interprets the `?` differently: rather than keeping the `../` sequences sealed off as query data, it treats the leading `.` as a valid starting point and then processes them as path components. This allows an attacker to escape the restricted directory and read arbitrary files from the host system.
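The two readings of the same reference can be traced with a small Python sketch. This is a toy model, not `librsvg` or `GIO` code: `remove_dot_segments` is a hypothetical helper, `base` is an invented directory, and the loader view assumes that the `?` is kept as an ordinary path character during lexical canonicalization.

```python
def remove_dot_segments(path: str) -> str:
    """Lexically canonicalize an absolute POSIX-style path, one segment at a time."""
    stack: list[str] = []
    for segment in path.split("/"):
        if segment in ("", "."):
            continue            # empty segment or "stay here": nothing to do
        if segment == "..":
            if stack:
                stack.pop()     # step up one level, never above the root
            continue
        stack.append(segment)   # anything else, including ".?..", is just a name
    return "/" + "/".join(stack)


base = "/srv/app/images"  # hypothetical directory containing the SVG
href = ".?../../../../../../../etc/passwd"

# View 1: URL-aware validator, everything after "?" is a query string.
validated = remove_dot_segments(base + "/" + href.split("?", 1)[0])

# View 2 (assumption): loader that keeps "?" as an ordinary path character.
loaded = remove_dot_segments(base + "/" + href)

print(validated)  # /srv/app/images
print(loaded)     # /etc/passwd
```

In the second view, `.?..` is an ordinary directory name that the next `..` cancels out, so the remaining `..` segments climb out of the base directory.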
## Exploitation
- **Status:** PoC available.
- **Complexity:** Low.
- **Attack Vector:** Network (if the application accepts and renders remote SVGs).
- **PoC Example:**
  ```xml
  <xi:include href=".?../../../../../../../etc/passwd" parse="text" />
  ```
## Impact
- **Confidentiality:** High. An attacker can read any file on the local file system that the process owner has permissions to access (e.g., `/etc/passwd`, source code, configuration files, or cloud metadata credentials).
- **Integrity:** None.
- **Availability:** None.
## Remediation
### Patches
- **librsvg:** Update to version **2.56.3** or later. The patch ensures that query strings in file URIs are properly handled or rejected during validation.
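For applications that resolve include paths themselves, the general lesson is to validate the exact value that will later be opened. The following Python sketch illustrates that pattern; it is not the `librsvg` patch. `resolve_include` and its parameters are hypothetical names, and rejecting every reference that contains `?` or `#` is a deliberately strict assumption.

```python
import os
from urllib.parse import urlsplit


def resolve_include(href: str, base_dir: str) -> str:
    """Return the absolute path to open for href, or raise ValueError."""
    parts = urlsplit(href)
    # Only plain relative file references are allowed in this sketch.
    if parts.scheme or parts.netloc:
        raise ValueError("absolute or remote URLs are not allowed")
    # Refuse anything that two parsers could split differently.
    if "?" in href or "#" in href:
        raise ValueError("query strings and fragments are not allowed")

    base = os.path.realpath(base_dir)
    # Canonicalize the same string that will be opened, symlinks included.
    target = os.path.realpath(os.path.join(base, href))
    if os.path.commonpath([base, target]) != base:
        raise ValueError("reference escapes the base directory")
    return target
```

The caller should open the returned `target` rather than the original `href`, so that no second parser gets a chance to reinterpret the reference.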
### Workarounds
- **Input Sanitization:** Reject SVGs that contain `xi:include` elements or declare the XInclude namespace before they reach the rendering engine (as implemented by MediaWiki).
- **Sandboxing:** Run rendering processes in restricted environments (containers, jails, or with low-privilege users) to limit the impact of a file read.
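As an illustration of the input sanitization workaround, the sketch below rejects any upload that contains an element in the XInclude namespace. It is a minimal example built on Python's standard `xml.etree.ElementTree`, not MediaWiki's implementation; `uses_xinclude` is a hypothetical helper, and a production filter would likely use a hardened XML parser and enforce size limits as well.

```python
import xml.etree.ElementTree as ET

# The 2001 URI is the XInclude namespace; the 2003 URI dates from drafts
# and is included here defensively.
XINCLUDE_NAMESPACES = {
    "http://www.w3.org/2001/XInclude",
    "http://www.w3.org/2003/XInclude",
}


def uses_xinclude(svg_bytes: bytes) -> bool:
    """Return True if the document should be rejected."""
    try:
        root = ET.fromstring(svg_bytes)
    except ET.ParseError:
        return True  # fail closed: unparseable input is not rendered
    for element in root.iter():
        tag = element.tag
        # Namespaced tags look like "{namespace-uri}localname".
        if isinstance(tag, str) and tag.startswith("{"):
            namespace = tag[1:].split("}", 1)[0]
            if namespace in XINCLUDE_NAMESPACES:
                return True
    return False
```

Checking the resolved namespace instead of the literal string `xi:include` matters because the prefix is chosen by the document author and can be anything.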
## Detection
- **Indicators of Compromise:** Look for SVG files containing `xi:include` tags with path traversal sequences (`../`) or unexpected characters like `?` in the `href` attribute.
- **Detection methods:** Use `grep` or YARA rules to scan incoming XML/SVG uploads for the string `xi:include`.
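A grep-style scan can also be scripted. The following Python sketch flags include elements whose `href` contains a traversal sequence or a question mark; `scan_svg` and both regular expressions are hypothetical, and this is a triage heuristic that entity-encoded or otherwise obfuscated payloads can evade, so it complements proper XML parsing and does not replace it.

```python
import re

# Any element named "include", with or without a prefix (the prefix is arbitrary).
INCLUDE_TAG = re.compile(rb"<(?:[\w.-]+:)?include\b[^>]*>", re.IGNORECASE)
HREF = re.compile(rb"""href\s*=\s*(["'])(.*?)\1""", re.IGNORECASE | re.DOTALL)


def scan_svg(data: bytes) -> list[tuple[str, list[str]]]:
    """Return (href, reasons) for every include element found in data."""
    findings = []
    for tag in INCLUDE_TAG.finditer(data):
        match = HREF.search(tag.group(0))
        href = match.group(2).decode("utf-8", "replace") if match else ""
        lowered = href.lower()
        reasons = []
        if "../" in lowered or "%2e%2e" in lowered:
            reasons.append("path traversal sequence")
        if "?" in lowered or "%3f" in lowered:
            reasons.append("query string in href")
        findings.append((href, reasons))
    return findings


if __name__ == "__main__":
    sample = b'<xi:include href=".?../../../../../../../etc/passwd" parse="text" />'
    for href, reasons in scan_svg(sample):
        print(href, reasons)
```

An include element with an empty reasons list is still worth reviewing, since legitimate uploads rarely need XInclude at all.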
## References
- **Vendor Advisory:** hxxps[://]gitlab[.]gnome[.]org/GNOME/librsvg/-/issues/996
- **NVD:** hxxps[://]nvd[.]nist[.]gov/vuln/detail/CVE-2023-38633
- **Original Research:** hxxps[://]www[.]canva[.]dev/blog/engineering/when-url-parsers-disagree-cve-2023-38633/