Full Report
GPUs are fast, highly parallel co-processors designed for high-throughput graphics and machine learning workloads. A GPU is made up of compute units, each containing many processing elements, and all compute units share global memory. Some GPUs also give each compute unit a region called local memory, which acts as a cache for that unit's processing elements when global memory is too slow. The execution model differs from that of most programs: GPU programs are called kernels. A kernel is written in a shading or compute language or API such as OpenCL, Metal or Vulkan, and has a single entry-point function that is executed by many invocations.

The vulnerability is that a compute unit's local memory is not properly cleared between programs run by different users. As a result, it's possible to steal information across program runs! For instance, if one process runs a GPU job, a malicious process can run a job directly afterwards and read whatever the first job left behind in local memory.

The rest of the post covers doing this on specific platforms and actually extracting the information. The most interesting point is that applications on many platforms (like Android) have access to the GPU, which makes any application a potential attacker. Although this is interesting, I don't think it's worth putting into this post, but it may be worth coming back to.

Disclosure was coordinated with the GPU vendors, including Apple, AMD, ARM and many others. Many of the issues were fixed, which is awesome. Overall, a good post for a relatively simple bug. I personally felt that it was dramatized too much with the name, impact, images, etc. I just love when bugs are talked about and explained :)
Analysis Summary
# Vulnerability: LeftoverLocals - GPU Local Memory Data Leakage Across Invocations
## CVE Details
- CVE ID: CVE-2023-4969
- CVSS Score: Not specified in the source text. The impact suggests **High** severity because of the confidentiality impact (potential data theft). The referenced CERT page may list a score.
- CWE: Not specified in the source text. The closest fit is likely CWE-226 (Sensitive Information in Resource Not Removed Before Reuse), since the flaw is improper clearing of a shared resource rather than a memory leak in the CWE-401 sense (memory that is never released).
## Affected Systems
- Products: GPUs from **Apple, AMD, and Qualcomm** are confirmed to be impacted by the vulnerability. Imagination GPUs were also confirmed to be impacted by Google testing.
- Versions: Specific vulnerable versions are **not comprehensively listed**. Testing showed impact on older Apple hardware, while newer hardware (Apple A17/M3 series) was confirmed to include fixes. The Qualcomm firmware update mentioned is **v2.07**.
- Configurations: Any configuration running GPU workloads (kernels) on the affected vendors' hardware where local memory is used and not properly cleared between process/user executions. Significant risk highlighted for **LLM/ML workloads** running across different users or containers.
## Vulnerability Description
The vulnerability, dubbed "LeftoverLocals," stems from the **improper clearing of GPU local memory** between kernel executions, particularly when different users or processes run successive jobs on the same compute unit. GPU local memory acts as a high-speed, software-managed cache for the processing elements within a compute unit. A malicious process that executes a kernel shortly after a target process (e.g., an LLM session) can read sensitive data left in the local memory used by the previous execution. This enables **cross-process/cross-user information leakage**. For example, the original write-up reports that roughly 181 MB of data can leak per LLM query on an affected AMD GPU, enough to reconstruct the LLM response.
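
The leak described above can be illustrated with a toy simulation. This is only a sketch in plain Python, not real GPU code: the `ComputeUnit` class, the kernel functions, and the 16-word memory size are all hypothetical names and values chosen for illustration. It models a single compute unit whose local memory is never zeroed between launches.

```python
from dataclasses import dataclass, field

LOCAL_MEM_WORDS = 16  # hypothetical local-memory size, in words


@dataclass
class ComputeUnit:
    """Toy model of one compute unit; local memory persists across launches."""
    local_mem: list = field(default_factory=lambda: [0] * LOCAL_MEM_WORDS)

    def launch(self, kernel):
        # Nothing zeroes local_mem here: that omission is the bug being modeled.
        return kernel(self.local_mem)


def victim_kernel(local_mem):
    """Stages 'sensitive' values in local memory, as a real kernel might."""
    secret = [ord(c) for c in "llm-output"]
    local_mem[:len(secret)] = secret


def listener_kernel(local_mem):
    """Reads local memory without initializing it and returns a copy."""
    return list(local_mem)


cu = ComputeUnit()
cu.launch(victim_kernel)            # process A runs first
dump = cu.launch(listener_kernel)   # process B runs directly afterwards
recovered = "".join(chr(w) for w in dump if w)
print(recovered)
```

In this model the listener never writes to local memory, yet its dump contains the victim's data. A real attack would use an actual GPU kernel that declares a local-memory array and copies it to global memory.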
## Exploitation
- Status: **PoC available**. The article describes building a Proof of Concept to steal LLM responses.
- Complexity: Implied **Low to Medium**, as it relies on executing sequential, standard GPU kernels on shared hardware.
- Attack Vector: **Local** (requiring the attacker to schedule a malicious job immediately following a target job on the same shared GPU resource, such as within a multi-tenant environment or on a shared consumer device).
## Impact
- Confidentiality: **High** (Ability to recover sensitive data, demonstrated by reconstructing LLM responses).
- Integrity: Low (The attack is primarily for leakage).
- Availability: Low (The attack does not directly impact system availability).
## Remediation
### Patches
- **Imagination:** Fix released in DDK release **23.3** (available to customers in December 2023).
- **Qualcomm:** Patch to firmware version **v2.07** addresses the issue for *some* devices.
- **Apple:** Fixes confirmed in **A17 and M3 series processors**. Older devices may be patched, but specifics are pending full disclosure.
- **AMD:** Confirmed impacted; mitigation status ongoing.
- **NVIDIA, Intel, ARM:** Reported as not impacted.
### Workarounds
- **Vendor Updates:** Because this affects hardware/firmware implementations, the primary mitigation is to apply vendor-provided security updates as soon as they are available.
- **Resource Separation:** Avoiding running highly sensitive jobs (like LLM inference) on the same physical GPU resources shared by untrusted co-residents (e.g., processes/containers) until patches are confirmed applied.
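
Since the root cause is local memory that is not cleared, one conceivable mitigation is to scrub local memory when a kernel finishes. The sketch below models that idea in plain Python. It is an assumption for illustration, not a description of what any vendor patch actually does, and `run_with_scrub`, `victim`, and `listener` are hypothetical names.

```python
def run_with_scrub(local_mem, kernel):
    """Run a kernel, then zero local memory before the next launch."""
    try:
        return kernel(local_mem)
    finally:
        # Scrub even if the kernel raised, so nothing is left behind.
        for i in range(len(local_mem)):
            local_mem[i] = 0


def victim(local_mem):
    """Stages some values in local memory and computes a result from them."""
    local_mem[:4] = [0xDE, 0xAD, 0xBE, 0xEF]
    return sum(local_mem[:4])


def listener(local_mem):
    """Returns a copy of whatever it finds in local memory."""
    return list(local_mem)


shared_local_mem = [0] * 8  # hypothetical 8-word local memory
result = run_with_scrub(shared_local_mem, victim)
dump = run_with_scrub(shared_local_mem, listener)
print(result, dump)
```

The victim still gets its result, but the listener only sees zeros. On real hardware the scrub would cost time on every kernel, which is one reason a driver- or firmware-level fix is preferable where available.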
## Detection
- Detection methods are not detailed in the summary, but monitoring for unusual sequential kernel executions scheduled by different users on the same compute unit might be relevant in environments running untrusted workloads.
- **Indicators of Compromise:** Successful extraction of data remnants from GPU local memory following a target job.
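
One way to check whether a platform leaves data remnants is a canary test: one kernel fills local memory with a known value, a second kernel dumps uninitialized local memory, and the dump is searched for that value. The sketch below covers only the host-side check on an already collected dump. The canary value, the function names, and the sample dumps are hypothetical, and this tests whether a device is affected rather than detecting an attack in progress.

```python
CANARY = 123456789  # hypothetical marker value written by the first kernel


def count_canaries(dump, canary=CANARY):
    """Count how many words in a local-memory dump equal the canary."""
    return sum(1 for word in dump if word == canary)


def leak_detected(dump, canary=CANARY, threshold=1):
    """Report a leak if at least `threshold` canary words were observed."""
    return count_canaries(dump, canary) >= threshold


# Made-up dumps standing in for what a listener kernel might return.
clean_dump = [0] * 8
leaky_dump = [0, CANARY, CANARY, 7, 0, CANARY, 0, 0]

print(leak_detected(clean_dump), leak_detected(leaky_dump))
```

A distinctive canary value keeps false positives low, since it is unlikely to appear in local memory by chance.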
## References
- Vendor Advisory (Imagination): `https://www.imaginationtech.com/gpu-driver-vulnerabilities/`
- Related Firmware Update (Qualcomm): `https://lore.kernel.org/linux-firmware/[email protected]/T/#u`
- CERT Vulnerability Link: `https://kb.cert.org/vuls/id/446598`