Full Report
SP1 is a zero-knowledge virtual machine (zkVM) that lets developers prove the execution of arbitrary programs that can be compiled to RISC-V. Most of the code that uses it is written in Rust, though. With SP1, devs write standard Rust code to generate their cryptographic proofs instead of using a domain-specific language. The goal of this post is to prime security auditors to review code that uses SP1.

The SP1 architecture is as follows:

1. Compile the code into a RISC-V ELF binary.
2. Execute the program in the zkVM. This generates the STARK proof to be used later.
3. Optimize and verify the proof. This is the mathematical verification that the code ran as intended.

The system consists of two components: the prover and the verifier. The prover executes the guest program and generates the ZK proof. The verifier takes in the proof and validates it cryptographically. The proof should come from the prover, but a malicious actor can submit whatever they want. If the verification succeeds, the claimed computation has occurred.

The system is also separated into Host and Guest. The Host is a standard machine that executes code, such as the machine you're using to view this website. The Guest program runs inside a VM that is completely separate from everything else: no Internet access, no databases, nothing. In the code itself, Host and Guest code are somewhat intertwined, which makes this an important distinction to keep track of.

The first security note is that all input data is untrusted. If input is coming from the Host to the Guest, then it must be validated inside the Guest. Range checks, length checks, business logic constraints, etc. should all be done. On this note, only Guest code is proven, not code running on the Host. So, if there's a check in the Host that's not in the Guest, you probably have a bug.

SP1 uses 32-bit RISC-V. When porting code from 64-bit systems, or exchanging data with them, this can cause issues. For instance, integer truncations and overflows should always be checked when dealing with usize values. On top of this, many dependencies that get pulled into SP1-compiled code were never meant to run there. This can lead to similar integer issues, operating system calls, unsafe code, and many other weird quirks.

When using SP1, data can be committed to become a public output. Naturally, if we're doing zero-knowledge proofs, the public information should be carefully audited. For instance, disclosing someone's age would be inappropriate.

Another issue that is weird to me is Verification Key Management. In SP1, each program generates two keys: one for the prover and another for the verifier. Each guest program must have a unique verification key derived from its binary, and the verifier must not accept older key versions.

There are cases where information cannot be validated within the proof and instead has to be passed out as part of the output. For instance, a Merkle proof can be checked inside the program, but its validity depends on the block hash associated with it. So, the block hash must be validated separately from the program. For SP1, you would want to make this a committed value as an output for external validation.

The most common vulnerability is around "underconstrained circuits". This is simply insufficient validation of state transitions in a program, which comes down to basic logic validation, like most other things. According to the post, practical knowledge of STARKs/SNARKs isn't necessary for auditing SP1 programs, unlike other cryptographic primitives.

A solid introduction to reviewing SP1 programs. I feel like this demystified a lot of terminology as well, which I really appreciated.
Analysis Summary
# Best Practices: Securing SP1 zkVM Implementations
## Overview
These practices are tailored for security auditors reviewing code that utilizes the SP1 zero-knowledge virtual machine (zkVM). They focus on mitigating risks associated with untrusted inputs, architectural separation (Host vs. Guest), specific compilation targets (RISC-V 32-bit), and managing cryptographic assets like verification keys.
## Key Recommendations
### Immediate Actions
1. **Validate All Host-to-Guest Inputs:** Treat *all* data passed from the Host environment to the Guest program as untrusted. Implement rigorous range checks, length checks, and business logic constraint validation on these inputs within the Guest code.
2. **Ensure Host Validation Parity:** Verify that any security-critical checks performed in the **Host** code are *also* implemented and proven within the **Guest** code. The proof only covers the Guest execution.
3. **Scrutinize 64-bit to 32-bit Conversions:** Aggressively search for and validate any operations involving casting or conversion from 64-bit values (e.g., `usize` on 64-bit systems) down to the 32-bit registers used by the RISC-V target. Explicitly check for truncations and potential overflows.
4. **Audit Public Outputs:** Carefully review any data committed as a "public output." Ensure that this data does not inadvertently disclose sensitive private information (e.g., secrets, private measurements).
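As a minimal sketch of the third item, the snippet below models the Guest's 32-bit `usize` with `u32` so that it behaves the same on any machine. The function names are hypothetical and not part of SP1. The point is the contrast between an `as` cast, which silently drops the high bits, and the checked alternatives from the Rust standard library, which surface the problem as an error.

```rust
use std::convert::TryFrom;

/// Narrows a 64-bit length supplied by the Host with a plain cast.
/// `u32` stands in for the Guest's 32-bit `usize` (an assumption made so
/// the example behaves identically on 64-bit machines).
pub fn narrow_unchecked(host_len: u64) -> u32 {
    // `as` keeps only the low 32 bits and never fails.
    host_len as u32
}

/// Narrows the same value but reports an error when it does not fit.
pub fn narrow_checked(host_len: u64) -> Result<u32, String> {
    u32::try_from(host_len)
        .map_err(|_| format!("length {} does not fit in 32 bits", host_len))
}

/// Computes a buffer size without wrapping; `None` means overflow.
pub fn total_size(count: u32, item_size: u32) -> Option<u32> {
    count.checked_mul(item_size)
}
```

During review, every `as` cast on a value that originates from the Host is worth flagging, since the unchecked version turns a length of 2^32 + 5 into 5 without any visible failure.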
### Short-term Improvements (1-3 months)
1. **Enforce Unique Verification Key Derivation:** Implement a strict policy requiring that the verification key for every guest program is uniquely derived from its specific compiled binary. Block the use of older or cached key versions during verification setup.
2. **Isolate Logic: Host vs. Guest:** If Host and Guest code are intertwined, focus on isolating business logic within the Guest execution path as much as possible to maximize the cryptographic guarantees provided by the proof.
3. **Review Dependency Impact:** Analyze all dependencies intended for inclusion in the compiled Guest code. Check for unexpected OS calls, reliance on system-specific features, or excessive use of `unsafe` code patterns that might be compiled into unexpected RISC-V behavior.
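The verification key policy in the first item could look roughly like the sketch below. It assumes the verifier stores a 32-byte hash of the verification key for the current Guest binary, obtained out of band when that binary is built. The `VkError` type, the `check_vk` function, and the list of retired hashes are all hypothetical and only illustrate the pinning logic, not SP1's own API.

```rust
/// Reasons a verification key hash is refused (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum VkError {
    /// The key belonged to an older build of the Guest program.
    Retired,
    /// The key is unknown and does not match the current build.
    Mismatch,
}

/// Accepts a proof's verification key hash only if it equals the hash
/// pinned for the current Guest binary.
pub fn check_vk(
    provided: &[u8; 32],
    current: &[u8; 32],
    retired: &[[u8; 32]],
) -> Result<(), VkError> {
    // Checked first so that use of an old key produces a distinct error,
    // which is useful for logging and alerting.
    if retired.contains(provided) {
        return Err(VkError::Retired);
    }
    // Any key other than the pinned one is refused.
    if provided != current {
        return Err(VkError::Mismatch);
    }
    Ok(())
}
```

The equality check alone already rejects old keys. The retired list only makes a downgrade attempt distinguishable from a random mismatch.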
### Long-term Strategy (3+ months)
1. **Establish External Validation Protocol:** For computations whose validity relies on external factors (like block hashes used in Merkle proofs), mandate that these external values must be committed as outputs from the Guest program. The system should implement a separate, non-ZK-based validation step post-proof verification to confirm these committed values.
2. **Mitigate Underconstrained Circuits:** Conduct dedicated security reviews focused on state transitions within the Guest program. Confirm that every transition is constrained tightly enough that an incorrect computation state cannot yield a successful proof.
3. **Dependency Hardening Standard:** Create a formal standard for acceptable dependencies for SP1 compilation, explicitly flagging and prohibiting dependencies known to cause integer truncation issues or rely heavily on unimplemented OS primitives.
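The external validation step from the first item can be sketched as follows. The `PublicValues` layout, the `Rejection` type, and the `accept` function are assumptions made for illustration. In a real deployment the `proof_verified` flag would come from the SP1 verifier and the trusted hashes from a source the verifier already relies on, such as its own view of the chain.

```rust
/// Public values committed by the Guest (hypothetical layout).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PublicValues {
    /// Block hash the Merkle proof was checked against inside the Guest.
    pub block_hash: [u8; 32],
    /// Result of the in-Guest Merkle inclusion check.
    pub leaf_included: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Rejection {
    InvalidProof,
    UnknownBlockHash,
    LeafNotIncluded,
}

/// Runs after proof verification: the proof only shows that the Guest ran
/// against *some* block hash, so the hash itself is checked here.
pub fn accept(
    proof_verified: bool,
    public: &PublicValues,
    trusted_block_hashes: &[[u8; 32]],
) -> Result<(), Rejection> {
    if !proof_verified {
        return Err(Rejection::InvalidProof);
    }
    if !trusted_block_hashes.contains(&public.block_hash) {
        return Err(Rejection::UnknownBlockHash);
    }
    if !public.leaf_included {
        return Err(Rejection::LeafNotIncluded);
    }
    Ok(())
}
```

A valid proof over an attacker-chosen block hash fails the second check, which is exactly the case the committed output is meant to expose.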
## Implementation Guidance
### For Small Organizations
- **Focus on Language Security:** Since most SP1 code is Rust, dedicate significant resources to auditing the Rust code for standard vulnerabilities (e.g., memory safety, panic behavior, error handling) *before* compilation, because the ZK proof only guarantees that the code executed as written, not that its logic is secure.
- **Manual Input Review:** Given limited tooling, adopt a disciplined, manual process for tracing every byte of data flowing from the Host to the Guest to ensure validation rules are correctly applied at the boundaries.
### For Medium Organizations
- **Automated Boundary Testing:** Develop integration tests that specifically mock malicious Host inputs and assert that the Guest program rejects them according to explicit range/length checks.
- **Tooling for Binary Analysis:** Investigate tools capable of statically analyzing the RISC-V ELF binary output to confirm 32-bit constraints are respected, helping flag subtle truncation issues introduced by the compiler or intermediate libraries.
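A boundary test of the kind described above might start from a validation function like the one below. The `TransferInput` struct, the `MAX_MEMO_LEN` limit, and the error names are hypothetical. In an actual SP1 Guest the input would arrive through the zkVM's input-reading API rather than as a function argument, but keeping the checks in a plain function makes them easy to exercise with hostile values in ordinary unit tests.

```rust
/// Upper bound on the memo field (assumed business rule).
pub const MAX_MEMO_LEN: usize = 64;

/// Data the Host hands to the Guest. Every field is untrusted.
pub struct TransferInput {
    pub balance: u64,
    pub amount: u64,
    pub memo: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum InputError {
    MemoTooLong,
    ZeroAmount,
    InsufficientBalance,
}

/// Guest-side validation; returns the remaining balance on success.
pub fn validate_transfer(input: &TransferInput) -> Result<u64, InputError> {
    // Length check.
    if input.memo.len() > MAX_MEMO_LEN {
        return Err(InputError::MemoTooLong);
    }
    // Range check.
    if input.amount == 0 {
        return Err(InputError::ZeroAmount);
    }
    // Business rule, written with checked arithmetic so it cannot wrap.
    input
        .balance
        .checked_sub(input.amount)
        .ok_or(InputError::InsufficientBalance)
}
```

Each error variant corresponds to one malicious input class, so a test suite can assert that an oversized memo, a zero amount, and an overdraft are each refused for the expected reason.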
### For Large Enterprises
- **Formal Verification Integration:** Explore integrating formal methods or symbolic execution tools focused on the original source language (e.g., Rust) to prove the correctness of boundary checks and state transitions before compilation to RISC-V.
- **Dedicated ZK Security Review Process:** Establish a formal security review stage focusing solely on the **Host-Guest interface** and the **Verification Key Management workflow**, separate from standard application code reviews and underlying ZK math reviews.
## Configuration Examples
*No specific configuration syntax was provided in the source material, but the following guidance stands in its place:*
**Verification Key Management Example Policy:**
"The system **must** reject any proof verification attempt where the provided verification key `VK_provided` does not match the key hash derived from the current Guest program binary `BIN_current`. Verification proceeds only when `VK_hash(BIN_current) == VK_provided`."
## Compliance Alignment
- **NIST SP 800-53 (AC/IA Derivatives):** Focus on the Host/Guest boundary hardening, treating the Host as a less trusted environment interacting with the secured Guest execution environment.
- **ISO/IEC 27001 (A.14):** Directly applicable to the secure development and testing phases, particularly concerning input validation and dependency management for compiled binaries.
- **CIS Benchmarks (Specific to Host OS):** Ensure the underlying Host environment running the prover adheres to strong baseline security configurations, as it controls the environment where proof generation occurs.
## Common Pitfalls to Avoid
1. **Assuming Host Validation is Sufficient:** Believing that robust input checks in the Host process are enough. The Host can be compromised or deliberately misbehave; only Guest-validated logic is proven.
2. **Ignoring Integer Truncation:** Overlooking potential data loss or manipulation when 64-bit values from the Host (for example `u64`, or `usize` on a 64-bit machine) are narrowed to the 32-bit `usize` of the RISC-V Guest.
3. **Reusing Verification Keys:** Allowing the system to use verification keys decoupled from the exact binary they are meant to validate, opening the door for using an old, potentially vulnerable key version.
4. **Logging Sensitive Data as Public Output:** Using committed public outputs for values that should remain private, thereby leaking sensitive program state.
5. **Neglecting Logic Flaws:** Focusing solely on the ZK proof mechanism and neglecting basic application logic flaws (underconstrained circuits), which remain the most common vulnerability type.
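To make the last pitfall concrete, here is a small sketch of an underconstrained state transition next to a stricter one. The `State` struct and both functions are hypothetical, and the conservation rule is only one example of a missing constraint. A real program would need further checks, such as who authorized the transfer.

```rust
/// Toy ledger state with two accounts (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct State {
    pub nonce: u64,
    pub balance_a: u64,
    pub balance_b: u64,
}

/// Underconstrained: only the nonce is checked, so the prover may pick
/// any balances for the new state and still obtain a valid proof.
pub fn transition_underconstrained(old: &State, new: &State) -> bool {
    old.nonce.checked_add(1) == Some(new.nonce)
}

/// Adds the missing constraint: the total balance must be conserved.
pub fn transition_constrained(old: &State, new: &State) -> bool {
    let nonce_ok = old.nonce.checked_add(1) == Some(new.nonce);
    // Checked sums so an overflowing total is treated as invalid.
    let old_total = old.balance_a.checked_add(old.balance_b);
    let new_total = new.balance_a.checked_add(new.balance_b);
    let conserved = old_total.is_some() && old_total == new_total;
    nonce_ok && conserved
}
```

Both versions would produce a perfectly valid proof. The difference is what that proof actually attests to, which is why this class of bug is found by reading the logic rather than the cryptography.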
## Resources
- **Language Security Guides:** Refer to established security guides for the primary implementation language (e.g., Rust Security Book) to establish a baseline for Guest code security.
- **RISC-V Architecture Documentation:** Consult official documentation to understand the exact behavior and limitations of the 32-bit instruction set, especially regarding arithmetic operations near memory boundaries.
- **Best Practice Documentation:** Adopt internal standards based on general cryptographic best practices regarding input verification and management of cryptographic secrets (Verification Keys).