Full Report
Buzzy Chinese artificial intelligence (AI) startup DeepSeek, which has had a meteoric rise in popularity in recent days, left one of its databases exposed on the internet, which could have allowed malicious actors to gain access to sensitive data. The ClickHouse database "allows full control over database operations, including the ability to access internal data," Wiz security researcher Gal
Analysis Summary
# Incident Report: Exposure of DeepSeek Production Database
## Executive Summary
The fast-growing Chinese AI startup DeepSeek suffered a significant data exposure due to an unauthenticated, accidentally internet-facing ClickHouse database. This exposure granted potential attackers full administrative control, including access to over a million lines of sensitive data such as chat histories, secret keys, API secrets, and backend details. The vulnerability was identified and reported by security researchers at Wiz, leading to the immediate patching of the exposed endpoints.
## Incident Details
- Discovery Date: Undisclosed (Wiz discovered the exposure, then contacted DeepSeek)
- Incident Date: Undisclosed (the length of the exposure period is unknown)
- Affected Organization: DeepSeek
- Sector: Artificial Intelligence (AI) / Technology
- Geography: China (Primary operations)
## Timeline of Events
### Initial Access
- **Date/Time:** Unknown prior to discovery.
- **Vector:** Accidental public exposure of a ClickHouse database through its HTTP interface.
- **Details:** The database, hosted at `oauth2callback.deepseek[.]com:9000` and `dev.deepseek[.]com:9000`, was accessible without any authentication.
### Lateral Movement
- **Details:** No lateral movement was needed, because the exposure itself allowed immediate unauthorized access. An attacker could use the HTTP interface to execute arbitrary SQL queries, potentially leading to privilege escalation within the DeepSeek environment.
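
To make the point about the HTTP interface concrete: ClickHouse's HTTP interface accepts a SQL statement in the `query` URL parameter, so an unauthenticated endpoint needs nothing more than a crafted URL. The sketch below only builds such a URL. The host name `clickhouse.example.internal` is hypothetical, 8123 is ClickHouse's default HTTP port, and the source does not state which requests were actually issued.

```python
from urllib.parse import quote, urlencode


def build_query_url(host: str, sql: str, port: int = 8123) -> str:
    """Return the URL that would submit `sql` to a ClickHouse HTTP interface."""
    # The HTTP interface reads the statement from the `query` parameter;
    # quote_via=quote encodes spaces as %20 rather than '+'.
    params = urlencode({"query": sql}, quote_via=quote)
    return f"http://{host}:{port}/?{params}"


if __name__ == "__main__":
    # Hypothetical host, for illustration only.
    print(build_query_url("clickhouse.example.internal", "SHOW TABLES"))
```

With no authentication in front of the endpoint, anyone who can reach the port can submit statements this way, which is why no further foothold was required.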
### Data Exfiltration/Impact
- **Details:** Over a million lines of log streams were exposed, including highly sensitive information such as chat history, secret keys, backend configuration details, API Secrets, and operational metadata.
### Detection & Response
- **How it was discovered:** Discovered by security researchers at Wiz.
- **Response actions taken:** Wiz contacted DeepSeek, and the security hole was subsequently plugged. DeepSeek acknowledged the issue in an update on January 29, 2025, stating they were implementing a fix.
## Attack Methodology
- **Initial Access:** Unauthenticated access to the ClickHouse database's HTTP interface.
- **Persistence:** Not applicable as the vector was direct database exposure.
- **Privilege Escalation:** Potential for direct privilege escalation within the environment by executing administrative SQL queries.
- **Defense Evasion:** None required. Because the service was accidentally public-facing and unauthenticated, there were no network or access controls for an attacker to evade.
- **Credential Access:** Exposed API secrets and keys contained within the database logs.
- **Discovery:** Unknown whether attackers performed any, but the open interface would have let anyone enumerate the database and immediately locate sensitive data.
- **Lateral Movement:** Not required to reach the data. SQL command execution within the database instance was a potential pivot point.
- **Collection:** Direct querying of the database to gather chat histories, metadata, and secrets.
- **Exfiltration:** Unknown if data was exfiltrated, but the capability was present.
- **Impact:** Confidentiality breach of sensitive internal and user data.
## Impact Assessment
- **Financial:** Not disclosed.
- **Data Breach:** Over a million lines of log streams exposed, including customer chat history, internal secret keys, API Secrets, and backend system details.
- **Operational:** The company paused registrations and faced service availability issues in certain regions (e.g., Italy) amid related regulatory scrutiny.
- **Reputational:** Significant reputational damage compounded by broader scrutiny regarding data privacy, Chinese ties, and potential intellectual property theft from competitors like OpenAI/Microsoft.
## Indicators of Compromise
- **Network indicators:** `oauth2callback.deepseek[.]com:9000`, `dev.deepseek[.]com:9000`
- **File indicators:** N/A (Database exposure)
- **Behavioral indicators:** Unauthorized execution of administrative SQL queries against the ClickHouse instance via HTTP.
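
The behavioral indicator above can be turned into a simple log check. The following is a minimal sketch, assuming access logs in common log format from a proxy or load balancer in front of the database. The keyword list, the `trusted_ips` parameter, and the function name are hypothetical and would need tuning for a real environment.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Hypothetical keyword list; tune it to the statements your own services issue.
ADMIN_KEYWORDS = ("SHOW", "DROP", "ALTER", "CREATE", "GRANT", "SYSTEM")

# Captures the client IP and the request target of a common-log-format line.
REQUEST_RE = re.compile(r'^(?P<ip>\S+) .*?"(?:GET|POST) (?P<target>\S+) HTTP/[\d.]+"')


def flag_admin_queries(log_lines, trusted_ips=frozenset()):
    """Return (ip, sql) pairs for untrusted clients that sent administrative SQL."""
    hits = []
    for line in log_lines:
        match = REQUEST_RE.match(line)
        if not match or match["ip"] in trusted_ips:
            continue
        # Decode the URL and pull out every `query` parameter value.
        params = parse_qs(urlsplit(match["target"]).query)
        for sql in params.get("query", []):
            if sql.strip().upper().startswith(ADMIN_KEYWORDS):
                hits.append((match["ip"], sql))
    return hits
```

This only inspects SQL passed in the URL. Statements sent in a POST body would not appear in a standard access log and would need database-side query logging instead.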
## Response Actions
- **Containment measures:** The exposed ClickHouse database endpoints were secured after Wiz contacted DeepSeek.
- **Eradication steps:** Not detailed in the source. Rotating any credentials exposed in the logs would be the expected step.
- **Recovery actions:** Resuming normal operations and addressing the service issues and registration pause.
## Lessons Learned
- **Key takeaways:** Rapid adoption of new technologies (like AI infrastructure) must be accompanied by fundamental security hygiene, such as checking public accessibility of databases. Basic misconfigurations (like exposed databases) represent critical, immediate threats.
- **What could have been done better:** Implementing rigorous security auditing (Cloud Security Posture Management) before deploying services to the public-facing internet.
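
Checking the public accessibility of databases can start with something as basic as a TCP reachability test run from outside the network perimeter. The sketch below assumes a hand-maintained inventory of `(host, port)` pairs. The function names are hypothetical, and this is not a substitute for a CSPM product.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def audit_exposure(inventory, timeout: float = 2.0):
    """Return the (host, port) pairs from `inventory` that accept connections."""
    return [(host, port) for host, port in inventory if is_port_open(host, port, timeout)]
```

For a ClickHouse deployment, the inventory would typically include the default HTTP port 8123 and the default native protocol port 9000 for each database host. Any pair returned when the check runs from an external vantage point deserves immediate review.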
## Recommendations
- Implement mandatory authentication and network segmentation/firewalling for all database instances, especially those hosting production data.
- Conduct regular configuration audits and vulnerability scanning (for example, with CSPM tooling) to identify publicly exposed internal services.
- Immediately rotate all secrets, API keys, and encryption keys identified as being present in raw log data or exposed datasets.
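
To decide which credentials need rotating, the log data first has to be searched for secret-like strings. The following is a minimal sketch of such a scan. The two regular expressions are hypothetical examples, not the formats of the keys actually exposed, and a real scan would use a maintained ruleset.

```python
import re

# Hypothetical patterns for illustration; real key formats vary by provider.
SECRET_PATTERNS = {
    "key_assignment": re.compile(
        r"(?i)(api[_-]?key|secret|token)\s*[=:]\s*['\"]?([A-Za-z0-9_\-]{16,})"
    ),
    "bearer_token": re.compile(r"(?i)\bBearer\s+([A-Za-z0-9._\-]{20,})"),
}


def find_secrets(lines):
    """Yield (line_number, pattern_name, redacted_value) for secret-like strings."""
    for number, line in enumerate(lines, start=1):
        for name, pattern in SECRET_PATTERNS.items():
            for match in pattern.finditer(line):
                # The last capturing group holds the candidate secret value.
                value = match.group(match.lastindex)
                # Keep only a short prefix so the findings do not leak the secret.
                yield number, name, value[:4] + "..."
```

Each finding identifies a credential to rotate. Redacting the value in the output keeps the scan results from becoming a second copy of the exposed secrets.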