Full Report
Even limited voter rolls can be linked to identify people, research shows. Your voter data could be used against you. A foreign intelligence service that wished to identify the family members of deployed military personnel could do so by cross-referencing public voter record data and social media posts.…
Analysis Summary
# Research: Public Voting Records: A Record, or an Attack Surface?
## Metadata
- **Authors:** Noah M. Kenney
- **Institution:** Digital 520 (Consultancy)
- **Publication:** Technical Analysis / Research Paper (reported via *The Register*)
- **Date:** May 4, 2026 (Article Publication Date)
## Abstract
This research examines the systemic privacy vulnerabilities inherent in United States public voter rolls. By cross-referencing voter data from Texas and North Carolina with secondary datasets—such as FEC contribution records and social media—the study demonstrates that current redaction methods are insufficient. The research highlights how voter records can be weaponized for targeted identification of military families, workplace discrimination, and automated identity fraud.
## Research Objective
The study seeks to answer:
1. To what extent can individuals be re-identified by linking "limited" public voter records with other open-source datasets?
2. Does the redaction of specific fields (like Date of Birth) significantly improve voter privacy?
3. What are the practical national security and civil liberty risks posed by the current disclosure regimes?
## Methodology
### Approach
The researcher used a **Data Linking (Record Linkage) Attack** methodology: performing inner joins between disparate datasets on common keys (name, ZIP code, phone number) and measuring the rate at which individuals were re-identified.
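A minimal sketch of such an exact-match linkage, assuming both datasets have already been reduced to dictionaries with `name` and `zip` fields. The function name `link_records`, the field names, and all records below are hypothetical illustrations and do not come from the paper. A contributor counts as re-identified only when the key hits exactly one voter record.

```python
from collections import defaultdict

def link_records(voters, contributors, keys=("name", "zip")):
    """Exact-match inner join; keep contributors that hit exactly one voter."""
    # Index voter records by the join key so each lookup is O(1).
    index = defaultdict(list)
    for voter in voters:
        index[tuple(voter[k] for k in keys)].append(voter)

    linked = []
    for contributor in contributors:
        candidates = index.get(tuple(contributor[k] for k in keys), [])
        # Zero candidates: no match. Two or more: ambiguous, so not unique.
        if len(candidates) == 1:
            linked.append({**candidates[0], **contributor})
    return linked

# Synthetic, made-up records (assumed schema, not real data).
voters = [
    {"name": "ALEX DOE", "zip": "78704", "address": "1 EXAMPLE ST"},
    {"name": "SAM ROE", "zip": "78704", "address": "2 EXAMPLE ST"},
    {"name": "SAM ROE", "zip": "78704", "address": "3 EXAMPLE ST"},
    {"name": "PAT POE", "zip": "78745", "address": "4 EXAMPLE ST"},
]
contributors = [
    {"name": "ALEX DOE", "zip": "78704", "employer": "EXAMPLE CORP"},
    {"name": "SAM ROE", "zip": "78704", "employer": "SAMPLE LLC"},
    {"name": "LEE LOW", "zip": "78704", "employer": "NONE"},
]

linked = link_records(voters, contributors)
match_rate = len(linked) / len(contributors)
print(f"uniquely matched: {len(linked)} of {len(contributors)}")
```

In this toy data only one contributor links uniquely: the second shares a name and ZIP with two voters, and the third has no voter record at all. The merged row carries both the voter's address and the contributor's employer, which is the kind of bridge the study describes.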
### Dataset/Environment
- **Primary Data:** Voter rolls from Travis County, Texas (restrictive regime) and Robeson County, North Carolina (permissive regime).
- **Secondary Data:** Federal Election Commission (FEC) individual contribution data (2024 cycle) and open-source social media/public records.
### Tools & Technologies
- **Python:** Used for data processing and script-based record linkage.
- **OpenFEC API:** Used to pull 500 contribution records for the 78704 ZIP code.
- **Data Filtering:** De-duplication and exact-match joining (notably avoiding fuzzy matching to demonstrate the ease of the attack).
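The filtering step can be sketched as follows. This is an illustration under stated assumptions, not the researcher's script: the helper names `normalize_key` and `deduplicate`, the field names, and the sample rows are hypothetical. The only normalization applied is case folding, whitespace collapsing, and truncating ZIP+4 codes to five digits, so the later join stays an exact match rather than a fuzzy one.

```python
def normalize_key(name, zip_code):
    """Build an exact-match key: upper-case name, single spaces, 5-digit ZIP."""
    clean_name = " ".join(name.upper().split())
    zip5 = zip_code.strip()[:5]
    return (clean_name, zip5)

def deduplicate(records):
    """Keep the first record per (name, ZIP) key.

    Assumption: repeated rows with the same key are multiple contributions
    from one person, which is how contribution files are typically laid out.
    """
    seen = set()
    unique = []
    for record in records:
        key = normalize_key(record["name"], record["zip"])
        if key not in seen:
            seen.add(key)
            unique.append({**record, "key": key})
    return unique

# Synthetic contribution rows (hypothetical values).
rows = [
    {"name": "Alex  Doe", "zip": "78704-1234", "amount": 50},
    {"name": "ALEX DOE", "zip": "78704", "amount": 25},
    {"name": "Pat Poe", "zip": "78704", "amount": 100},
]
contributors = deduplicate(rows)
print([c["key"] for c in contributors])
```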
## Key Findings
### Primary Results
1. **High Re-identification Rates:** Name and ZIP code alone uniquely identify **95.81%** of voters in the Texas sample and **87.79%** of voters in the North Carolina sample (Travis and Robeson counties, respectively).
2. **Redaction Failure:** In Texas, despite the redaction of full Dates of Birth, the publication of "Voter Registration Date" allows **28%** of voters to be uniquely identified when combined with ZIP and gender.
3. **Behavioral Fingerprinting:** Among frequent voters (20+ elections), **98.4%** have a turnout pattern (which years/primaries they voted in) that is unique to them, acting as a "voter fingerprint."
4. **Phone Number Vulnerability:** In North Carolina, **88.53%** of listed phone numbers are unique to a single voter within the county, serving as a highly effective primary key for data merging.
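Each percentage above is a uniqueness rate: the share of records whose value for some combination of fields appears exactly once in the roll. A minimal sketch of that measurement, with a hypothetical helper `unique_fraction` and made-up records (the election labels such as `2018P` are invented for illustration):

```python
from collections import Counter

def unique_fraction(records, key_fn):
    """Share of records whose key (as computed by key_fn) occurs exactly once."""
    if not records:
        return 0.0
    counts = Counter(key_fn(r) for r in records)
    singletons = sum(1 for r in records if counts[key_fn(r)] == 1)
    return singletons / len(records)

# Synthetic voter records; "elections" is the turnout history as a tuple.
voters = [
    {"name": "ALEX DOE", "zip": "78704", "elections": ("2016G", "2018P", "2020G")},
    {"name": "SAM ROE", "zip": "78704", "elections": ("2016G", "2020G")},
    {"name": "SAM ROE", "zip": "78704", "elections": ("2020G",)},
    {"name": "PAT POE", "zip": "78745", "elections": ("2018P", "2020G")},
]

by_name_zip = unique_fraction(voters, lambda r: (r["name"], r["zip"]))
by_turnout = unique_fraction(voters, lambda r: r["elections"])
print(f"name+ZIP unique: {by_name_zip:.0%}, turnout pattern unique: {by_turnout:.0%}")
```

In this toy roll the two records sharing a name and ZIP are indistinguishable by those fields, yet their turnout histories differ, so the turnout pattern separates them. That is the sense in which a voting history can act as a fingerprint.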
### Supporting Evidence
- **FEC Join Success:** Using a basic Python script with no fuzzy matching, **52.49%** of FEC contributors in an Austin ZIP code were uniquely matched to their voter record.
- **Employer Data Exposure:** Of the successful matches, **74.3%** also exposed their employer through the FEC file, creating a bridge from voter rolls to professional identity.
### Novel Contributions
- Identifies **"Suspense Indicators"** (returned mail codes) as a novel vector for identity fraud rings to target addresses for takeover.
- Quantifies the exposure of **military families** by identifying 320 families in one county through APO/FPO mailing codes.
## Technical Details
The research challenges the "k-anonymity" of voter rolls. While a ZIP code might contain thousands of people, intersecting quasi-identifiers shrinks the anonymity set sharply: **[Name + ZIP]** reduces it to $n=1$ for 95.81% of Texas voters and 87.79% of North Carolina voters, and **[Registration Date + ZIP + Gender]** does so for 28% of Texas voters. The study shows that "turning off" a single sensitive field like Date of Birth is ineffective if highly correlated variables (like the specific day a person registered to vote) remain public.
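To make the anonymity-set argument concrete, the sketch below computes k for a dataset, meaning the size of the smallest group of records that share the same quasi-identifier values. The helper names and the four records are hypothetical and chosen only to show how k falls as fields are intersected.

```python
from collections import Counter

def anonymity_sets(records, quasi_identifiers):
    """Group sizes keyed by the tuple of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def k_anonymity(records, quasi_identifiers):
    """Smallest anonymity set; k == 1 means someone is uniquely identifiable."""
    sets = anonymity_sets(records, quasi_identifiers)
    return min(sets.values()) if sets else 0

# Synthetic records (made-up values, assumed field names).
records = [
    {"zip": "78704", "gender": "F", "reg_date": "2012-03-14"},
    {"zip": "78704", "gender": "F", "reg_date": "2012-09-30"},
    {"zip": "78704", "gender": "M", "reg_date": "2012-03-14"},
    {"zip": "78704", "gender": "M", "reg_date": "2012-03-14"},
]

k_zip = k_anonymity(records, ["zip"])
k_zip_gender = k_anonymity(records, ["zip", "gender"])
k_all = k_anonymity(records, ["zip", "gender", "reg_date"])
print(k_zip, k_zip_gender, k_all)
```

Each added field can only split groups, never merge them, so k is non-increasing as quasi-identifiers are added. Here it drops from 4 to 2 to 1.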
## Practical Implications
### For Security Practitioners
- **Intelligence Risks:** Foreign Intelligence Services (FIS) can use these joins to build "target packages" on government employees or decentralized military units by correlating social media data with physical addresses found in voter rolls.
### For Defenders
- **State-Level Recommendations:** Counties should generalize data (e.g., provide registration *year* rather than the specific *day*).
- **Access Controls:** Move away from open ZIP downloads toward authenticated access, rate-limiting, and audit logging for public records requests.
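The generalization recommendation can be illustrated with a small before-and-after comparison. This is a sketch on made-up rows, with hypothetical helper names; it counts how many records remain unique on (ZIP, gender, registration date) when the date is published as a full day versus as a year only.

```python
from collections import Counter
from datetime import date

def generalize_to_year(iso_date):
    """Coarsen an ISO date string such as '2012-03-14' to its year."""
    return date.fromisoformat(iso_date).year

def singleton_count(rows):
    """Number of rows whose exact value appears only once."""
    counts = Counter(rows)
    return sum(1 for n in counts.values() if n == 1)

# Synthetic (zip, gender, registration date) tuples, not real data.
raw = [
    ("78704", "F", "2012-03-14"),
    ("78704", "F", "2012-09-30"),
    ("78704", "M", "2012-03-14"),
    ("78704", "M", "2015-06-01"),
]
generalized = [(z, g, generalize_to_year(d)) for z, g, d in raw]

unique_before = singleton_count(raw)
unique_after = singleton_count(generalized)
print(f"unique with day: {unique_before}, unique with year: {unique_after}")
```

Generalization reduces uniqueness here but does not eliminate it, so it works best alongside the access controls listed above rather than as a replacement for them.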
### For Researchers
- Highlights the need for more robust **Differential Privacy** applications in the publication of civic data.
- Suggests further investigation into how AI/LLMs can automate fuzzy matching at scale to reach 90%+ re-identification rates.
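As one illustration of what a differential privacy application might look like for civic data, the sketch below releases a noisy aggregate count (for example, registered voters per ZIP code) using the standard Laplace mechanism. This is a generic textbook technique and not something the paper specifies; the function name `noisy_count`, the epsilon value, and the count of 100 are assumptions for demonstration.

```python
import random

def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Return true_count plus Laplace noise with scale sensitivity / epsilon.

    Adding or removing one voter changes a count by at most 1, so the
    sensitivity of a counting query is 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise

rng = random.Random(0)  # fixed seed so the demonstration is repeatable
released = [noisy_count(100, epsilon=1.0, rng=rng) for _ in range(20000)]
average = sum(released) / len(released)
print(f"mean of noisy releases: {average:.2f}")
```

Smaller epsilon values add more noise and give stronger privacy. Note that this protects published aggregates only; it does not by itself address the row-level disclosure that the study examines.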
## Limitations
- The study focused on two specific counties; results may vary in states with stricter "Address Confidentiality Programs" (ACP).
- The researcher noted that match rates would likely be significantly higher if utilizing commercial-grade "data broker" tools rather than simple Python scripts.
## Real-world Applications
- **Political Discrimination:** Employers cross-referencing applicants with primary ballot history to filter for political alignment.
- **Targeted Harassment:** Using unique turnout patterns to identify and dox individuals based on their civic participation.
## Future Work
- Evaluating the effectiveness of the proposed "Secure Data Act" and other federal privacy frameworks.
- Expanding the study to include the role of AI in bridging diverse, non-structured datasets (e.g., matching voter rolls to leaked credentials).
## References
- Sweeney, L. (2000). *Simple Demographics Often Identify People Uniquely*.
- Kenney, N. M. *Public Voting Records: A Record, or an Attack Surface?* (noahkenney[.]com/research-voter-privacy[.]html)