Structural Points: Connective Tissue for the Essay

Carnegie Mellon study (Aleecia McDonald and Lorrie Faith Cranor, 2008, “The Cost of Reading Privacy Policies”): the average person would need ~76 work days per year to read every privacy policy they encounter. At that scale, consent is structurally meaningless. Every data collection practice documented in these research files occurs under terms almost nobody reads.

Public Records as the Base Layer

Most data collection builds on top of freely available public records:

  • Voter registration: public record in most states. Contains name, address, party affiliation, voting history (which elections, not which candidates). Political campaigns and data firms purchase in bulk.
  • Property records: real estate transactions, ownership, tax assessments — all public.
  • Court records: federal filings such as civil suits and bankruptcies are searchable online through PACER; arrests and divorces sit in state and local systems, many of them also searchable online. Arrest records appear even when charges are dismissed (see social-credit-fincen.md: FBI rap sheets 50% missing disposition data).
  • Business filings: corporate registrations, LLC filings, UCC filings — public.
  • Death records, marriage records, birth indices — public in many states.

These are the join keys. A name + address from voter rolls + purchase history from loyalty cards + location from phone + browsing from ISP + social graph from Facebook = complete profile. Each database owner says “we only have a small piece.” The pieces are trivially joinable against public records.
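
The join described above can be sketched in a few lines of Python. This is a toy illustration only: the records are invented, and the field names and the `normalize_key` and `build_profiles` helpers are assumptions made for the example, not a description of any real broker's pipeline.

```python
# Toy illustration: merging separately held datasets on a shared
# (name, address) key. All records, fields, and helpers are hypothetical.

def normalize_key(name, address):
    """Build a crude join key: lowercase and collapse repeated whitespace."""
    return (" ".join(name.lower().split()), " ".join(address.lower().split()))

# Each "owner" holds only a small piece.
voter_roll = [
    {"name": "Jane Doe", "address": "12 Elm St", "party": "Independent"},
    {"name": "John Roe", "address": "99 Oak Ave", "party": "Unaffiliated"},
]
loyalty_cards = [
    {"name": "JANE DOE", "address": "12 Elm  St", "purchases": ["prenatal vitamins"]},
]
location_pings = [
    {"name": "Jane Doe", "address": "12 elm st", "frequent_places": ["clinic", "gym"]},
]

def build_profiles(*datasets):
    """Merge every record sharing a normalized key into a single profile."""
    profiles = {}
    for dataset in datasets:
        for record in dataset:
            key = normalize_key(record["name"], record["address"])
            profile = profiles.setdefault(key, {})
            for field, value in record.items():
                if field not in ("name", "address"):
                    profile[field] = value
    return profiles

profiles = build_profiles(voter_roll, loyalty_cards, location_pings)
for key, profile in profiles.items():
    print(key, profile)
```

Even with inconsistent capitalization and spacing, the three fragments for the same person collapse into one record. Real-world matching is messier (nicknames, moves, typos), but the public-record key is what makes the merge possible at all.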

HIPAA’s Actual Limitations

HIPAA applies ONLY to “covered entities” (healthcare providers, health plans, and healthcare clearinghouses) and their business associates. NOT covered:

  • Google searches for symptoms
  • Health/fitness apps (Fitbit, Apple Health, MyFitnessPal)
  • Genetic testing (23andMe — not a covered entity)
  • Pharmacy loyalty cards (the purchase data, not the prescription)
  • Period tracking apps
  • Employer wellness programs (many operate outside HIPAA)
  • Data brokers compiling health-related purchase patterns

Most health-related data generated in 2026 is NOT protected by the law people assume protects it.

Data Permanence / No US Right to Be Forgotten

EU: GDPR Article 17 (right to erasure), effective May 2018. Individuals can demand deletion of personal data under specified conditions.

US: no federal equivalent. No right to erasure. Data collected in 2005 can remain in databases indefinitely. Every breach (OPM, Equifax, 23andMe) is permanent: the data cannot be “unbreached.” Biometrics (fingerprints, facial geometry) cannot be rotated like passwords.

California CCPA/CPRA (2020/2023) provides limited deletion rights for California residents. Virginia, Colorado, Connecticut, and Utah have similar but weaker laws. No comprehensive federal data privacy law as of 2026; federal protection remains sectoral (HIPAA, COPPA).

Children’s Data

COPPA (Children’s Online Privacy Protection Act, 1998): covers children under 13; poorly enforced. Children growing up today have comprehensive digital footprints before they can consent: school surveillance (GoGuardian: 27M students), social media, health apps, family DNA tests (23andMe captures relatives), smart home devices recording their voices.

The Aggregation Problem (Stated Simply)

No single database is “the problem.” The problem is that they’re all joinable:

  • Information theory (research/information-theory.md): ~33 bits are enough to uniquely identify anyone on Earth (2^33 ≈ 8.6 billion)
  • Each data source contributes bits
  • Public records provide the join keys (name, address, DOB)
  • Data brokers (research/data-brokers.md) exist specifically to perform the joins
  • The mosaic theory (research/metadata-kills.md) is the legal articulation
  • The quote from Michael Hayden (former NSA and CIA director) is the operational articulation: “We kill people based on metadata”
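
The “~33 bits” figure and the idea that each source “contributes bits” can be checked with a short calculation. This is a rough sketch under loud assumptions: a world population of about 8 billion, and three hypothetical attributes whose values are treated as equally likely and independent, which real data never is. The attribute list and counts are illustrative, not drawn from the research files.

```python
import math

# Assumption: roughly 8 billion people alive.
WORLD_POPULATION = 8_000_000_000

def bits_to_identify(population):
    """Bits needed to single out one member of a population."""
    return math.log2(population)

def bits_from_attribute(num_equally_likely_values):
    """Bits revealed by an attribute, assuming all values are equally likely."""
    return math.log2(num_equally_likely_values)

needed = bits_to_identify(WORLD_POPULATION)

# Hypothetical attributes and value counts, for illustration only.
attributes = {
    "birth day and month": 365,
    "birth year (80-year span)": 80,
    "postal area (1 of 40,000)": 40_000,
}

total = 0.0
for label, num_values in attributes.items():
    bits = bits_from_attribute(num_values)
    total += bits
    print(f"{label}: {bits:.2f} bits")

print(f"needed: {needed:.2f} bits, collected: {total:.2f} bits")
```

Under these simplified assumptions, three mundane attributes already cover most of the bits required. Real distributions are skewed and correlated, so actual figures differ, but the direction of the argument holds: no single field identifies a person, yet a handful combined comes close.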