Information Theory / Deanonymization
Latanya Sweeney (1997)
87% of the U.S. population is uniquely identifiable from just three fields: 5-digit ZIP code, gender, and date of birth.
Golle (2006) revisited the estimate and found 63% uniquely identifiable from the same three fields, using a different methodology.
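A minimal sketch of how a uniqueness figure like the ones above can be measured: count how many records have a combination of quasi-identifiers that no other record shares. The records, field names, and the `unique_fraction` helper are made up for illustration; this is not the method used by Sweeney or Golle, who worked from census data.

```python
from collections import Counter

def unique_fraction(records, keys):
    """Return the fraction of records whose combination of `keys` is unique."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)

# Hypothetical toy records (not real people).
people = [
    {"zip": "02138", "gender": "F", "dob": "1965-07-31"},
    {"zip": "02138", "gender": "M", "dob": "1945-03-02"},
    {"zip": "02139", "gender": "F", "dob": "1990-01-15"},
    {"zip": "02139", "gender": "F", "dob": "1990-01-15"},
]
QUASI_IDENTIFIERS = ("zip", "gender", "dob")

# Two of the four records share all three fields, so half are unique.
print(unique_fraction(people, QUASI_IDENTIFIERS))  # 0.5
```

Dropping fields from `keys` shows how quickly uniqueness falls: with gender alone, only the single "M" record in this toy set is unique.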
Bits of Information
- ZIP code: ~23.8 bits
- Birthday (month and day, no year): ~8.5 bits (log2 of 365)
- Gender: ~1 bit
- Total: ~33.3 bits
- U.S. population (~330M): ~28.3 bits (log2 of 330,000,000) are enough to single out one person
- ZIP + birthday alone: ~32.3 bits, already above that threshold
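The arithmetic in this list can be checked with a few lines of Python. This is only a sketch: the per-field values are the estimates quoted above, taken as given, and the 330M population is the rounded figure from the list. Adding bits across fields assumes the fields are statistically independent, which is an approximation.

```python
import math

US_POPULATION = 330_000_000  # rounded figure from the list above

# Per-field estimates quoted above (assumed, not derived here).
field_bits = {"zip": 23.8, "birthday": 8.5, "gender": 1.0}

def surprisal_bits(probability):
    """Bits of identifying information carried by an event of this probability."""
    return -math.log2(probability)

bits_needed = math.log2(US_POPULATION)                    # about 28.3
total_bits = sum(field_bits.values())                     # about 33.3
zip_plus_birthday = field_bits["zip"] + field_bits["birthday"]  # about 32.3

print(f"bits needed to single out one person: {bits_needed:.1f}")
print(f"ZIP + birthday + gender: {total_bits:.1f}")
print(f"ZIP + birthday only: {zip_plus_birthday:.1f}")
print(f"birthday from 1/365: {surprisal_bits(1 / 365):.2f}")
print(f"gender from 1/2: {surprisal_bits(1 / 2):.0f}")
```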
Browser Fingerprinting
EFF Panopticlick (2010, now “Cover Your Tracks”) tested 470,161 browsers:
- 83.6% uniquely identifiable
- 94.2% unique among users with Flash or Java enabled
- Fingerprint distribution: at least 18.1 bits of entropy (identifying information)
AmIUnique.org (Laperdrix et al., 2014): 89.4% of desktop browsers unique.
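Figures like "share unique" and "bits of entropy" can both be computed from a list of observed fingerprints. The sketch below uses a made-up four-element sample and a hypothetical `fingerprint_stats` helper; it illustrates the definitions only and is not the analysis code behind either study.

```python
import math
from collections import Counter

def fingerprint_stats(fingerprints):
    """Return (Shannon entropy in bits, share of observations that are unique)."""
    counts = Counter(fingerprints)
    total = len(fingerprints)
    # Entropy is the expected surprisal, -log2(p), over the observed distribution.
    entropy = sum((c / total) * -math.log2(c / total) for c in counts.values())
    # A fingerprint seen exactly once identifies exactly one observation.
    unique_share = sum(1 for c in counts.values() if c == 1) / total
    return entropy, unique_share

# Hypothetical sample: "a" is shared by two browsers, "b" and "c" are unique.
sample = ["a", "a", "b", "c"]
entropy, unique_share = fingerprint_stats(sample)
print(entropy)       # 1.5
print(unique_share)  # 0.5
```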
Fingerprint components: canvas rendering, installed fonts, screen resolution, color depth, WebGL renderer/vendor, timezone, language, AudioContext API, User-Agent string, HTTP Accept headers.
No cookies required. No opt-out mechanism. Passive collection.
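To illustrate why no cookie is needed, here is a minimal sketch of how a set of collected attributes could be reduced to a stable identifier: serialize the attributes in a canonical order and hash them. The attribute names and values are hypothetical, and hashing with SHA-256 is an assumption for illustration, not the scheme used by Panopticlick, AmIUnique, or any particular tracker.

```python
import hashlib
import json

def fingerprint_id(attributes):
    """Hash a dict of browser attributes into a stable hex identifier."""
    # Sorting keys makes the serialization independent of collection order.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical attribute values a server or script might observe.
browser = {
    "user_agent": "ExampleBrowser/1.0",
    "accept_language": "en-US,en;q=0.9",
    "timezone": "America/New_York",
    "screen": "1920x1080x24",
    "fonts": ["Arial", "Courier New", "Times New Roman"],
}

visitor_id = fingerprint_id(browser)
print(visitor_id)  # same attributes on a later visit yield the same ID
```

Any change to one attribute (a new font, a different timezone) yields a different hash, which is why real-world fingerprinting has to cope with fingerprints that drift over time.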
Sources
- https://www.eff.org/deeplinks/2010/01/primer-information-theory-and-privacy
- https://www.eff.org/deeplinks/2023/11/debunking-myth-anonymous-data
- https://www.johndcook.com/blog/2018/03/02/bits-of-information-in-age-birthday-and-birthdate/
- https://www.americanscientist.org/article/uniquely-me
- Golle 2006: https://www.privacylives.com/wp-content/uploads/2010/01/golle-reidentification-deanonymization-2006.pdf
- Eckersley 2010: https://panopticlick.eff.org/static/browser-uniqueness.pdf
- https://coveryourtracks.eff.org/
- https://amiunique.org/