Information Theory / Deanonymization

Latanya Sweeney (1997)

87% of the U.S. population is uniquely identifiable by the combination of 5-digit ZIP code, gender, and date of birth (estimate based on 1990 census data).

Golle (2006) found 63% uniquely identifiable from the same three fields, using 2000 census data and a different methodology.

Bits of Information

  • ZIP code: ~23.8 bits
  • Birthday: ~8.5 bits
  • Gender: ~1 bit
  • Total: ~33.3 bits
  • U.S. population (~330M): ~28.3 bits (log2 of 330,000,000) are needed to uniquely identify one person
  • ZIP + birthday alone = ~32.3 bits
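
The arithmetic behind these figures can be checked with a short sketch. It assumes birthdays are uniform over a 365.25-day year and gender is an even two-way split. The ZIP figure is copied from the list as given rather than re-derived, since it depends on the size of the ZIP code and on the reference population it is measured against. The function names are illustrative only.

```python
import math

def bits_to_identify(population: int) -> float:
    """Bits needed to single out one member of a population of this size."""
    return math.log2(population)

def surprisal_bits(probability: float) -> float:
    """Bits of information gained by learning a fact that holds with this probability."""
    return -math.log2(probability)

US_POPULATION = 330_000_000  # figure from the list above

needed = bits_to_identify(US_POPULATION)
birthday = surprisal_bits(1 / 365.25)  # assumption: birthdays uniform over the year
gender = surprisal_bits(1 / 2)         # assumption: even two-way split

zip_bits = 23.8  # taken from the list as given, not re-derived here
total = zip_bits + birthday + gender

print(f"needed to identify one U.S. resident: {needed:.1f} bits")
print(f"birthday: {birthday:.1f} bits, gender: {gender:.1f} bit")
print(f"ZIP + birthday + gender: {total:.1f} bits")
```

Bits from independent facts add, which is why the total is a plain sum. Correlated facts (for example, ZIP code and language) contribute less than their individual figures suggest.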

Browser Fingerprinting

EFF Panopticlick (2010, now “Cover Your Tracks”) tested 470,161 browsers:

  • 83.6% uniquely identifiable
  • 94.2% unique among users with Flash or Java enabled
  • Fingerprint distribution: at least 18.1 bits of entropy (identifying information)
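
As a rough illustration of how such figures are computed, the sketch below measures the Shannon entropy and the unique fraction of a list of observed fingerprints, and converts a bit count into a "one in N" figure. The sample data is a made-up toy list, not Panopticlick data, and the function names are hypothetical.

```python
import math
from collections import Counter

def entropy_bits(fingerprints):
    """Shannon entropy (in bits) of the observed fingerprint distribution."""
    counts = Counter(fingerprints)
    total = len(fingerprints)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def unique_fraction(fingerprints):
    """Fraction of observations whose fingerprint appears exactly once."""
    counts = Counter(fingerprints)
    return sum(1 for c in counts.values() if c == 1) / len(fingerprints)

def one_in_n(bits):
    """Size of the crowd that a given number of bits can single one member out of."""
    return 2 ** bits

# Toy sample: each string stands in for one browser's fingerprint (made-up data)
sample = ["a", "a", "b", "c", "d", "d", "d", "e"]

print(f"entropy: {entropy_bits(sample):.2f} bits")
print(f"unique: {unique_fraction(sample):.1%}")
print(f"18.1 bits is roughly one in {one_in_n(18.1):,.0f}")
```

In this framing, 18.1 bits corresponds to one browser in a few hundred thousand, which is the same order of magnitude as the tested sample.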

AmIUnique.org (Laperdrix et al., 2014): 89.4% of desktop browsers unique.

Fingerprint components: canvas rendering, installed fonts, screen resolution, color depth, WebGL renderer/vendor, timezone, language, AudioContext API, User-Agent string, HTTP Accept headers.
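
A minimal sketch of how collected components might be combined into one identifier: serialize the attributes in a canonical order and hash the result. This shows only the combining step, under the assumption that the attribute values have already been gathered. The attribute names and values below are hypothetical, and real fingerprinting scripts may combine components differently.

```python
import hashlib
import json

def fingerprint_id(attributes: dict) -> str:
    """Hash a dict of browser attributes into a stable hex identifier."""
    # Sort keys so the same attributes always serialize to the same string
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical attribute values, for illustration only
browser = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "color_depth": 24,
    "timezone": "UTC+1",
    "language": "en-US",
    "fonts": ["Arial", "Courier New", "Times New Roman"],
}

print(fingerprint_id(browser))
```

Because the identifier is recomputed from the attributes on every visit, nothing needs to be stored on the client. The trade-off is fragility: changing any single attribute (a new font, a different screen) yields a different hash.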

No cookies required. No opt-out mechanism. Passive collection.

Sources