Researchers Find 'Anonymized' Data Is Even Less Anonymous Than We Thought

Corporations love to pretend that ‘anonymization’ of the data they collect protects consumers. Studies keep showing that’s not really true. From a report:

Last fall, AdBlock Plus creator Wladimir Palant revealed that Avast was using its popular antivirus software to collect and sell user data. While the effort was eventually shuttered, Avast CEO Ondrej Vlcek first downplayed the scandal, assuring the public the collected data had been “anonymized” – or stripped of any obvious identifiers like names or phone numbers. “We absolutely do not allow any advertisers or any third party…to get any access through Avast or any data that would allow the third party to target that specific individual,” Vlcek said. But analysis from students at Harvard University shows that anonymization isn’t the magic bullet companies like to pretend it is.

*Dasha Metropolitansky and Kian Attari, two students at the Harvard John A. Paulson School of Engineering and Applied Sciences, recently built a tool that combs through vast troves of consumer datasets exposed in breaches. They built it for a class paper they’ve yet to publish.

“The program takes in a list of personally identifiable information, such as a list of emails or usernames, and searches across the leaks for all the credential data it can find for each person,” Attari said in a press release.

They told Motherboard their tool analyzed thousands of datasets from data scandals ranging from the 2015 hack of Experian to the hacks and breaches that have plagued services from MyHeritage to porn websites. Despite many of these datasets containing “anonymized” data, the students say that identifying actual users wasn’t all that difficult.

“An individual leak is like a puzzle piece,” Harvard researcher Dasha Metropolitansky told Motherboard. “On its own, it isn’t particularly powerful, but when multiple leaks are brought together, they form a surprisingly clear picture of our identities. People may move on from these leaks, but hackers have long memories.”*
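
To make the "puzzle piece" point concrete, here is a rough sketch of the kind of cross-leak lookup described in the excerpt. It is not the students' tool, which has not been published. The leak names, field names, records, and the `build_profiles` helper are all invented for illustration, and the sketch assumes every dataset shares a plain `email` field to join on.

```python
from collections import defaultdict

# Hypothetical toy "leaks": every name, field, and value below is made up.
LEAKS = {
    "leak_a": [
        {"email": "alice@example.com", "zip": "02138"},
        {"email": "bob@example.com", "zip": "10001"},
    ],
    "leak_b": [
        {"email": "alice@example.com", "birth_year": 1990},
    ],
    "leak_c": [
        {"email": "alice@example.com", "username": "alice90",
         "employer": "Example Corp"},
    ],
}


def build_profiles(leaks, identifiers, key="email"):
    """Merge every record whose `key` field matches one of `identifiers`.

    Returns {identifier: {field: (value, source_leak)}}. The first value
    seen for a field wins; later duplicates are ignored.
    """
    wanted = {ident.lower() for ident in identifiers}
    profiles = defaultdict(dict)
    for source, records in leaks.items():
        for record in records:
            ident = str(record.get(key, "")).lower()
            if ident not in wanted:
                continue
            for field, value in record.items():
                if field != key:
                    profiles[ident].setdefault(field, (value, source))
    return dict(profiles)


profiles = build_profiles(LEAKS, ["alice@example.com"])
for field, (value, source) in sorted(profiles["alice@example.com"].items()):
    print(f"{field}: {value} (from {source})")
```

No single toy dataset says much on its own, but joined on one shared identifier they yield a ZIP code, a birth year, a username, and an employer for the same person. That is why stripping names alone does little once any linking field survives.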