zlacker

1. Commun+(OP) 2019-07-25 10:45:29
Part of the problem here too is that we may be dealing with more than one data set. If there is enough information overlap in those sets then you could map a record from one set to another. In that situation you'd have more data about each person, reducing the k-anonymity of the combined data sets to below the k-anonymity of either individual data set.
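To make the linking effect concrete, here is a minimal sketch. Everything in it is hypothetical: the two toy data sets, the shared `id` field standing in for whatever overlapping information lets records be matched, and the helper names `k_anonymity` and `join_on`. Each set on its own is 2-anonymous, but the joined set drops to 1-anonymous.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same values for the given quasi-identifier fields."""
    groups = Counter(tuple(r[f] for f in quasi_identifiers) for r in records)
    return min(groups.values())

def join_on(left, right, key):
    """Merge records from two data sets that share the same key value."""
    right_by_key = {r[key]: r for r in right}
    return [{**l, **right_by_key[l[key]]} for l in left if l[key] in right_by_key]

# Hypothetical data set A: only knows zip codes.
set_a = [
    {"id": 1, "zip": "12345"},
    {"id": 2, "zip": "12345"},
    {"id": 3, "zip": "67890"},
    {"id": 4, "zip": "67890"},
]

# Hypothetical data set B: only knows gender.
set_b = [
    {"id": 1, "gender": "F"},
    {"id": 2, "gender": "M"},
    {"id": 3, "gender": "F"},
    {"id": 4, "gender": "M"},
]

k_a = k_anonymity(set_a, ["zip"])        # every zip appears twice
k_b = k_anonymity(set_b, ["gender"])     # every gender appears twice

joined = join_on(set_a, set_b, "id")
k_joined = k_anonymity(joined, ["zip", "gender"])  # every (zip, gender) pair is unique

print(k_a, k_b, k_joined)
```

In real data the link would rarely be a clean shared key. It would more likely be a handful of overlapping fields, but the effect on k is the same once records can be matched.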

That concerns me most around places that process data for other companies (e.g., Cambridge Analytica, Facebook, Google, Amazon). These places could have access to many different data sets relating to a person, and could potentially combine these data sets to uniquely identify a single individual.

I recently tried a tool where I entered a fake zip, birth date, and gender. Based on statistical probabilities, it gave a 68% chance of a large data set having 1-anonymity for that combination (that is, of the record being unique). It wasn't clear what they were considering large, so it could be bogus, but if true, imagine what could easily be done with 10+ fields (e.g., zip, birth date, gender, married?, # of children, ever smoked?, deductible amount, diabetes?, profession, BMI).
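I don't know how that tool computed its number, but a rough back-of-the-envelope version is easy to sketch. The assumption here (mine, and unrealistic) is that every combination of field values is equally likely and fields are independent. The field cardinalities and the population size are illustrative round numbers, not real statistics, and `prob_unique` is a hypothetical helper name.

```python
import math

def prob_unique(population, combinations):
    """Probability that one given record shares its field values with
    nobody else, assuming all combinations are equally likely.
    Equivalent to (1 - 1/combinations) ** (population - 1).
    Requires combinations > 1."""
    return math.exp((population - 1) * math.log1p(-1.0 / combinations))

# Illustrative (made-up) number of distinct values per field.
three_fields = {"zip": 40_000, "birth_date": 365 * 80, "gender": 2}
extra_fields = {"married": 2, "ever_smoked": 2, "diabetes": 2}

population = 1_000_000  # hypothetical size of a "large" data set

combos_three = math.prod(three_fields.values())
combos_more = combos_three * math.prod(extra_fields.values())

p_three = prob_unique(population, combos_three)
p_more = prob_unique(population, combos_more)

print(f"3 fields: {p_three:.6f}")
print(f"6 fields: {p_more:.6f}")
```

Real data is far from uniform (people cluster in a few zips and age ranges), so actual figures will differ. The point is only that each added field multiplies the number of possible combinations, which pushes the chance of being unique toward 1.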

The earlier poster is right: only aggregate data is truly anonymous.
