
Algorithm can pick out almost any American in supposedly anonymized databases
1. rzwits+e6 2019-07-24 10:58:05
>>zoobab+(OP)
I'm a programmer in the GP data analysis world. We use the term 'pseudonymization' for this kind of data. 'Anonymization' is used solely to refer to, say, 'the sum total of diabetes patients this practice has' (that would be anonymous patient data; it would not be anonymous relative to the GP office this refers to): Aggregated data that can no longer be reduced to a single individual at all.

The term raises questions: Okay, so, what does it mean? How 'pseudo' is pseudo? And that's the point: When you pseudonimize data, you must ask those questions and there is no black and white anymore.

My go-to example to explain this is very simple: Let's say we reduce birthdate info to just your birthyear, and geoloc info to just a wide area. And then I have a pseudonimized individual who is marked down as being 105 years old.

Usually there's only one such person.
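
A minimal sketch of that example, assuming made-up records and a hypothetical 1-degree grid cell as the "wide area" (none of these names or values come from a real dataset). It generalizes each record to birth year plus grid cell, then lists the combinations that still match exactly one person:

```python
import math
from collections import Counter
from datetime import date

# Hypothetical records: (patient_id, birthdate, latitude, longitude).
records = [
    ("p1", date(1985, 3, 14), 52.37, 4.90),
    ("p2", date(1985, 11, 2), 52.01, 4.36),
    ("p3", date(1985, 7, 30), 52.09, 4.31),
    ("p4", date(1914, 1, 9), 52.16, 4.49),  # 105 years old in 2019
]


def generalize(record):
    """Drop the ID, keep only birth year and a coarse 1-degree grid cell."""
    _patient_id, birthdate, lat, lon = record
    return (birthdate.year, math.floor(lat), math.floor(lon))


def unique_groups(rows):
    """Return generalized keys that are shared by exactly one record."""
    counts = Counter(generalize(row) for row in rows)
    return [key for key, n in counts.items() if n == 1]


print(unique_groups(records))
```

With these sample rows, the three people born in 1985 share one key, while the 1914 record is the only one in its group, so it can still be traced back to a single individual despite the coarsening.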

I invite everybody who works in this field to start using the term 'pseudonimization'.

◧◩
2. cheez+A6 2019-07-24 11:03:18
>>rzwits+e6
It doesn't roll off the tongue; perhaps pseudo-anonymization is enough.
◧◩◪
3. gervu+B8 2019-07-24 11:27:51
>>cheez+A6
Pseudonymization already refers to reference by pseudonym.

Pseudonimization is bad terminology in that it's indistinct from the above, to the point that parent has already mixed the two up while in the process of recommending it. And it'd be worse verbally.

"Pseudo-anonymization" could work, but something like "breakable anonymization" or "partial anonymization" might be better in that it's more obvious to a reader and doesn't rely on familiarity with technical terminology to convey the idea.

I'd go with breakable, myself, since it's most to the point about why it's a problem.

Pseudo is etymologically correct, but that doesn't necessarily help us much when the goal is ratio and ease of understanding by a wide population of readers.

Partial could work in the sense that you did part of the job, which people would hopefully understand is a bit like having locked the back door for the night while leaving the front propped wide open.

And there are probably other good options. If I was writing about this topic often, I'd strongly consider brainstorming a few more and running a user test where I ask random people to explain each term, then go with what consistently gets results closest to what I'm trying to discuss.

◧◩◪◨
4. washad+Tb 2019-07-24 11:59:07
>>gervu+B8
'Pseudononymization' is how my brain read it. It's a fairly self-explanatory portmanteau with little chance of confusion on the root.
◧◩◪◨⬒
5. gervu+kc 2019-07-24 12:03:45
>>washad+Tb
"Pseudonymization but with an extra syllable" sounds just as confusing as the other two. I wouldn't be sure which of those words I was looking at on first glance, which is what you'd need for it to be casually readable.