<Dilbert looks back with a blank stare>
---
Godspeed Scott. Thank you for all the laughs.
I've seen the pattern repeat with other data collection as well -- "anonymous" data collection or "anonymized" data almost never is.
It's been a fun exercise in software architecture, because I actually care about this. But we keep pushing the annual survey back another year, since we never seem to be ready to actually implement it (due to other priorities).
The thing is, as soon as you allow free-text entry, the whole anonymity exercise becomes moot, assuming you have a solid training corpus of emails to train a model on - basically the same approach Wikipedia editors were using two decades ago to identify "sockpuppet" accounts.
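To illustrate the idea (nobody's actual pipeline - the authors, emails, and survey comment below are all invented), character n-gram stylometry is often enough to link a "free-text" comment back to someone with a known writing sample:

```python
# Hedged sketch of stylometric authorship attribution.
# All data here is made up for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

emails = [
    "Per my last email, please action the attached items by EOD.",
    "hey - quick q, can u resend that doc?? thx!!",
    "Kindly review the attached and action per our discussion.",
    "soooo the build is broken again lol, can u take a look",
]
authors = ["alice", "bob", "alice", "bob"]

# Character n-grams capture punctuation, casing, and spelling habits,
# which persist even when the topic changes.
vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
X = vec.fit_transform(emails)
clf = LogisticRegression().fit(X, authors)

comment = ["per my last survey, management should kindly action our feedback"]
print(clf.predict(vec.transform(comment)))  # most likely ['alice']
```

With a real corpus of thousands of emails per employee rather than two toy samples each, this gets alarmingly accurate - which is exactly the point.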
So on the card I included with my gift, I signed the name of someone else in the class, then partially erased it. I made sure it was still somewhat legible and wrote "From your secret santa" beneath it.
They didn't believe the gift was from me even after the teacher provided them with the original draw, and their supposed gift giver identified someone else as their recipient.
Over the course of 4 years I think it was only used 3 times. Most people assumed it was some kind of trap. It wasn't; I genuinely wanted honest feedback and thought some people were too shy to speak up in a group setting, so I wanted to give them options.
Management can 'drill down' to get information on how specific teams responded.
One of the things they mentioned doing is using a statistical (differential privacy?) model to limit the drill-down depth, so that no specific person's responses can be revealed unless they were shared by a substantial number of other respondents.
That's surprisingly difficult when you consider, e.g., a team lead reading a statement like "of the 10 people in your team, one is highly dissatisfied with management" - they have personal knowledge of the situation and will know exactly which person it is.
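A sketch of what such a drill-down guard might look like - the threshold and epsilon are made-up values, not anything a real product necessarily uses:

```python
import numpy as np

MIN_GROUP_SIZE = 5   # made-up suppression threshold
EPSILON = 1.0        # made-up privacy budget; smaller = noisier

def dissatisfied_count(responses):
    """Return a noisy count of dissatisfied responses, or None if the
    group is too small to release anything at all."""
    if len(responses) < MIN_GROUP_SIZE:
        return None  # suppress small groups entirely
    true_count = sum(1 for r in responses if r == "dissatisfied")
    # A counting query changes by at most 1 when one person is added
    # or removed (sensitivity 1), so Laplace noise with scale
    # 1/epsilon gives epsilon-differential privacy for this one query.
    noisy = true_count + np.random.laplace(0.0, 1.0 / EPSILON)
    return max(0, round(noisy))

print(dissatisfied_count(["ok"] * 9 + ["dissatisfied"]))  # noisy count
print(dissatisfied_count(["dissatisfied"] * 3))           # None (suppressed)
```

Even then, the team-lead problem above remains: if the group is 10 people and the noise is small, side knowledge re-identifies the respondent, and every additional drill-down query spends more of the privacy budget.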
After some shuffling at work, I ended up spending some time under an awful manager. She approached me after an anonymous round of feedback and said "I noticed you wrote _____." I had, in fact, not written that.
On some level, having her guess wrong seemed even worse, but it also felt nice to be able to honestly say "I did not." Hopefully it taught her to respect anonymity next time.
They later decided to adopt it for an annual IT satisfaction survey that they sent out to users. In an ideal world we wouldn't have participated, because the respondents were grading my team's performance, but we got invites because we were on the Exchange distribution list the message went out to. I quickly discovered that the dev team had left a bunch of default routes enabled, so we were able to view a list of all responses and see who submitted which. We knew our customers well enough that we could reliably attribute most of the negative responses via the free-text comments field anyway, but the fact that anybody could explicitly see everybody else's responses wasn't great.
I suppose the NTLM-authenticated username in the server logs would convey the same info, but at least that would require CIFS/RDP access to the web server...
In reality, they cherry-picked the questions they wanted to talk about and ignored the hard ones. We could tell, because every submitted question was publicly visible in the app - but not all of them got answered ("ah, we're out of time").
So I once posted a question asking why the interns were unpaid while writing code we shipped to production. I posted it just after the previous town hall so that it would stay visible in the app for as long as possible until the next one, and would sit at the top of the list of pending questions.
For a couple of weeks they said they wanted to answer it but needed to ask clarifying questions to make sure they understood correctly, so could the asker please reveal themselves, as it's "only fair". I never said it was me, and nobody else claimed it either. They couldn't just delete the question like they usually did with unanswered questions, because it had stirred up quite a storm among employees, and deleting it would clash with the "we're open and fair" Kool-Aid they were serving us.
Eventually, they deleted the question without answering it, "since the asker doesn't have the courage to reveal themselves", and I was laid off, which was "totally unrelated to the question you asked".
Before leaving I dumped the database for that app out of curiosity. You bet that every single question also had an entry recording who asked it. They knew all along.
In most of the places I've worked, I would have assumed the same.
The thing is that there is no real technological solution that will instill trust in someone who doesn't already have it. In the end, all such privacy solutions necessarily boil down to "trust us", because it's not practical or reasonable to perform the sort of deep analysis that would be required to confirm the privacy claims.
You may have provided the source, for instance, but that doesn't give any assurance that the binary actually executing was compiled from that source.
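One partial mitigation is reproducible builds: rebuild from the published source yourself and compare digests against what's actually shipped. A sketch, with hypothetical file paths:

```python
import hashlib

def sha256_of(path):
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical paths: a binary you compiled yourself from the
# published source, and the binary the vendor actually ships.
local = sha256_of("build/survey-server")
shipped = sha256_of("vendor/survey-server")
print("match" if local == shipped else "MISMATCH: binary != published source")
```

But that only helps if the build is bit-for-bit reproducible, and even then you have to trust that the server is really running the binary you hashed - so it ultimately circles back to "trust us".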