zlacker

Justice Department withdraws FBI subpoena for USA Today records ID'ing readers
1. xvecto+W4 2021-06-05 22:32:49
>>lxm+(OP)
I wish services didn't store IPs at all.

If abuse is an issue, why not hash the IP with a nonce?

2. nullc+O7 2021-06-05 23:00:43
>>xvecto+W4
There are only 2^32 IPv4 addresses, so if you know the nonce you can just try them all... no privacy provided.

If you don't know the nonce, you can't match against other users-- so not useful for abuse.
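A minimal sketch of the brute-force point, assuming the service stores SHA-256(nonce + IP). The hash construction, the function names `hash_ip` and `recover_ip`, and the example addresses are illustrative assumptions, not anything described in the thread. The search is limited to a /24 so it runs instantly; the same loop over all of 0.0.0.0/0 is just 2^32 iterations.

```python
import hashlib
import ipaddress
from typing import Optional


def hash_ip(ip: str, nonce: bytes) -> str:
    """Hash an IPv4 address together with a nonce (assumed scheme)."""
    packed = ipaddress.IPv4Address(ip).packed
    return hashlib.sha256(nonce + packed).hexdigest()


def recover_ip(target_hash: str, nonce: bytes, network: str) -> Optional[str]:
    """Try every address in `network` until one hashes to `target_hash`."""
    for addr in ipaddress.IPv4Network(network):
        if hashlib.sha256(nonce + addr.packed).hexdigest() == target_hash:
            return str(addr)
    return None


nonce = b"example-nonce"                # hypothetical value
stored = hash_ip("203.0.113.77", nonce)  # what the log would contain

# With the nonce, enumeration recovers the address.
print(recover_ip(stored, nonce, "203.0.113.0/24"))
# With a different nonce, nothing matches, but then hashes from two
# log sets cannot be compared against each other either.
print(recover_ip(stored, b"some-other-nonce", "203.0.113.0/24"))
```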

But I'm skeptical about the abuse use case. For commenters, sure-- you may need to store IPs to combat abuse. But for readers? At most you would need sampled data or in-memory counters (e.g. to catch high-volume bots).
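One way the in-memory counter idea could look, as a rough sketch: counts live only in process memory and are discarded at the end of each window, so nothing is written to disk. The class name `WindowedCounter`, the 60-second window, and the threshold are all assumptions made for illustration.

```python
from collections import Counter


class WindowedCounter:
    """Fixed-window request counter that never persists addresses."""

    def __init__(self, window_seconds: float = 60.0, threshold: int = 100):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.window_start = 0.0
        self.counts: Counter = Counter()

    def record(self, ip: str, now: float) -> bool:
        """Count one request; return True if `ip` exceeds the threshold."""
        if now - self.window_start >= self.window_seconds:
            # New window: forget every address seen so far.
            self.counts.clear()
            self.window_start = now
        self.counts[ip] += 1
        return self.counts[ip] > self.threshold


counter = WindowedCounter(window_seconds=60.0, threshold=3)
for t in (0.0, 1.0, 2.0, 3.0):
    flagged = counter.record("198.51.100.7", now=t)
print(flagged)  # the fourth request inside one window exceeds threshold=3
```

In a real server `now` would typically come from `time.monotonic()`.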

Unfortunately, there really isn't any penalty for failing to minimize private data collection.

3. xvecto+Fa 2021-06-05 23:31:42
>>nullc+O7
If you use a deliberately slow hash function that takes ~1 second to compute, it would take about 136 years of single-core time to iterate through the IPv4 address space. At the very least, this could cut down on dragnet surveillance.
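A sketch of what a deliberately slow IP hash might look like, using PBKDF2 from the Python standard library as a stand-in. The comment does not name an algorithm; the function name `slow_hash_ip` and the iteration count are assumptions, and the count would need tuning on the target hardware to reach roughly one second.

```python
import hashlib
import ipaddress
import time


def slow_hash_ip(ip: str, nonce: bytes, iterations: int = 1_000_000) -> str:
    """Stretch the hash of an IPv4 address with PBKDF2-HMAC-SHA256.

    `iterations` controls the cost; the default is a guess, not a
    measured value for any particular machine.
    """
    packed = ipaddress.IPv4Address(ip).packed
    return hashlib.pbkdf2_hmac("sha256", packed, nonce, iterations).hex()


start = time.perf_counter()
digest = slow_hash_ip("203.0.113.77", b"example-nonce", iterations=200_000)
elapsed = time.perf_counter() - start
print(f"{digest[:16]}... computed in {elapsed:.3f}s")
```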
4. gizmo6+Oc 2021-06-05 23:59:31
>>xvecto+Fa
This requires that you add ~1 second of latency to every request that requires you to hash the IP. Even if we assume relatively aggressive caching, this is unacceptable from a user experience perspective.

Assuming you do that, you are looking at about 1,193,046 hours (2^32 seconds) to hash the entire address space. More specifically, you are looking at 1,193,046 CPU-hours, and that work parallelizes trivially.

You can rent a 96 vCPU c5.24xlarge instance from AWS for a rate of $4.08/hour, or $0.0425/CPU-hour. Assuming this offers the same per-CPU hash rate as the general-purpose web server, you are looking at a cost of $50,704 to construct a complete lookup table. That is nowhere near a prohibitive sum of money.
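The arithmetic above can be checked with a few lines. The inputs are the figures quoted in this thread (1 second per hash, $4.08/hour for 96 vCPUs); the variable names are just for illustration, and the price is the one quoted here, not a current rate.

```python
ADDRESS_SPACE = 2**32          # number of IPv4 addresses
SECONDS_PER_HASH = 1.0         # assumed cost of one slow hash
INSTANCE_PRICE_PER_HOUR = 4.08 # quoted c5.24xlarge price
INSTANCE_VCPUS = 96

cpu_seconds = ADDRESS_SPACE * SECONDS_PER_HASH
cpu_hours = cpu_seconds / 3600
single_core_years = cpu_seconds / (365.25 * 24 * 3600)

price_per_cpu_hour = INSTANCE_PRICE_PER_HOUR / INSTANCE_VCPUS
total_cost = cpu_hours * price_per_cpu_hour

# Wall-clock time if the work is spread over the vCPUs of one instance.
one_instance_days = cpu_hours / INSTANCE_VCPUS / 24

print(f"CPU-hours:          {cpu_hours:,.0f}")
print(f"Single-core years:  {single_core_years:.1f}")
print(f"Price per CPU-hour: ${price_per_cpu_hour:.4f}")
print(f"Total cost:         ${total_cost:,.0f}")
print(f"Days on 1 instance: {one_instance_days:,.0f}")
```

The total cost does not depend on how many instances share the work; adding instances only shortens the wall-clock time.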

You can probably reduce the cost by shopping around for compute or using bare metal. You could see significant cost reductions by using hashing-optimized ASICs.

Combine this with the fact that no website is going to spend 1000ms just computing the hash for every request (even if you allow for caching), and the fact that an attacker can probably narrow down the address space they are interested in considerably if they want to save money.

2^32 is just too small an asymmetry between legitimate use and an attack to be a viable defense.

5. xvecto+ie 2021-06-06 00:15:49
>>gizmo6+Oc
From a user experience perspective, you can perform the computation asynchronously. There are also ASIC-resistant hash algorithms.

But yeah, everything else you said makes sense.

6. gizmo6+jj 2021-06-06 01:11:58
>>xvecto+ie
And now you have ~1000ms of latency between when an event happens and when you can log it. Even assuming all such events get logged, you will be left with a jumbled mess of out-of-order events.