zlacker

[return to "Ask HN: What'd you do while HN was down?"]
1. margin+37[view] [source] 2022-07-08 21:04:35
>>quicks+(OP)
A while back I changed my search engine's crawl data to be ZSTD-compressed JSON. It's a bit finicky to work with, but I'm beginning to realize just how powerful this is.

Could literally just do

  find -name \*.zstd -exec zstdcat {} \; |
    jq 'first(select(.doc|select(.!=null)|.[].headers|select(.!=null)|test("[xX]-[aA]dblock-[kK]ey")))'
and it spewed out samples of domains with a header like X-Adblock-Key. (I'm not great with jq, so there's probably a better way of doing this, but this unga bunga approach works too.)
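
For what it's worth, one possibly tidier way to write the same filter is sketched below. It assumes what the one-liner above implies about the data: each record has a `.doc` that is either null or a collection of entries, and each entry's `.headers` is either null or a single string. The function name `adblock_key_records` is made up for illustration. It uses jq's `any(generator; condition)` and the `"i"` flag of `test` for case-insensitive matching, in place of the nested `select` calls and the hand-written character classes.

```shell
# Hypothetical helper (name invented for this sketch): reads JSON records on
# stdin and prints, one per line, every record in which at least one entry of
# .doc has a headers string mentioning X-Adblock-Key, in any letter case.
# Assumes .doc is null or a collection of objects, and .headers is null or a string.
adblock_key_records() {
  # .doc[]?          iterate the entries, yielding nothing when .doc is null
  # .headers // ""   treat missing or null headers as an empty string
  # test(...; "i")   case-insensitive regex match
  jq -c 'select(any(.doc[]?; (.headers // "") | test("x-adblock-key"; "i")))'
}

# Usage, assuming the same file layout as the original one-liner:
#   find . -name '*.zstd' -exec zstdcat {} + | adblock_key_records
```

Passing `+` rather than `\;` to `-exec` lets find hand many files to a single `zstdcat` call instead of spawning one process per file. Whether that makes a noticeable difference depends on how many files there are.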

Specifically, today I did some research on a few tags and headers supposedly associated with "Acceptable Ads" (a standard for showing ads through complicit adblockers), and ended up with a fairly reliable fingerprint for a network of domain squatters that have been a nuisance in my search engine database. Turns out they're basically the only ones that use the headers and tags I was looking at, so now I'm onto their IP ranges as well.

2. Bonobo+4D[view] [source] 2022-07-08 23:12:14
>>margin+37
So the domain squatter pays to let adblockers show ads on their sites?