zlacker

Google’s nightmare “Web Integrity API” wants a DRM gatekeeper for the web
1. rpdill+sl 2023-07-24 23:02:21
>>jakobd+(OP)
I've been thinking about this for a few days but just realized that this is a complete end run around all web scraping in general.

All 'adversarial compatibility' from projects like Nitter, Teddit, Invidious, and youtube-dl goes out the window. Any archive site (archive.org, archive.ph, etc.) can be blocked by sites requiring attestation.

And just like the book industry was terrified of piracy and was 'rescued' by Kindle, so too will journalism outlets that can't find a business model flock to Google to save them.

This is going to be rough.

◧◩
2. userbi+dt 2023-07-25 00:02:10
>>rpdill+sl
> Any archive site (archive.org, archive.ph, etc.) can be blocked by sites requiring attestation.

If such a thing actually happens, the underground market for "trusted device" farms will grow, not too different from what's already happening but possibly at a far larger scale. Of course, that means the financially motivated scraping services keep going while the honest individuals who want user-agent freedom get screwed, just like with many other forms of DRM...

◧◩◪
3. wrapti+tL 2023-07-25 02:17:47
>>userbi+dt
This has been happening already. The market is trying really hard to price out web scraping through scraper detection technologies and it's kinda working - scraping is becoming non-existent in user-space apps. It's also extremely discriminatory. Try running a single scrape from a developing country's IP on Linux, you'll be blocked at the TLS step lol
◧◩◪◨
4. CalRob+G31 2023-07-25 05:12:24
>>wrapti+tL
But of course search engines are fine
◧◩◪◨⬒
5. wrapti+vj1 2023-07-25 07:36:25
>>CalRob+G31
Having your cake and eating it too is a natural goal of every business, and honestly it was just a matter of time till websites figured out they can have the benefits of public data and avoid the costs. Web scraping and botting are basically a solved problem too - just put the data behind a login gate, which lets you litigate against scrapers and bots. Done. However, nobody wants to lose the benefits of public data, so here we are.