zlacker

Can't they pull the data from archive.org?

replies(3): >>notaco+a1 >>KuiN+r6 >>SllX+B9

>>ameliu+(OP)
That would be worse.

>>ameliu+(OP)
Archive.org was knocked offline the other day due to some AI startup scraping it to death. It’s not a good thing.

replies(1): >>moneyw+Fx

>>ameliu+(OP)
Archive.org is a non-profit without the capacity to serve that many requests. An excellent resource for people to use carefully, but not a treasure trove for bots to scrape down to the last bit.

replies(1): >>notpus+si1

>>KuiN+r6
Source? They don’t rate limit.

replies(3): >>Kon-Pe+gz >>pipers+zF >>edgyqu+5L

>>moneyw+Fx
https://news.ycombinator.com/item?id=36110527

>>moneyw+Fx
True - and their lack of rate limiting ended up letting someone overwhelm their servers, knocking them offline.

>>moneyw+Fx
They put out a blog post afterwards asking people not to scrape. A simple Google search would be much faster than asking for sources.

>>SllX+B9
It would be cool if they introduced some reasonably priced access for mass scrapers. It should bring in some nice income in addition to donations, and it would be a valuable service to the community.