access.txt will return an individual access key for the user agent, much like a session token, and the user agent can only crawl the site using that access key.
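A rough sketch of what handing out such a key could look like, assuming a hypothetical in-memory key store (issue_access_key and ACCESS_KEYS are made-up names, not part of any spec):

```python
# Sketch: access.txt issues a session-like key that all later requests must carry.
import secrets
import time

ACCESS_KEYS = {}  # access key -> metadata (issue time, rate-limit tier)

def issue_access_key(user_agent: str) -> str:
    """Issue an individual access key for this user agent."""
    key = secrets.token_urlsafe(32)
    ACCESS_KEYS[key] = {
        "user_agent": user_agent,
        "issued_at": time.time(),
        "tier": "default",  # ordinary browsing limits until proven otherwise
    }
    return key

def is_valid_key(key: str) -> bool:
    """Reject any crawl request that doesn't present a known access key."""
    return key in ACCESS_KEYS
```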
This would mean we could standardize session starts with rate limits. A regular user is unlikely to hit those rate limits, but bots would get rocked by the rate limiting.
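As a sketch of how that split could play out, here's a toy token bucket keyed by access key; the tier names and per-second numbers are invented purely for illustration:

```python
# Sketch: per-key rate limiting with a simple token bucket.
import time

TIER_LIMITS = {"default": 1.0, "crawler": 50.0}  # allowed requests per second (illustrative)

buckets = {}  # access key -> (tokens remaining, last refill timestamp)

def allow_request(key: str, tier: str = "default") -> bool:
    rate = TIER_LIMITS[tier]
    burst = rate * 10  # bucket capacity
    tokens, last = buckets.get(key, (burst, time.time()))
    now = time.time()
    tokens = min(burst, tokens + (now - last) * rate)  # refill since last request
    if tokens < 1:
        return False  # a bot hammering the site on a default key runs dry almost immediately
    buckets[key] = (tokens - 1, now)
    return True
```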
Great. Now authorized crawlers (Bing, Google, etc.) all use PKI so they can sign their request to access.txt to get their access key. If the access.txt request is signed by a known crawler, the rate limits can be loosened to levels a crawler will enjoy.
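A minimal sketch of that verification step, using Ed25519 signatures via the widely available cryptography package; the KNOWN_CRAWLER_KEYS table and the idea of signing the request body are assumptions, not an existing standard:

```python
# Sketch: check whether the access.txt request is signed by a known crawler.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey
from cryptography.exceptions import InvalidSignature

KNOWN_CRAWLER_KEYS: dict[str, Ed25519PublicKey] = {
    # "googlebot": Ed25519PublicKey.from_public_bytes(b"..."),  # pinned/fetched out of band
}

def tier_for_request(body: bytes, signature: bytes, claimed_crawler: str) -> str:
    """Return 'crawler' if the request verifies against a known crawler key, else 'default'."""
    public_key = KNOWN_CRAWLER_KEYS.get(claimed_crawler)
    if public_key is None:
        return "default"
    try:
        public_key.verify(signature, body)  # raises InvalidSignature on a bad signature
        return "crawler"  # loosen the rate limits for the key we issue
    except InvalidSignature:
        return "default"
```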
This will allow users/browsers to follow normal access patterns without any issue, while crawlers will have to request elevated rate limits to do their work. Crawlers and AI alike could be allowed or disallowed by the service owner, which is really what everyone wanted from robots.txt in the first place.
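The owner-controlled allow/disallow part could be as simple as a policy table consulted before issuing a key; the format below is invented, roughly mirroring what robots.txt tried to express:

```python
# Sketch: per-crawler policy the service owner controls.
CRAWLER_POLICY = {
    "googlebot": {"allowed": True, "tier": "crawler"},
    "bingbot": {"allowed": True, "tier": "crawler"},
    "some-ai-scraper": {"allowed": False},
}

def decide(claimed_crawler: str, signature_ok: bool) -> str | None:
    """Return the rate-limit tier to issue, or None to refuse a key entirely."""
    if not signature_ok:
        return "default"  # unverified agents just get ordinary user limits
    policy = CRAWLER_POLICY.get(claimed_crawler, {"allowed": False})
    if not policy["allowed"]:
        return None  # owner has disallowed this crawler outright
    return policy.get("tier", "default")
```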
One issue I see with this already is that it solidifies the existing search engines as the market leaders, since only crawlers whose keys are already widely recognized would get the elevated limits.
Seems like it relies on everyone playing by the rules and only requesting one access key per user. Why would a bot developer be incentivized to follow that rule instead of just requesting 1M keys?