zlacker

1. helsin (OP) 2023-07-08 06:16:15
> I notice they don't actually give a good reason that robots.txt isn't suitable

It's kind of implied: specifying sitemaps, permissions, and copyright terms for different use cases (search, scraping, republishing, training, etc.), and perhaps standardizing some of the non-standard extensions: Crawl-delay, a default Host, and even Sitemap, which isn't part of the robots.txt standard.
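As a side note, Python's stdlib parser already recognizes two of those non-standard extensions (Crawl-delay and Sitemap), which shows how widely they're supported in practice despite being absent from the standard. A minimal sketch, using a hypothetical robots.txt:

```python
# Sketch: urllib.robotparser handles Crawl-delay and Sitemap,
# even though neither is part of the robots.txt standard (RFC 9309).
# The robots.txt content below is hypothetical.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
Sitemap: https://example.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("*", "https://example.com/private/page"))  # False
print(rp.crawl_delay("*"))                                    # 10
print(rp.site_maps())  # ['https://example.com/sitemap.xml']
```

Per-use-case permissions (search vs. training vs. republishing) have no equivalent here, which is the gap the proposal is about.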

> We believe it’s time for the web and AI communities to explore additional machine-readable means for web publisher choice and control for emerging AI and research use cases.
