How do you differentiate an AI crawler from a normal crawler? Almost all LLMs are trained on Common Crawl, and the concept of LLMs didn't even exist when CC started. What about a crawler that builds a search database whose results are then fed into an LLM as context? Or middleware that fetches data in real time?
Honestly, that's a terrible idea, and robots.txt can cover those use cases. But it's still pretty ineffective, because it's more a set of suggestions than rules that must be followed.
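For what it's worth, blocking AI crawlers via robots.txt today mostly comes down to listing their user agents, something like the sketch below (GPTBot and CCBot are examples of crawlers with published identifiers; the list is illustrative, not exhaustive). Nothing technically forces any crawler to honor it, which is exactly the "suggestions, not rules" problem:

```
# Illustrative robots.txt: asks specific AI crawlers not to fetch anything.
# Compliance is entirely voluntary -- a crawler that ignores this file
# faces no technical barrier at all.
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

# Everyone else is unaffected.
User-agent: *
Allow: /
```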