zlacker

[return to "Tell HN: We should start to add “ai.txt” as we do for “robots.txt”"]
1. samwil+H5 2023-05-10 12:56:05
>>Jeanne+(OP)
Using robots.txt as a model for anything doesn't work. A robots.txt is only a polite request to follow the rules in it; there is no "legal" agreement to follow those rules, only a moral imperative.

Robots.txt has failed as a system, if it hadn't we wouldn't have captchas or Cloudflare.

In the age of AI we need to better understand where copyright applies to it, and potentially need reform of copyright to align legislation with what the public wants. We need test cases.

The thing I somewhat struggle with is that after 20-30 years of calls for shorter copyright terms and fewer restrictions on publicly accessible content and what you can do with it, we are now in a situation where the arguments are quickly leaning the other way. "We" now want stricter copyright law when it comes to AI, but at the same time shorter copyright duration...

In many ways an ai.txt would be worse than doing nothing, as it's a meaningless veneer that would be ignored but pointed to as the answer.

◧◩
2. brooks+G8 2023-05-10 13:10:27
>>samwil+H5
> Robots.txt has failed as a system, if it hadn't we wouldn't have captchas or Cloudflare.

Failing to solve every problem does not mean a solution is a failure.

From sunscreen to seatbelts, the world is full of great solutions that occasionally fail due to statistics and large numbers.

◧◩◪
3. usrusr+Ci 2023-05-10 13:57:30
>>brooks+G8
That's still not an argument to introduce ai.txt, because everything a hypothetical ai.txt could ever do is already done just as well (or not) by the robots.txt we have. If a training data crawler ignores robots.txt, it won't bother checking for an ai.txt either.

And if you feel like rolling out the "welcome friend!" doormat to a particular training data crawler, you are free to dedicate as detailed a robots.txt block as you like to its user agent header of choice. No new conventions needed, everything is already in place.
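
To make the per-user-agent point concrete, here is a rough sketch using Python's standard `urllib.robotparser`. The crawler name `ExampleTrainingBot`, the paths, and the `allowed` helper are all made up for illustration; no real crawler is implied. It also shows where the check happens: on the crawler's side, voluntarily.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one dedicated block for an imaginary
# training-data crawler, plus a default block for everyone else.
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Allow: /articles/
Disallow: /

User-agent: *
Disallow: /private/
"""

# Parse the rules from a string instead of fetching them over HTTP.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if the parsed rules permit `agent` to fetch `path`.

    Only a well-behaved crawler runs a check like this before each
    request; nothing on the server side enforces the answer.
    """
    return parser.can_fetch(agent, "https://example.com" + path)


if __name__ == "__main__":
    for agent, path in [
        ("ExampleTrainingBot", "/articles/post-1"),
        ("ExampleTrainingBot", "/drafts/post-2"),
        ("SomeOtherBot", "/drafts/post-2"),
        ("SomeOtherBot", "/private/notes"),
    ]:
        print(agent, path, allowed(agent, path))
```

Note that this parser applies the first matching rule within a block, so the `Allow` line has to come before the blanket `Disallow: /`. Other parsers may resolve such conflicts differently.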

◧◩◪◨
4. michae+pz 2023-05-10 15:08:53
>>usrusr+Ci
This seems to be assuming a very different purpose for ai.txt than the OP proposed. It sounds like they are intending ai.txt to give useful contextual information to crawlers collecting AI training data. Robots.txt does not have any of this information (although I suppose you could include it in comments).