Tell HN: We should start to add “ai.txt” as we do for “robots.txt”
1. TechBr+Nr 2023-05-10 14:36:37
>>Jeanne+(OP)
If AI is using training data from your site, presumably it got that data by crawling it. So either it's already respecting robots.txt, in which case ai.txt would be redundant, or it's ignoring it, in which case there's no reason to expect it would respect ai.txt any more than it did robots.txt.
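
For reference, this is roughly what "respecting robots.txt" looks like from the crawler's side. It is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt content, the `ExampleAIBot` and `SomeSearchBot` user-agent names, and the example.com URLs are made up for illustration and do not describe any real crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) AI crawler entirely,
# and keep every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A well-behaved crawler asks before every fetch.
ai_allowed = parser.can_fetch("ExampleAIBot", "https://example.com/articles/1")
search_allowed = parser.can_fetch("SomeSearchBot", "https://example.com/articles/1")
private_allowed = parser.can_fetch("SomeSearchBot", "https://example.com/private/x")

print(ai_allowed, search_allowed, private_allowed)
```

Nothing enforces the check: a crawler that never calls `can_fetch` simply fetches the page, and the same would be true of any ai.txt.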
◧◩
2. zeroto+qw 2023-05-10 14:55:48
>>TechBr+Nr
robots.txt is about crawling; ai.txt would presumably be either augmentative metadata or specific copyright terms of use covering AI uses.
◧◩◪
3. LawTal+OQ 2023-05-10 16:20:21
>>zeroto+qw
> specific copyright terms of use

There's no such thing. Without a license you can't enforce any restrictions.

AI training is basically just building a very complex Markov chain. That's obviously not a copyright violation, because the output product doesn't contain the input, only data about it. If your text has been copied, then please point to it in these weights here.
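
To make the "data about it" point concrete, here is a toy word-level Markov chain. It is only a sketch: the `train_bigram_model` helper and the sample sentence are hypothetical, and large language models are not literally built this way. It just shows what such a model stores after training, namely counts of which word follows which.

```python
from collections import Counter, defaultdict


def train_bigram_model(text: str) -> dict:
    """Count, for each word, how often each other word follows it."""
    transitions = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        transitions[current][following] += 1
    return transitions


# Made-up training text, for illustration only.
model = train_bigram_model("the cat sat on the mat")

# The trained model is a table of transition counts, not a copy of the string.
for word, followers in model.items():
    print(word, dict(followers))
```

Whether a table like this counts as "containing" the text it was built from is exactly what the replies below argue about.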

◧◩◪◨
4. throwa+691 2023-05-10 17:40:52
>>LawTal+OQ
Markov, shmarkov: either you need those original works or you don't. If you can build your Markov chain without them, please go ahead.

But we all know that without these original works such a tool cannot exist, even in principle. The works are the key ingredient, so please explain how we are not looking at these works being exploited commercially and copyright being violated.

The output product is an automatically created derivative work. Copyright very much applies, especially since the tool is used to generate derivative works for profit (as in the case of OpenAI/Microsoft).
