Robots.txt has failed as a system; if it hadn't, we wouldn't have captchas or Cloudflare.
In the age of AI we need to better understand where copyright applies to it, and we potentially need copyright reform to align legislation with what the public wants. We need test cases.
The thing I somewhat struggle with is that after 20-30 years of calls for shorter copyright terms and fewer restrictions on publicly accessible content and what you can do with it, we are now in a situation where the arguments are quickly leaning the other way. "We" now want stricter copyright law when it comes to AI, but at the same time shorter copyright duration...
In many ways an ai.txt would be worse than doing nothing, as it's a meaningless veneer that would be ignored yet pointed to as the answer.
Meanwhile, now that the laws are inconvenient for them, tech companies are straight up skipping the work of labeling their training data to respect IP law. Labeling the data would be expensive, thereby eroding profits. The loss of usable data would also harm the efficacy of their models, and the time spent classifying the data would slow their iteration.
The ideas are only dissonant if you are looking at the trees (copyright term, DMCA, right to repair, etc.) and not the forest, which is a class struggle between a few thousand billionaires and the rest of humanity.