Google to explore alternatives to robots.txt
1. dbette+u2 2023-07-08 06:04:58
>>skille+(OP)
I notice they don't actually give a good reason that robots.txt isn't suitable.

Change for the sake of it?

2. vore+X2 2023-07-08 06:09:15
>>dbette+u2
To steelman this maybe, I think they’re angling for something like a mechanism to indicate content is OK to index but not OK to use as AI training data. Maybe you could fudge it today with user agents in robots.txt but who knows what the concrete idea of this is.
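
For illustration, the "fudge it with user agents" approach might look like the sketch below. It uses Python's standard `urllib.robotparser` to evaluate a small robots.txt. The crawler name `ExampleAIBot` is hypothetical and only stands in for an AI-training crawler that chooses to honor robots.txt; nothing here is a mechanism Google has proposed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI-training crawler by its user agent,
# while leaving every other crawler (including search indexers) allowed.
# "ExampleAIBot" is an invented name used only for this example.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(agent: str, url: str) -> bool:
    """Return True if `agent` may crawl `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return bool(parser.can_fetch(agent, url))


if __name__ == "__main__":
    page = "https://example.com/article"
    print("ExampleAIBot:", can_fetch("ExampleAIBot", page))
    print("Googlebot:", can_fetch("Googlebot", page))
```

This only works per crawler, not per purpose: if one crawler fetches pages for both search indexing and model training, a user-agent rule cannot separate the two uses.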
3. varenc+f6 2023-07-08 06:50:33
>>vore+X2
robots.txt is already outmoded. It can only indicate that content shouldn't be crawled, but a URL marked this way can still be indexed. As Google says, "it is not a mechanism for keeping a web page out of Google" [0]. You need to use something other than robots.txt to prevent indexing.

[0] https://developers.google.com/search/docs/crawling-indexing/...
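
As a rough sketch of the "other things" mentioned above: the usual way to ask for a page to be kept out of the index is a `noindex` directive, sent either in a `<meta name="robots">` tag or in an `X-Robots-Tag` response header. The helper below is hypothetical (the name `is_noindex` and its signature are made up for this example) and simply checks a fetched response for either form. It does simple comma-separated matching and ignores crawler-specific directives.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.directives.update(d.strip().lower() for d in content.split(","))


def is_noindex(headers: dict[str, str], html: str) -> bool:
    """Hypothetical helper: True if the response asks not to be indexed."""
    # Header form: X-Robots-Tag: noindex
    header_value = headers.get("X-Robots-Tag", "")
    header_directives = {d.strip().lower() for d in header_value.split(",")}

    # Meta tag form: <meta name="robots" content="noindex">
    parser = _RobotsMetaParser()
    parser.feed(html)

    return "noindex" in header_directives or "noindex" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex"></head></html>'
    print(is_noindex({}, page))
```

Note the interaction with robots.txt: a crawler has to be allowed to fetch the page in order to see a `noindex` directive at all, so disallowing the URL in robots.txt can keep the directive from ever being read.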
