zlacker

[return to "Google to explore alternatives to robots.txt"]
1. voytec+V3 2023-07-08 06:20:02
>>skille+(OP)
Seems like it's intended for content stealing from every place that doesn't immediately implement Google's New Web Order as an addition to robots.txt.

"Your do-not-enter sign uses a font we don't like, so we'll just ignore it"

2. Ferret+fh 2023-07-08 09:04:59
>>voytec+V3
To be clear, robots.txt is not legally binding, Google is not bound to follow it, and in fact I believe that Google doesn't follow it and hasn't for a very long time, for the simple reason that many sites' robots.txt file is wrong.

The intent of robots.txt is to help crawlers, for example, to keep crawlers from getting stuck in a recursive loop of dynamic pages, or from crawling pages with no value. robots.txt is not for banning, restricting, or hindering crawlers.
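For illustration, here is a minimal sketch of how a cooperative crawler might consult robots.txt before fetching a page, using Python's standard `urllib.robotparser`. The rules, the `ExampleBot` user-agent name, and the example.com URLs are all made up for this example; nothing here describes how Google's crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: steer crawlers away from an endless
# dynamically generated calendar and from low-value search pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /calendar/
Disallow: /search
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


PARSER = build_parser(ROBOTS_TXT)


def allowed(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the parsed rules permit this user agent to fetch url."""
    return PARSER.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/articles/1",
        "https://example.com/calendar/2023/07",
    ):
        print(url, "->", "fetch" if allowed(url) else "skip")
```

Note that the check is entirely voluntary: the crawler has to call `allowed()` itself and choose to honor the answer, which is the point being made about robots.txt not being an enforcement mechanism.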

3. lisasa+G31 2023-07-08 16:05:45
>>Ferret+fh
> for the simple reason that many sites' robots.txt file is wrong.

Which is of course not the real reason.

The reason Google doesn't follow the robots.txt protocol is that (1) they don't want to and (2) they can get away with it.
