zlacker

1. varenc+(OP)[view] [source] 2023-07-08 06:50:33
robots.txt is already outmoded. It can only indicate that content can’t be crawled, but a URL marked this way can still be indexed. As Google says, “it is not a mechanism for keeping a web page out of Google.” [0] You need to use something other than robots.txt to prevent indexing.

[0] https://developers.google.com/search/docs/crawling-indexing/...
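To illustrate the crawl-versus-index distinction, here is a minimal Python sketch. The robots.txt content, the "ExampleBot" user agent, the example.com URLs, and the helper names are all made up for illustration. It uses the standard library's urllib.robotparser to show that robots.txt only answers "may I fetch this?", and then shows the two common signals that do address indexing: the X-Robots-Tag response header and the robots meta tag. Note that a crawler has to be allowed to fetch the page in order to see either signal.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: asks all crawlers not to fetch anything under /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt permits user_agent to fetch url.

    This is only a crawl decision. It says nothing about whether the URL
    may appear in a search index (for example, via links from other sites).
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def noindex_headers() -> dict:
    """Response headers a server could send to ask search engines not to index a page."""
    return {"X-Robots-Tag": "noindex"}


# Equivalent in-page signal, placed inside the HTML <head>.
NOINDEX_META = '<meta name="robots" content="noindex">'

if __name__ == "__main__":
    print(can_crawl(ROBOTS_TXT, "ExampleBot", "https://example.com/private/page"))
    print(can_crawl(ROBOTS_TXT, "ExampleBot", "https://example.com/public/page"))
    print(noindex_headers())
```

In this sketch the /private/ URL is blocked from crawling, yet nothing in robots.txt tells a search engine to keep it out of the index. The noindex signals do that, but only on pages the crawler is permitted to fetch.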

replies(1): >>dazc+K2
2. dazc+K2[view] [source] 2023-07-08 07:25:16
>>varenc+(OP)
Indeed, having pages indexed which can't then be crawled is a great way of shooting yourself in the foot.
replies(1): >>floomk+w21
3. floomk+w21[view] [source] [discussion] 2023-07-08 16:32:31
>>dazc+K2
I think you meant it's a great way for Google to punish you for not giving them full access.