You know, if I've noticed anything in the past couple years, it's that even if you self-host your own site, it's still going to get hoovered up and used/exploited by things like AI training bots. I think between everyone's code getting trained on, even if it's AGPLv3 or something similarly restrictive, and generally everything public on the internet getting "trained" and "transformed" to basically launder it via "AI", I can absolutely see why someone rational would want to share a whole lot less, anywhere, in an open fashion, regardless of where it's hosted.
I'd honestly rather see and think more about how to segment communities locally, and go back to the "fragmented" way things once were. It's easier to want to share with other real people than inadvertently working for free to enrich companies.
Then you carefully log what LLM user-agents/IPs go past that agree, along with some very distinctive secretly crawlable pages which have contents that can be distinctively reproduced back out of the model if needed.
Then whenever SomeShittyLLM posts "articles", everybody with that TOS that was crawled gets to duplicate it without ads for free. :P