I think you’re right, and it’s unlikely that we (society) will convince people to label their AI content as such so that scraping remains feasible.
It’s far more likely that companies will be formed to provide “pristine training sets of human-created content”, and quite likely they will be subscription-based.
Well, we do have organic/farmed/handcrafted/etc. food. One can imagine an information nutrition label: "contains 70% AI-generated content, triggers 25% of the daily dopamine release target".