I think you’re right, and it’s unlikely that we (society) will convince people to label their AI content as such, which is what it would take for scraping to remain feasible.
It’s far more likely that companies will be formed to provide “pristine training sets of human-created content”, and quite likely they will be subscription-based.