
I think this summarizes it pretty well. Even if you don't mind the garbage, future AI will feed on it, turning both AI output and human thinking into gray goo.

https://ploum.net/2022-12-05-drowning-in-ai-generated-garbag...

https://en.wikipedia.org/wiki/Gray_goo

replies(1): >>nickpp+4b

>>Mistle+(OP)
Is this a real problem model trainers actually face or is it an imagined one? The Internet is already full of garbage - 90% of the unpleasantness of browsing these days is filtering through mounds and mounds of crap. Some is generated, some is written, but it's all still crap, full of errors and lies.

I would've imagined training sets were heavily curated and annotated. We already know how to solve this problem for training humans (or our kids would never learn anything useful) so I imagine we could solve it similarly for AIs.

In the end, if it's quality content, learning from it is beneficial - no matter who produced it. Garbage needs to be eliminated, and the distinction has to be made either by human trainers or by already-trained AIs. I have no idea how to train the latter, but then I am no expert in this field - just like (I suspect) the author of that blog.