
1. giantg+19 2025-12-11 16:40:23
>>markat+(OP)
This raises an interesting point. Do you need to train models on CSAM in order for them to enforce restrictions against it? If so, I wonder what moral and ethical questions that brings up.
2. jshear+1a 2025-12-11 16:44:54
>>giantg+19
It's a delicate subject but not an unprecedented one. Automatic detection of already-known CSAM images (as opposed to heuristic detection of unknown images) long predates modern AI, and for that service to exist someone has to handle the actual CSAM before it's reduced to a perceptual hash in a database.

Maybe AI-based heuristic detection is more ethically/legally fraught since you'd have to stockpile CSAM to train on, rather than hashing then destroying your copy immediately after obtaining it.
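
To make the distinction concrete, here's a rough sketch of the general hash-matching technique. This is a toy average hash, not PhotoDNA or any real system's algorithm; the database contents and distance threshold below are illustrative placeholders:

    from PIL import Image

    def average_hash(path: str, size: int = 8) -> int:
        # Shrink to size x size grayscale, threshold each pixel against
        # the mean, and pack the resulting bits into a single integer.
        img = Image.open(path).convert("L").resize((size, size))
        pixels = list(img.getdata())
        mean = sum(pixels) / len(pixels)
        bits = 0
        for p in pixels:
            bits = (bits << 1) | (p > mean)
        return bits

    def hamming(a: int, b: int) -> int:
        # Number of differing bits between two hashes.
        return bin(a ^ b).count("1")

    # Hypothetical database: only the hashes are retained, never the images.
    KNOWN_BAD_HASHES = {0x8F3A1C00FFD24B17}  # placeholder value

    def matches_known(path: str, max_distance: int = 5) -> bool:
        h = average_hash(path)
        return any(hamming(h, k) <= max_distance for k in KNOWN_BAD_HASHES)

The asymmetry is the whole point: the matcher only ever needs the short hashes at runtime, while a learned classifier would need the original images on hand for every retraining run.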

3. tcfhgj+Ag 2025-12-11 17:14:16
>>jshear+1a
> Maybe AI detection is more ethically fraught since you'd need to keep hold of the CSAM until the next training run,

Why?

The damage is already done.

4. giantg+BO2 2025-12-12 13:24:18
>>tcfhgj+Ag
I would think a giant stockpile creates a greater risk of the material leaking or being abused. And those training sets would undoubtedly be commercialized in some way, which some might see as adding insult to injury.