We’ve filed a lawsuit challenging Stable Diffusion

>>zacwes+(OP)
“Stable Diffusion contains unauthorized copies of millions—and possibly billions—of copyrighted images.”

That’s going to be hard to argue. Where are the copies?

“Having copied the five billion images—without the consent of the original artists—Stable Diffusion relies on a mathematical process called diffusion to store compressed copies of these training images, which in turn are recombined to derive other images. It is, in short, a 21st-century collage tool.”

“Diffusion is a way for an AI program to figure out how to reconstruct a copy of the training data through denoising. Because this is so, in copyright terms it’s no different from an MP3 or JPEG—a way of storing a compressed copy of certain digital data.”

The examples of diffusion training (e.g., reconstructing a picture out of noise) will be core to their argument in court. Certainly, during training the goal is to reconstruct original images out of noise. But do they exist in SD as copies? I don't know.

>>dr_dsh+12
By the same argument, you could claim that as long as you use lossy compression you cannot infringe copyright.

>>synu+H4
That's a huge understatement. 5 billion images squeezed into a 5 GB model works out to roughly 1 byte per image. Let's see whether one byte per image would constitute a copyright violation in fields other than neural networks.
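For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. It uses only the round figures quoted in this comment (5 billion images, a 5 GB model), which are not verified measurements, and it assumes decimal gigabytes. The 512x512 RGB image used for comparison is a hypothetical size chosen purely for illustration.

```python
# Rough figures as quoted in the comment; not verified measurements.
num_images = 5_000_000_000
model_size_bytes = 5 * 10**9  # "5 GB", assuming decimal gigabytes

# Average model capacity available per training image.
bytes_per_image = model_size_bytes / num_images
print(f"{bytes_per_image:.2f} bytes per image")

# Hypothetical comparison: one uncompressed 512x512 RGB image (assumed size).
raw_bytes = 512 * 512 * 3
print(f"Raw image is {raw_bytes / bytes_per_image:,.0f}x larger than its share of the model")
```

This is only an average: it says nothing about whether particular images are represented more faithfully than others.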

>>visarg+h6
It will be interesting to see how they legally define the moment where compression stops being compression and starts being an original work.

If I train on one image I can get it right back out. Even two, maybe even a thousand? I'm not sure where the line between OK and not OK would be, but there will have to be some answer.
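As a toy illustration of that intuition (not how Stable Diffusion works), consider a "model" with a fixed budget of 16 numbers, fit by least squares to 16-pixel images. Everything here is an assumption made for the sketch: the function names, the image size, and the random data are all hypothetical. The least-squares fit of a single vector to a set of vectors is their per-pixel mean, so that is what the "training" computes.

```python
import random

def fit_constant_model(images):
    """Least-squares fit of one pixel vector to all images: the per-pixel mean."""
    n = len(images)
    return [sum(pixel_column) / n for pixel_column in zip(*images)]

def max_abs_error(a, b):
    """Largest per-pixel difference between two images."""
    return max(abs(x - y) for x, y in zip(a, b))

rng = random.Random(0)  # fixed seed so the run is repeatable

def random_image(pixels=16):
    """Hypothetical 16-pixel grayscale image with random values."""
    return [rng.randint(0, 255) for _ in range(pixels)]

# Trained on a single image, the model reproduces it exactly.
one = [random_image()]
err_one = max_abs_error(fit_constant_model(one), one[0])

# Trained on 1000 images with the same capacity, it only keeps their average.
many = [random_image() for _ in range(1000)]
err_many = max_abs_error(fit_constant_model(many), many[0])

print("error with 1 training image:", err_one)
print("error with 1000 training images:", err_many)
```

With one image the reconstruction error is zero; with a thousand, the same capacity can no longer hold any individual image. Where a real model sits between those extremes is exactly the open question.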

>>synu+x6
There only needs to be an answer if it's determined that some number isn't copyright infringement. The easy answer would be to say that the process is what prevents the works from being transformative (and thus copyrightable), not the size of the training set.
