Thread: "We’ve filed a lawsuit challenging Stable Diffusion"
1. dr_dsh+12 2023-01-14 07:17:25
>>zacwes+(OP)
“Stable Diffusion contains unauthorized copies of millions—and possibly billions—of copyrighted images.”

That’s going to be hard to argue. Where are the copies?

“Having copied the five billion images—without the consent of the original artists—Stable Diffusion relies on a mathematical process called diffusion to store compressed copies of these training images, which in turn are recombined to derive other images. It is, in short, a 21st-century collage tool.”

“Diffusion is a way for an AI program to figure out how to reconstruct a copy of the training data through denoising. Because this is so, in copyright terms it’s no different from an MP3 or JPEG—a way of storing a compressed copy of certain digital data.”

The examples of diffusion training (e.g., reconstructing a picture out of noise) will be core to their argument in court. Certainly, during training the goal is to reconstruct the original images out of noise. But do they exist in SD as copies? I don't know.
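
To make the "reconstruct out of noise" step concrete, here is a minimal sketch of the standard DDPM-style noising equation, assuming a toy 4-pixel "image" and a hand-picked noise level. The names `add_noise`, `reconstruct`, and `alpha_bar` are illustrative, not taken from Stable Diffusion's code, and the real model works on latents rather than raw pixels. The key point: the original is only recovered exactly if the noise prediction is perfect, and here the true noise is passed in as an oracle. A trained network only approximates it.

```python
import math
import random

def add_noise(x0, alpha_bar, rng):
    """Forward step: mix the clean signal with Gaussian noise.
    x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * p + math.sqrt(1.0 - alpha_bar) * e
          for p, e in zip(x0, eps)]
    return xt, eps

def reconstruct(xt, eps_pred, alpha_bar):
    """Invert the forward step given a prediction of the noise."""
    return [(x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
            for x, e in zip(xt, eps_pred)]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
x0 = [0.1, 0.5, 0.9, 0.3]       # hypothetical toy "image"
alpha_bar = 0.5                 # hypothetical noise level

xt, eps = add_noise(x0, alpha_bar, rng)

# Oracle case: we hand back the true noise, so x0 is recovered
# up to floating-point error.
x0_hat = reconstruct(xt, eps, alpha_bar)

# Training would instead minimize the squared error between a
# network's predicted noise and eps; that network is not shown here.
```

Whether the learned noise predictor amounts to a stored copy of any particular image is exactly the open question; this sketch only shows what the training objective is aiming at.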

◧◩
2. yazadd+X3 2023-01-14 07:43:18
>>dr_dsh+12
> That’s going to be hard to argue. Where are the copies?

In fairness, diffusion is arguably a very complex form of entropy coding, similar to arithmetic or Huffman coding.

Given that copyright is protectable even on compressed/encrypted files, it seems fair that the “container of compressed bytes” (in this case the Diffusion model) does “contain” the original images no differently than a compressed folder of images contains the original images.
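
For reference, this is what "contains the original" means for an ordinary lossless codec. The sketch uses Python's `zlib` (DEFLATE, which combines LZ77 with Huffman coding) on some made-up stand-in bytes, not real image data. The round trip is bit-exact, which is the property the analogy to a diffusion model would need to hold, at least approximately, for a given image.

```python
import zlib

# Stand-in payload; a hypothetical placeholder for an image file's bytes.
original = bytes(range(256)) * 64

compressed = zlib.compress(original)    # entropy-coded container
restored = zlib.decompress(compressed)  # decoding needs no extra input

# Lossless codecs guarantee an exact round trip.
is_exact = restored == original
print(len(original), len(compressed), is_exact)
```

Note that the decoder here needs nothing but the compressed bytes, whereas a diffusion model also needs a prompt and a noise seed, which is one place the analogy could be contested.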

A lawyer/researcher would likely win this case if they re-create 90%ish of a single input image from the diffusion model with text input.

◧◩◪
3. anothe+96 2023-01-14 08:08:50
>>yazadd+X3
Great. Now the defence shows an artist who can recreate an image. Cool, now people who look at images get copyright suits filed against them for encoding those images in their heads.
◧◩◪◨
4. smusam+R11 2023-01-14 17:12:42
>>anothe+96
I don't think Stable Diffusion can reproduce any single image it's trained on, no matter what prompts you use.

It does have the Mona Lisa because of overfitting, but that's because there are so many copies of the Mona Lisa on the internet.

The artists taking part in the suit won't be able to recreate any of their work.

◧◩◪◨⬒
5. neuah+I63 2023-01-15 14:57:33
>>smusam+R11
Does SD have to recreate the entire image for it to violate copyright?

As a thought experiment, imagine a variant of something like SD used for music generation rather than images. It is trained on all the music on Spotify and marketed as a paid tool for producers and artists. If the model reproduces specific sounds from certain songs, e.g. a specific beat, hook, or melody, it would seem pretty straightforward that the generated content is derivative, even though only one feature of it was precisely reproduced. I could be wrong, but as far as I am aware you need to get permission to use samples. Even if the content is not published, those sounds are being sold by the company as inspiration, and therefore that should violate copyright. The training data is paramount, because if you trained the model on material you generated yourself or on material with an appropriate CC license, the resulting work would not violate copyright, or you could at least argue independent creation.

In the feature space of images and art, SD is doing something very similar, so I can see the argument that it violates copyright even without reproducing the whole training data.

Overall, I think we will ultimately need to decide how we want these technologies used, what restrictions should apply to the training data, etc., and then create new laws specifically for the new technology, rather than trying to shoehorn it into existing copyright law.

◧◩◪◨⬒⬓
6. smusam+B95 2023-01-16 07:00:31
>>neuah+I63
Do you know that the final trained model is only 2 GB? There is no way it can reproduce anything verbatim. There is also Riffusion, which can generate music after being trained on FFTs of music.
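
A rough back-of-the-envelope version of that size argument, using the 2 GB figure above and the "five billion images" figure quoted from the complaint earlier in the thread (both taken at face value; the exact checkpoint size and dataset size are assumptions here):

```python
# Average weight budget per training image, under the thread's figures.
MODEL_SIZE_BYTES = 2 * 10**9   # "only 2 GB" (assumed decimal gigabytes)
TRAINING_IMAGES = 5 * 10**9    # "five billion images" per the complaint

def bytes_per_image(model_bytes: int, n_images: int) -> float:
    """Average number of model bytes available per training image."""
    return model_bytes / n_images

budget = bytes_per_image(MODEL_SIZE_BYTES, TRAINING_IMAGES)
print(f"{budget:.2f} bytes per image")     # 0.40 bytes per image
print(f"{budget * 8:.1f} bits per image")  # 3.2 bits per image
```

This is only an average. It says the model cannot hold a distinct copy of every image, but it does not by itself rule out a small number of heavily duplicated images (like the Mona Lisa case mentioned upthread) being memorized.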