zlacker

[return to "We've filed a lawsuit challenging Stable Diffusion"]
1. dr_dsh+12 2023-01-14 07:17:25
>>zacwes+(OP)
“Stable Diffusion contains unauthorized copies of millions—and possibly billions—of copyrighted images.”

That’s going to be hard to argue. Where are the copies?

“Having copied the five billion images—without the consent of the original artists—Stable Diffusion relies on a mathematical process called diffusion to store compressed copies of these training images, which in turn are recombined to derive other images. It is, in short, a 21st-century collage tool.”

“Diffusion is a way for an AI program to figure out how to reconstruct a copy of the training data through denoising. Because this is so, in copyright terms it’s no different from an MP3 or JPEG—a way of storing a compressed copy of certain digital data.”

The examples of how diffusion training works (e.g., reconstructing a picture out of noise) will be core to their argument in court. Certainly the goal during training is to reconstruct the original images out of noise. But do they exist in SD as copies? I don't know.
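
For concreteness, here's a minimal sketch of the DDPM-style training objective (toy code; the names, shapes, and noise schedule are illustrative, not SD's actual implementation). The model is trained to predict the noise that was mixed into a clean image, which is equivalent to learning to recover that image:

    import torch

    # Toy DDPM-style training step. The model is asked to predict the
    # Gaussian noise that was mixed into a clean training image x0.
    def diffusion_training_step(model, x0, alphas_cumprod):
        b = x0.shape[0]
        t = torch.randint(0, len(alphas_cumprod), (b,))   # random timestep per image
        noise = torch.randn_like(x0)                      # fresh noise every step
        a = alphas_cumprod[t].view(b, 1, 1, 1)
        x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise      # partially noised image
        pred = model(x_t, t)                              # model predicts the noise
        return torch.nn.functional.mse_loss(pred, noise)  # denoising objective

Note the model only ever sees noised mixtures and its own prediction error; whether that amounts to “storing compressed copies” is exactly the open question.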

2. yazadd+X3 2023-01-14 07:43:18
>>dr_dsh+12
> That’s going to be hard to argue. Where are the copies?

In fairness, Diffusion is arguably a very complex entropy-coding scheme, similar to arithmetic or Huffman coding.

Given that copyright protection still applies to compressed or encrypted files, it seems fair to say the “container of compressed bytes” (in this case the Diffusion model) does “contain” the original images, no differently than a zipped folder of images contains the original images.

A lawyer or researcher would likely win this case if they could re-create ~90% of a single input image from the diffusion model using only a text prompt.
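
A crude version of that test, sketched below. The model id is real; the caption, file path, and pass threshold are hypothetical. Prompt the model with the caption of a known training image and measure structural similarity to the original:

    from diffusers import StableDiffusionPipeline
    from PIL import Image
    from skimage.metrics import structural_similarity as ssim
    import numpy as np

    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    gen = pipe("exact caption of a known training image").images[0].convert("L")
    orig = Image.open("original_training_image.png").convert("L").resize(gen.size)
    print(ssim(np.asarray(gen), np.asarray(orig)))  # near 1.0 would suggest a near-copy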

3. magnat+h5 2023-01-14 07:59:23
>>yazadd+X3
And how is that different from gzip or base64, which can re-create the original image when given the appropriate input?

4. yazadd+D9 2023-01-14 08:43:38
>>magnat+h5
That’s my point: Diffusion[1] does seem to be “just like” gzip or base64.

And it would be illegal for me to sell or distribute zipped copies of images without the copyright holder’s consent. Similarly, there might be an argument for why Diffusion[1] specifically can’t be built from copyrighted images.

[1] which is just one part of something like Stable Diffusion
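
For comparison, the lossless round-trip really is exact; a quick check in Python (file name hypothetical):

    import zlib

    data = open("image.png", "rb").read()
    # Lossless compression returns the original bytes, bit for bit.
    assert zlib.decompress(zlib.compress(data)) == data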

5. astran+ka 2023-01-14 08:51:21
>>yazadd+D9
A lossy compressor isn't just like a lossless compressor, especially not one that has ~2 bytes of weights for each input image.
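
The back-of-envelope, with approximate sizes (SD 1.x checkpoint ~4 GB of fp32 weights, trained on the ~2B-image LAION-2B-en subset):

    weights_bytes = 4e9                     # SD 1.x checkpoint, fp32, roughly
    training_images = 2e9                   # LAION-2B-en, roughly
    print(weights_bytes / training_images)  # => 2.0 bytes per training image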

6. synu+7b 2023-01-14 08:59:58
>>astran+ka
How many bytes make it an original work vs a compressed copy?

7. astran+Sd 2023-01-14 09:33:32
>>synu+7b
Usually judges would care more about where the bytes came from than how many of them there are.

Since SD is trained by gradient updates against several different images at the same time, it never copies any image bits straight into the weights. And since it's a latent-diffusion model, anything actually image-like is confined to the image encoder (the VAE), so any fractional bits would be in there if you want to look.
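
Toy illustration (the tiny linear model and batch are stand-ins, nothing like SD's architecture): one optimizer step applies a single gradient averaged over the whole batch, nudging the weights rather than writing any image's bytes into them.

    import torch

    model = torch.nn.Linear(16, 16)
    batch = torch.randn(8, 16)                    # stand-in for 8 encoded images
    loss = ((model(batch) - batch) ** 2).mean()   # one loss over the whole batch
    loss.backward()                               # one gradient, blending all 8 inputs
    torch.optim.SGD(model.parameters(), lr=1e-3).step()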

The text encoder (LAION's OpenCLIP) does have bits from elsewhere copied straight into it, to build its token list.

https://huggingface.co/stabilityai/stable-diffusion-2-1/raw/...
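
You can inspect that vocabulary directly; a sketch assuming the standard diffusers repo layout:

    from transformers import CLIPTokenizer

    tok = CLIPTokenizer.from_pretrained(
        "stabilityai/stable-diffusion-2-1", subfolder="tokenizer"
    )
    vocab = tok.get_vocab()       # token string -> id
    print(len(vocab))             # ~49k entries
    print(sorted(vocab)[:10])     # literal byte-pair strings copied from training text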

8. derang+961 2023-01-14 17:45:48
>>astran+Sd
“any fractional bits would be in there if you want to look.”

What do you mean by this in the context of generating images via a prompt? “Fractional bits” isn't a meaningful unit here, and the phrase is misleading if anything. Regardless, whether a model violates the criteria for fair use will always be judged by the outputs it generates rather than by its constituent bytes (which can be independent of those outputs).
