zlacker

We’ve filed a lawsuit challenging Stable Diffusion

submitted by zacwes+(OP) on 2023-01-14 06:51:41 | 208 points 459 comments
[view article] [source] [go to bottom]

NOTE: showing posts with links only [show all posts]
12. supriy+c3[view] [source] 2023-01-14 07:30:50
>>zacwes+(OP)
Sometimes I have to wonder about the hypocrisy you can see on HN threads. When it's software development, many here seem to understand the merits of a similar lawsuit against Copilot[1], but as soon as it's a different group such as artists, then it's "no, that's not how a NN works" or "the NN model works just the same way as a human would understand art and style."

[1] https://news.ycombinator.com/item?id=34274326

◧◩◪
49. ben_w+35[view] [source] [discussion] 2023-01-14 07:56:50
>>fruit2+r3
Computerphile has friendly introductions to just about everything: https://youtu.be/1CIpzeNxIhU
◧◩
51. mikewa+85[view] [source] [discussion] 2023-01-14 07:57:55
>>blulul+Q3
When you look around, you don't actually see most of the stuff that's right there... you've got a small area you can see, about the size of your outstretched thumb[1]; the rest of the world you think you're seeing is a projection in your mind, built as your eye darts from object to object.

Thus, when we see things, we have already built a relationship map of the parts of an image, not actual pixels. This makes it possible to observe the world and interact with it in real time referencing the pieces and the concepts we label them with, otherwise we'd have to stop and very carefully look around every single time we wanted to take a step.

These networks effectively do the same thing, taking in parts of images and their relationships. It's not uncommon for me to see what is clearly a distorted copy of a Getty Images trademark when I run Stable Diffusion locally. There's an artist who always puts his daughter Nina's name in his work... the network just thinks it's another style, and I suspect that's the same for the Getty thing.

One thing that is super cool is you can draw a horribly amateur sketch of something, and have Stable Diffusion turn it into something close to the starting drawing in outline, but far better in detail.

A sketch of a flower I did came out as tulips, roses, or poppies depending on the prompts used to process it, but it was generally in the same pose and scale.

[1] https://developer.tobii.com/xr/learn/eye-behavior/visual-ang...
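
For what it's worth, the sketch-to-image workflow described above maps onto the img2img mode of the model. Here is a minimal sketch using the Hugging Face diffusers library; the checkpoint id, file names, and parameter values are assumptions for illustration, not anything from the comment.

    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    # Load a Stable Diffusion checkpoint in image-to-image mode (assumed model id).
    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # The amateur sketch acts as the starting point; its composition is preserved.
    sketch = Image.open("flower_sketch.png").convert("RGB").resize((512, 512))

    # Lower strength keeps the outline/pose of the sketch; the prompt decides
    # whether it comes out as tulips, roses, or poppies.
    out = pipe(prompt="a detailed painting of tulips", image=sketch,
               strength=0.6, guidance_scale=7.5).images[0]
    out.save("flower_out.png")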

◧◩
55. fabian+e5[view] [source] [discussion] 2023-01-14 07:59:10
>>Traube+42
Lord Byron (father of Ada Lovelace) defended the Luddites quite succinctly back then and it is as applicable today: https://www.smithsonianmag.com/smart-news/byron-was-one-few-...

"But the police, however useless, were by no means idle: several notorious delinquents had been detected; men liable to conviction, on the clearest evidence, of the capital crime of poverty; men, who had been nefariously guilty of lawfully begetting several children, whom, thanks to the times!—they were unable to maintain. Considerable injury has been done to the proprietors of the improved frames. These machines were to them an advantage, inasmuch as they superseded the necessity of employing a number of workmen, who were left in consequence to starve. By the adoption of one species of frame in particular, one man performed the work of many, and the superfluous labourers were thrown out of employment."

..

"The rejected workmen, in the blindness of their ignorance, instead of rejoicing at these improvements in arts so beneficial to mankind, conceived themselves to be sacrificed to improvements in mechanism. In the foolishness of their hearts, they imagined that the maintenance and well doing of the industrious poor, were objects of greater consequence than the enrichment of a few individuals by any improvement in the implements of trade which threw the workmen out of employment, and rendered the labourer unworthy of his hire. And, it must be confessed, that although the adoption of the enlarged machinery, in that state of our commerce which the country once boasted, might have been beneficial to the master without being detrimental to the servant;"

..

"These men never destroyed their looms till they were become useless, worse than useless; till they were become actual impediments to their exertions in obtaining their daily bread. Can you then wonder, that in times like these, when bankruptcy, convicted fraud, and imputed felony, are found in a station not far beneath that of your Lordships, the lowest, though once most useful portion of the people, should forget their duty in their distresses, and become only less guilty than one of their representatives? But while the exalted offender can find means to baffle the law, new capital punishments must be devised, new snares of death must be spread, for the wretched mechanic who is famished into guilt. These men were willing to dig, but the spade was in other hands; they were not ashamed to beg, but there was none to relieve them. Their own means of subsistence were cut off; all other employments pre-occupied; and their excesses, however to be deplored and condemned, can hardly be the subject of surprise."

..

"The present measure will, indeed, pluck it from the sheath; yet had proper meetings been held in the earlier stages of these riots,—had the grievances of these men and their masters (for they also have had their grievances) been fairly weighed and justly examined, I do think that means might have been devised to restore these workmen to their avocations, and tranquillity to the country."

◧◩◪
71. rivers+T5[view] [source] [discussion] 2023-01-14 08:06:53
>>TheMid+w4
> Are image generators giving exact (or very similar) copies of existing works?

um, yes.[1][2] What else would they be trained on?

According to the model card:

[1] https://github.com/CompVis/stable-diffusion/blob/main/Stable...

it was trained on this data set (which has hyperlinks to images, so feel free to peruse):

[2] https://huggingface.co/datasets/laion/laion2B-en

◧◩◪◨⬒⬓
73. WA+W5[view] [source] [discussion] 2023-01-14 08:07:05
>>andyba+I4
Because in some cases, adding a style prompt gives almost the original image: https://www.reddit.com/r/StableDiffusion/comments/wby0ob/it_...
◧◩
74. kristo+Z5[view] [source] [discussion] 2023-01-14 08:07:36
>>kstene+s4
The sampling problem is still real. There's a difference between, say, cutting up the Amen break into 40 parts and painstakingly re-orchestrating it, and taking the strings from Led Zeppelin's Kashmir, playing them in a loop for 4:55, and rapping over it (see https://en.m.wikipedia.org/wiki/Smoke_Some_Kill). To be hyper-specific, that's why, say, Tyree's Acid Overture, which samples the same song and was released the same year and recorded in the same city, didn't see any pushback (https://m.youtube.com/watch?v=vJeFIBhZTBE) - it's used more like a paintbrush in that one.

There's very much an arbitrary judgment call here. Bob James probably had a case with Run-DMC's Peter Piper (which is built on his Take Me to the Mardi Gras), but he's always been really chill about it. They lucked out.

Same thing here. There's this amorphous "creative effort" that creates an abstract distance between the works. Unless the engineers can show an effort to respect and police this distance, I think things might get dicey.

◧◩
84. mbgerr+i6[view] [source] [discussion] 2023-01-14 08:09:53
>>TheMid+94
https://www.wired.com/2011/01/hope-image-flap/amp
◧◩◪◨
99. hobofa+U6[view] [source] [discussion] 2023-01-14 08:16:21
>>rep_mo+A6
There is a somewhat popular lawsuit right now which argues exactly that (and is reportedly being appealed to the next instance): https://petapixel.com/2022/12/08/photographer-loses-plagaris...
◧◩
117. Ludwig+I7[view] [source] [discussion] 2023-01-14 08:23:46
>>myrryr+z2
I just took the first few sentences from Wiki that describe Mona Lisa [1], pasted it on HuggingFace [2] and got pretty similar results [3].

[1] The Mona Lisa (/ˌmoʊnə ˈliːsə/ MOH-nə LEE-sə; Italian: Gioconda [dʒoˈkonda] or Monna Lisa [ˈmɔnna ˈliːza]; French: Joconde [ʒɔkɔ̃d]) is a half-length portrait painting by Italian artist Leonardo da Vinci. Considered an archetypal masterpiece of the Italian Renaissance,[4][5] it has been described as "the best known, the most visited, the most written about, the most sung about, the most parodied work of art in the world".

[2] https://huggingface.co/spaces/stabilityai/stable-diffusion

[3] https://imgur.com/a/L2LDOS4

EDIT: With the Starry Night it worked even better. But it failed to reproduce the Bathing of a Red Horse (that one doesn't have a page on English wiki, so I took the description from elsewhere).
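
For anyone who wants to repeat the experiment locally rather than through the Space at [2], this is roughly the equivalent call with the diffusers library; the checkpoint id here is an assumption, and the prompt is just the Wikipedia excerpt from [1].

    import torch
    from diffusers import StableDiffusionPipeline

    # Plain text-to-image generation with an assumed Stable Diffusion checkpoint.
    pipe = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
    ).to("cuda")

    prompt = ("The Mona Lisa is a half-length portrait painting by Italian "
              "artist Leonardo da Vinci, an archetypal masterpiece of the "
              "Italian Renaissance.")
    image = pipe(prompt, guidance_scale=7.5, num_inference_steps=50).images[0]
    image.save("mona_lisa_from_description.png")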

◧◩◪◨
121. TheMid+S7[view] [source] [discussion] 2023-01-14 08:25:19
>>limite+95
Yeah what's considered a copy or not is a grey area. Here's a good example of that: https://news.ycombinator.com/item?id=34378300

But artists have been making "in the style of" works for probably millennia. Fan art is a common example.

I suppose the advent of software that makes it easy to make "in the style of" works will force us to get much more clear on what is and isn't a copy. How exciting.

However, I don't see how the software tool is directly at fault, just the person using it.

◧◩◪
130. visarg+n8[view] [source] [discussion] 2023-01-14 08:30:29
>>lolind+i7
Funny thing - the forum works like a language model. It doesn't have one set personality, but it can generate from a distribution of people. The language model can generate from a distribution of prompts, which might be persona descriptions.

> Out of One, Many: Using Language Models to Simulate Human Samples

https://arxiv.org/abs/2209.06899

◧◩◪◨
137. TheDon+M8[view] [source] [discussion] 2023-01-14 08:35:09
>>hgomer+78
I think this touches on the core mismatch between the legal perspective and technical perspective.

Yes, on a technical level, those chunks are random data. On the legal side, however, those chunks are illegal copyright infringement because that is their intent, and there is a process that allows the intent to happen.

I can't really say it better than this post does, so I highly recommend reading it: https://ansuz.sooke.bc.ca/entry/23

◧◩◪◨
146. IncRnd+n9[view] [source] [discussion] 2023-01-14 08:40:49
>>limite+f4
"Copyright currently protects poetry just like it protects any other kind of writing or work of authorship. Poetry, therefore, is subject to the same minimal standards for originality that are used for other written works, and the same tests determine whether copyright infringement has occurred." [1]

[1] https://scholarship.law.vanderbilt.edu/vlr/vol58/iss3/13/

◧◩◪◨⬒⬓⬔⧯
161. IncRnd+ia[view] [source] [discussion] 2023-01-14 08:51:13
>>realus+G9
There is no need for rhetorical games. The actual issue is that Stable Diffusion does create derivatives of copyrighted works. In some cases the produced images contain pixel level details from the originals. [1]

[1] https://arxiv.org/pdf/2212.03860.pdf

◧◩◪◨⬒⬓⬔
164. astran+sa[view] [source] [discussion] 2023-01-14 08:52:36
>>dylan6+aa
That is not the point of using the training data. It's specifically trained to not do that.

See https://openai.com/blog/dall-e-2-pre-training-mitigations/ "Preventing Image Regurgitation".

◧◩◪◨⬒⬓⬔⧯▣
169. realus+Ha[view] [source] [discussion] 2023-01-14 08:55:14
>>IncRnd+ia
> The actual issue is that Stable Diffusion does create derivatives of copyrighted works.

Nothing points to that; in fact, even on this website they had to lie about how Stable Diffusion actually works, which is maybe a sign that their argument isn't really solid enough.

> [1] https://arxiv.org/pdf/2212.03860.pdf

You realize those are considered defects of the model, right? Sure, this model isn't perfect and will be improved.

199. cateye+4d[view] [source] 2023-01-14 09:23:30
>>zacwes+(OP)
https://donotpay.com/ should be involved in this case :)
◧◩◪
201. visarg+Ad[view] [source] [discussion] 2023-01-14 09:29:16
>>headso+J3
> It cheapens the artform.

Everyone makes their own artistic judgements; nobody's ideas are better. If people prefer this https://lexica.art/ (scroll down) then that's their right.

◧◩◪◨⬒⬓⬔
203. astran+Sd[view] [source] [discussion] 2023-01-14 09:33:32
>>synu+7b
Usually judges would care more about where the bytes came from than how many of them there are.

Since SD is trained by gradient updates against several different images at the same time, it of course never copies any image bits straight into the model. Since it's a latent-diffusion model, actual "image"ness is limited to the image encoder (VAE), so any fractional bits would be in there if you want to look.

The text encoder (LAION OpenCLIP) does have bits from elsewhere copied straight into it to build the tokens list.

https://huggingface.co/stabilityai/stable-diffusion-2-1/raw/...
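
As a rough illustration of that split, the released pipeline really is a bundle of separate components; the VAE is the only part that deals in pixels, and the tokenizer/text encoder carries the copied token list. This is a sketch with an assumed checkpoint id, just printing what the pieces are.

    from diffusers import StableDiffusionPipeline

    # Load the pipeline and inspect its components (assumed checkpoint id).
    pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")

    print(type(pipe.vae))           # AutoencoderKL: maps between pixels and latents
    print(type(pipe.unet))          # UNet2DConditionModel: denoises in latent space
    print(type(pipe.text_encoder))  # CLIP text model (OpenCLIP-trained for SD 2.x)
    print(len(pipe.tokenizer))      # size of the token vocabulary mentioned above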

223. kklemo+Sl[view] [source] 2023-01-14 11:06:50
>>zacwes+(OP)
https://en.wikipedia.org/wiki/Silesian_weavers%27_uprising
◧◩◪◨
225. TheDon+pm[view] [source] [discussion] 2023-01-14 11:11:27
>>Last5D+ph
The law doesn't care about technical tricks. It cares about how you got the bytes and what humans think of them.

Sure, the windows 11 codebase is in pi somewhere if you go far enough. Sure, pi is a non-copyrightable fact of nature. That doesn't mean the windows codebase is _actually_ in pi legally, just that it technically is.

The law does not care about weird gotchas like you describe.

I recommended reading this to a sibling comment, and I'll recommend it to you too: https://ansuz.sooke.bc.ca/entry/23

Yes, copyright law has obviously irrational results if you start trying to look at it only from a technical "but information is just 1s and 0s, you can't copyright 1s and 0s" perspective. The law does not care.

Which is why we have to think about the high level legal process that stable diffusion does, not so much the actual small technical details like "can you recover images from the neural net" or such.

◧◩
226. madaxe+Im[view] [source] [discussion] 2023-01-14 11:14:34
>>TheMid+94
Well, there was a copyright case in Europe recently where an artist had taken a photograph, flipped it horizontally, and painted it.

It was deemed an original work by the court.

I can’t see how, with such a precedent, they could rule that SD doesn’t produce original works.

https://www.rangefinderonline.com/news-features/industry-new...

235. Kim_Br+rt[view] [source] 2023-01-14 12:25:07
>>zacwes+(OP)
I keep getting reminded of the Tom Scott near-future video on how lawyers end up ruining the singularity. [1]

As the line between artificial neural nets and natural neural nets continues to blur, surely the same rules should apply to both?

Of course the counter-argument "some NNs are somehow different than others based on color [2]" walks you straight into an ongoing ethical minefield in the social sciences and biology. It's certainly going to be interesting times.

[1] https://www.youtube.com/watch?v=IFe9wiDfb0E Tom Scott, how lawyers ruined the singularity

[2] by analogy to https://ansuz.sooke.bc.ca/entry/23 what color are your bits

◧◩◪
239. wumms+bx[view] [source] [discussion] 2023-01-14 12:58:56
>>profes+A8
Seconded - you might even say a 'computer': https://en.wikipedia.org/wiki/Computer_(occupation)
272. wskish+2O[view] [source] 2023-01-14 15:29:11
>>zacwes+(OP)
We should see a lot of "transformative fair use" defenses for all generative model modalities.

https://www.nolo.com/legal-encyclopedia/fair-use-what-transf...

It's a fun twist that many even have "transformer" in the name.

◧◩
276. cloudk+nQ[view] [source] [discussion] 2023-01-14 15:46:48
>>chrisc+y2
They filed a separate issue for Github Copilot https://githubcopilotlitigation.com/
◧◩◪
282. ghaff+FT[view] [source] [discussion] 2023-01-14 16:12:56
>>jrm4+hP
It depends. If names are different, character and plot details differ, etc., a book about students at a school for wizards battling great evil may be a not particularly imaginative rip-off and may even invite litigation if it's too close, but I'm guessing it wouldn't win in court. See also The Sword of Shannara and Tolkien. https://en.wikipedia.org/wiki/The_Sword_of_Shannara

Creators mimic styles and elements of others' works all the time. Unless an ML algorithm crosses some literal copying threshold, I fail to see it as doing anything substantially different from what people routinely do.

301. IAmGra+JX[view] [source] 2023-01-14 16:45:35
>>zacwes+(OP)
This just reeks of the Luddism of the 19th century all over again.

https://en.wikipedia.org/wiki/Luddite

Good luck stopping the inertia of progress.

◧◩◪◨⬒⬓⬔⧯
307. ghaff+hY[view] [source] [discussion] 2023-01-14 16:49:00
>>hhjink+ak
It depends to what degree it's literal copying. See e.g. the Obama "Hope" poster. [1] Though that case is muddied by the fact that the artist lied about the source of his inspiration. Had it in fact been an older photo of JFK in a similar pose, there probably wouldn't have been a controversy.

[1] https://en.wikipedia.org/wiki/Barack_Obama_%22Hope%22_poster

◧◩◪◨
335. zowie_+K51[view] [source] [discussion] 2023-01-14 17:43:03
>>huggin+d01
> when doing paintovers on copyrighted images (VERY common)

What are you talking about? I've been doing drawing and digital painting as a hobby for a long time and tracing is absolutely not "VERY common". I don't know anybody who has ever done this.

> fan art where they paint trademarked characters (also VERY common)

This is true in the sense that many artists do it (besides confusing trademark law and copyright law: the character designs are copyright-protected, trademarks protect brand names and logos). However, it is not fair use (as far as I'm aware at least, I'm not a lawyer). A rightholder can request for fanart to be removed and the artist would have to remove it. Rightsholders almost never do, because fanart doesn't hurt them.

There are also more examples of it reproducing copyright-protected images; I pulled the "bloodborne box art" prompt from this article: https://arxiv.org/pdf/2212.03860.pdf But I agree with you that reproducing images is very much not the intention of Stable Diffusion, and it's already very rare. The way I see it, the cases of Stable Diffusion reproducing images too closely are just a gotcha for establishing a court case.

◧◩◪
344. Imnimo+G71[view] [source] [discussion] 2023-01-14 17:55:06
>>grandm+H61
Yeah, and their next figure isn't any better. They show a latent space interpolation figure from DDPM, and they seem to think this is how Diffusion models produce a "collage" (as they describe the process). Of course, this figure has nothing to do with how image generation is actually performed. It's just an experiment for the purpose of the paper to demonstrate that the latent space is structured.

In fact, this only works because the source images are given as input to the forward process - thus, the details being interpolated are from the inputs not from the model. If you look at Appendix Figure 9 from the same paper (https://arxiv.org/pdf/2006.11239.pdf) it is clear what's going on. Only when you take a smaller number of diffusing (q) steps can you successfully interpolate. When you take a large number of diffusing steps (top row of figure 9), all of the information from the input images is lost, and the "interpolations" are now just novel samples.

It's very hard for me to find a reason to include Figure 8 but not Figure 9 in their lawsuit that isn't either a complete lack of understanding, or intentional deception.
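
For reference, the forward (q) process those interpolations rely on is the standard DDPM corruption step, which can be written as

    q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I\right),
    \qquad \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s

With the usual noise schedule, the coefficient sqrt(ᾱ_t) on the input image shrinks toward zero as t grows, so after many diffusing steps x_t is essentially pure noise and retains nothing of x_0 -- which is exactly why the "interpolations" in the top row of Figure 9 are just novel samples.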

◧◩◪◨⬒⬓⬔
345. willia+K71[view] [source] [discussion] 2023-01-14 17:55:33
>>IncRnd+V8
You've made some errors in reasoning.

First, there is a legal definition of a "derivative work" and there is an artistic notion of a "derivative work". If the two of us both draw a picture of the Statue of Liberty, artistically we have both derived the drawing from the original statue. However, neither drawing is legally considered a derivative work, either of the original sculpture or of the other drawing.

Let's think about a cartoonish caricature of Joe Biden. What "makes up" Joe Biden?

https://www.youtube.com/watch?v=QRu0lUxxVF4

To what extent are these "constituent parts" present in every image of Joe Biden? All of them? Is the latent space not something that is instead hidden in all images of Joe Biden? Can an image of Joe Biden be made by anyone that is not derived from these "high order" characteristics of what is recognizable as Joe Biden across a number of different renderings from disparate individuals?

◧◩◪◨⬒
363. huggin+9m1[view] [source] [discussion] 2023-01-14 19:16:13
>>zowie_+K51
>and tracing is absolutely not "VERY common"

Paintover does not have to mean actual 'tracing'; a LOT of artists use photos as direct references and paint over them in a separate layer, keeping the composition, poses, and colors very close to the original while still changing details and style enough to be transformative and count as a 'new work'.

Here are two examples of artist Sam Yang using two still frames from the TV show Squid Game and painting over those, the results of which he then sells as prints:

https://www.inprnt.com/gallery/samdoesarts/the-alleyway/ https://www.inprnt.com/gallery/samdoesarts/067/

That said, you could even get away with less transformation and still have it be considered original work; take Andy Warhol's 'Orange Marilyn' and 'Portrait of Mao', which are inked and flat-color changes over photographs.

◧◩◪◨⬒⬓
372. zowie_+fv1[view] [source] [discussion] 2023-01-14 20:09:50
>>huggin+9m1
First of all, those are only two works in a very large body of work by an artist who seems to work almost entirely from imagination, which already counters the claim that this is a very common way of working, since even this artist would almost never work like that. Secondly, putting a strange amount of effort into a comment on Hacker News, I actually looked up the source frame of one of these: https://youtu.be/K6hOvyz65jM?t=236 It's definitely based on the frame, but it's not a paint-over as you claim. I know this because there are too many mistakes with regards to proportion:

- Extending the slant roof in the background, it intersects with the left figure at around the height of the nose, but in the painting it intersects with the middle of her neck.

- Similarly the line of the fence on the left is at the height of her hairline, but in the painting it is at the height of the middle of the head, and also more slanted than in the frame.

- On the right side, the white part of the pillar is similarly too low compared to the figure.

- The pole in the background has a lot of things off with regards to size, thickness, or location too.

Essentially, everything is a bit off with regards to location, size and distance. It doesn't really make sense to paint over something and then still do everything differently from the base layer, so it was probably just drawn from reference the normal way -- probably having the picture on another screen and drawing it again from scratch, rather than directly painting over the frame.

I agree with regards to Warhol but that doesn't really establish it as very common amongst painters.

◧◩◪◨⬒⬓⬔
380. huggin+5G1[view] [source] [discussion] 2023-01-14 21:32:28
>>zowie_+fv1
>that seems to work almost entirely from imagination

I very much doubt that.

>Secondly, putting strangely much effort into a comment on Hacker News

Not sure what you are implying here, could you elaborate? The reason I know about these images is because they've been posted, alongside many other similar examples, in discussions regarding AI art.

>I know this because there are too many mistakes with regards to proportion:

Have you ever used programs like Photoshop, Krita et al.? You can start painting directly over a photo, and then easily transform the proportions of all components in the image, and since you draw them in layers, this can be done without them affecting each other.

Here they are, side by side:

https://imgur.com/a/tIbBkk2 https://imgur.com/a/K1fEPtu

I have no doubt that he started painting these over the reference photos, and then used the 'warp tool' in his painting program of choice to alter the proportions, a very common technique.

And this is PERFECTLY FINE; the resulting artwork is transformative enough to be considered a new work of art, which is true for practically every piece of art I've seen generated by Stable Diffusion. The only one I've seen that I'm doubtful about is the 'bloodborne box art' one, which is THE example that is always brought up as it is such an outlier.

◧◩◪◨⬒⬓⬔⧯
387. zowie_+eK1[view] [source] [discussion] 2023-01-14 22:08:25
>>huggin+5G1
> I very much doubt that.

You can see his actual workflow on his YouTube channel. He shows his painting process there, though not his sketching process, but I hope that you believe that people are able to draw from imagination at least.

https://www.youtube.com/watch?v=7_ZLBKj_UlY

> Note sure what you are implying here, could you elaborate?

I just meant I was probably putting too much effort into an online discussion.

> I have no doubt that he started painting these over the reference photos, and then used the 'warp tool' in his painting program of choice to alter the proportions, a very common technique.

It's simply not a common technique at all. I'm not sure why you're making these statements, because it feels like your knowledge of how illustrators work is extremely limited. I've heard of people photobashing -- which is when artists combine photo manipulation and digital painting to more easily produce realistic artworks. Opinions on it are mixed and many consider it cheating, but within the field of concept art it's common because it's quick and easy. However, there are huge numbers of people who can just draw and paint from sight or imagination. There are the hyperrealists who often act as a human photocopier, but artists who do stylized art of any kind are just people who can draw from imagination. I'm not sure why that's something you "very much doubt", to be quite honest. Just looking on YouTube for things like art timelapses, you can find huge numbers of people who draw entirely from imagination. Take Kim Jung Gi as a somewhat well known example. That guy was famous amongst illustrators for drawing complicated scenes directly in pen without any sketches. But there are really plenty of people who can do these things.

You seem to be under the impression that the average artist uses every shortcut available to get a good result, but that is simply not true. Most artists I know refuse to do anything like photobashing because they consider it cheating and because it isn't how they want to work, never mind directly drawing on top of things. Drawing from sight isn't uncommon as a way to study art, so in case you're wondering why Sam Yang would be able to reproduce the frame so closely, it's because that's how artists study painting.

> Have you ever used programs like Photoshop, Krita et al

Yes, very often. The thing is: Just because it's possible does not mean it actually happens.

◧◩◪◨
392. bsder+ER1[view] [source] [discussion] 2023-01-14 23:14:11
>>huggin+d01
> My assumption would be 'fair use'.

Why? That's not obvious to me at all.

These algorithms take the entire image and feed it into their maw to generate their neural network. That doesn't really sound like "fair use".

If these GPT systems were only doing scholarly work, there might be an argument. However, the moment the outputs are destined somewhere other than scholarly publications, that "fair use" also goes right out the window.

If these algorithms took a 1% chunk of the image, like a collage would, and fed it into their algorithm, they'd have a better argument for "fair use". But, then, you don't have crowdsourced labelling that you can harvest for your training set as the cut down image probably doesn't correspond to all the prompts that the large image does.

> Stable Diffusion does not create 1:1 copies of artwork it has been trained on

What people aren't getting is that what the output looks like doesn't matter. This is a "color of your bits" problem--intent matters.

This was covered when colorizing old black and white films: https://chart.copyrightdata.com/Colorization.html "The Office will register as derivative works those color versions that reveal a certain minimum amount of individual creative human authorship." (Edit: And note that they were colorizing public domain films to dodge the question of original copyright.)

The current algorithms ingest entire images with the intent to generate new images from them. There is no "extra thing" being injected by a human--there is a direct correspondence, and the same inputs always produce the same outputs. The output is deterministically derived from the input (input images/text prompt/any internal random number generators).

You don't get to claim a new copyright or fair use just because you bumped a red channel 1%. GPT is a bit more complicated than that, but not very different in spirit.

◧◩◪◨
423. rvz+Cy2[view] [source] [discussion] 2023-01-15 08:17:11
>>puppyd+t32
It is like someone breaking into your house, taking all of your furniture, works, and assets without your permission and then selling them back to you or to the highest bidder.

It seems almost everyone here in this thread is fine with such a grift on digital artists, but when it is Copilot or ChatGPT the tune changes: two years ago it was 'Hardly going to compete against developers', with ChatGPT it became 'Only juniors are affected, not us seniors', and with GPT-4 + Copilot it will be 'Please stop using AI code and sue GitHub now!'

Obviously this wasn't the case with Dance Diffusion (the music version of Stable Diffusion), which was trained on public domain music or with the permission of musicians. It is almost as if they knew that if they trained it on copyrighted music and released it as open source, Stability AI would be out of business before they could counter the lawsuit. [0]

It is indeed a grift, and the legal system will catch up with both Copilot and Stable Diffusion for using copyrighted content in the training sets of their AI models.

[0] https://techcrunch.com/2022/10/07/ai-music-generator-dance-d...

◧◩◪◨⬒⬓⬔
426. eurlei+SA2[view] [source] [discussion] 2023-01-15 08:46:31
>>ouid+rb2
>You can't rip something and compress it badly enough to not violate copyright when you sell it.

While I doubt that specific case has been tested in court, arguably you could. If you created glitch art (https://en.wikipedia.org/wiki/Glitch_art) via compression artifacts, and your work was sufficiently distinct from the original work, I think you would have a reasonable case for transformative use (https://en.wikipedia.org/wiki/Transformative_use).
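
As a toy sketch of what 'glitch art via compression artifacts' can mean in practice (file names here are placeholders, not from the comment): repeatedly re-encode an image at very low JPEG quality until block artifacts dominate the result.

    import io
    from PIL import Image

    img = Image.open("input.png").convert("RGB")
    # Each pass re-quantizes the image at quality 5, compounding block artifacts.
    for _ in range(30):
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=5)
        buf.seek(0)
        img = Image.open(buf).convert("RGB")
    img.save("glitched.jpg")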

◧◩◪◨⬒
435. rvz+qJ2[view] [source] [discussion] 2023-01-15 10:37:26
>>__rito+3S
Stability AI already drew this line with Dance Diffusion (like SD but for music) [0], which was trained on public domain music, and on copyrighted music only with permission from the musicians.

The fact that Stability is now creating an opt-out for artists, after lifting and training on copyrighted / watermarked art without permission and creating a paid SaaS solution out of it, shows that not only did they willfully trample on the copyright of artists, they have also left themselves with a weak 'fair use, transformative' argument, since the LAION-5B training set contains the copyrighted images from which the model can output verbatim / highly similar digital art.

The input from 'real experts like Bengio, LeCun' adds little to no value in the case, as digital art generated by a non-human is uncopyrightable and is public domain by default. [1] What sets the precedent is whether using copyrighted content in a training set without permission from the author, outputting verbatim or highly similar derived works, and commercializing that is fair use and not infringing.

If Stability drew this line for musicians, then that should be the line drawn for digital art and code, and Copilot, Midjourney, SD and DALL-E should all be trained on public domain content, content with permission from the author, or content under licenses that allow AI training.

So far, the 'real' grifters are Stability AI, OpenAI and Midjourney.

[0] https://techcrunch.com/2022/10/07/ai-music-generator-dance-d...

[1] https://www.copyright.gov/rulings-filings/review-board/docs/...

[go to top]