That’s going to be hard to argue. Where are the copies?
“Having copied the five billion images—without the consent of the original artists—Stable Diffusion relies on a mathematical process called diffusion to store compressed copies of these training images, which in turn are recombined to derive other images. It is, in short, a 21st-century collage tool.”
“Diffusion is a way for an AI program to figure out how to reconstruct a copy of the training data through denoising. Because this is so, in copyright terms it’s no different from an MP3 or JPEG—a way of storing a compressed copy of certain digital data.”
The examples of training diffusion (e.g., reconstructing a picture out of noise) will be core to their argument in court. Certainly during training the goal is to reconstruct original images out of noise. But do they exist in SD as copies? I don't know.
If you succeed, you will undo decades of technological progress.
On the one hand, using works without consent or attribution is bad.
On the other hand... This is exactly how humans train to become artists: by studying and remixing the art of others.
I think this has an almost 0 chance of success.
There is no bar you get to set except "better than the last attempt"
Just no, that's not how any of that works.
I guess that lie is convenient for legitimizing the lawsuit.
The conglomerates who already have a bunch of IP they can feed into those systems, and who can afford to purchase new works or create them through brute force (i.e., cheaply employed artists) using their already massive amounts of capital - these are the entities that will have complete and total control over the best versions of the tools that artists say will bring about their doom. You better fucking believe they'll have their own licensing system too.
Copyright is a prison built for artists by big business, successfully marketed to artists as being a home.
1. They win, and big companies train AI on portfolios and new artwork which they paid for, resulting in the same problem - artists will be paid even less than today. It also results in laws that are incredibly difficult to judge: has an artist created a work himself, or has he used copyrighted works to train his custom model?
2. They lose
Since the industrial revolution there has never been a technology that was "uninvented", so I'm going to guess that won't happen here either. The cat is out of the bag.
Reproducing parts of existing images in the dataset is called overfitting and is considered a failure of the model.
The law doesn't recognize a mathematical computer transformation as creating a new work with original copyright.
If you give me an image, and I encrypt it with a randomly generated password, and then don't write down the password anywhere, the resulting file will be indistinguishable from random noise. No one can possibly derive the original image from it. But it's still copyrighted by the original artist, as long as they can show "this started as my image, and a machine made a rote mathematical transformation to it", because machines making rote mathematical transformations cannot create new copyright.
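A minimal sketch of that scenario, assuming a one-time pad as the "randomly generated password" (the file name is just a placeholder):

    import os

    # Encrypt the work with a random pad that is never written down.
    image = open("artwork.png", "rb").read()
    pad = os.urandom(len(image))                      # the discarded key
    ciphertext = bytes(a ^ b for a, b in zip(image, pad))

    # Without `pad`, `ciphertext` is uniform random bytes. With it, the
    # original is recovered exactly -- the copy never stopped existing.
    assert bytes(a ^ b for a, b in zip(ciphertext, pad)) == image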
The argument for stable diffusion would be that even if you cannot point to any image, since only algorithmic changes happened to the inputs, without any human creativity, the output is a derived work which does not have its own unique copyright.
This is the fundamentally flawed and misguided argument that can literally be applied to any technological progress to curtail advancement.
Imagine if the medical tricorder (a device from Star Trek that does maybe 99% of what modern doctors do) is suddenly invented today. Doctors could use this argument to defend their livelihoods, but they lose sight of the fact that doctors don’t exist because society needs to employ them. They exist because we have a problem of people getting sick… if more sick people can be helped then great! That is an advancement for society because more lives are saved (as opposed to more doctors being employed), and then simply the standard for what doctors are expected to do is raised to a higher level, since someone is still expected to operate tricorders.
Similarly, artists exist for their output for society. If these AI models can truly fulfill the needs of society that artists currently output (that is debatable), then that simply raises the bar for what artists are expected to output. But it doesn’t change the fact that we only care about the output for society (which can never be truly harmed by advancements such as this because if someone can not outperform the AI then they are redundant), not the fact that artists exist.
Put another way, many current artists who fear this are simply doing the generative work of AI already… manually. The AI is democratizing art so that the lowest hanging fruit of art is now accessible to more people. The bar for art has now been raised so that the expected quality of newer work is to be much higher. Just like how after computer aided design was invented the quality of movie effects, digital game art, etc, all jumped. Progress means those doing current “levels” of art will need to add this tool to their repertoire to build more impressive things. Rent seeking and staying in place (from an artistic advancement point of view) is not the answer.
As someone else put it in a comment here, looking at other works of art and learning how to make art and creating new art from this influence is literally how humans have been doing it for eons. Everyone is standing on the shoulders of giants. This AI merely makes it explicit so I guess it brings out the rent seeking feeling since people must feel it’s now possible to quantify the amount their own work contributed to something. I guess if you don’t want anyone to be influenced by it—AI included—the traditional way is to not show it to anyone.
I think the lawsuits have a case if they can prove copyrighted images were taken and used directly in the commercial product.
The moral arguments about AI and its use and abuse are broad and difficult, because they rely on the intent of the end user rather than the developer of the tool to prove an argument has merit.
I personally don't like AI art, but I can't declare that it's the devil's tool and should be purged with fire. I only think it overall will be a net negative, like I feel with AutoTune. It cheapens the artform.
Instead of improving the world and creating better tools, they want to sue each other. I thought those times were over.
Or maybe it is just bias against Microsoft
If so, the correct analogy is not a collage but a musical scale - and yes, Beethoven used musical notes that Bach had used, but it was not exactly a copy.
It's like how humans learn. It contains chunks of copyrighted material. What about Copilot? Hackers don't respect artists. Yadda yadda. It's boring.
I don't know the answer, but after reading the same things over and over I don't know if I trust myself to even have a valid opinion about it.
In fairness, Diffusion is arguably a very complex entropy coding similar to Arithmetic/Huffman coding.
Given that copyright is protectable even on compressed/encrypted files, it seems fair that the “container of compressed bytes” (in this case the Diffusion model) does “contain” the original images no differently than a compressed folder of images contains the original images.
A lawyer/researcher would likely win this case if they re-create 90%ish of a single input image from the diffusion model with text input.
At most, I’d expect copyright legislation around training to slightly delay commercial mass-deployment. Given the huge socio-technical transition that is ahead of us, it’s probably a good thing to let people have a chance to form an opinion before opening the floodgates. Judging by our transition into ad-tech social media, I’m not exactly confident that we’ll end up in a good place, even if the tech itself has a lot of potential.
If the artist makes the image very similar to one of the reference photos, it may be a copyright violation. It doesn't matter if the artist used a pencil or software to create the new work.
Current AI image generation does, however, make it easy to unknowingly violate copyright. If it generates an image similar to something else out there you wouldn't know.
I don't know much about copyright law though, am I wrong?
I didn't like the Sonny Bono law. It exists. The questions about law need to be asked.
Or recordable tape, or VCRs, or art reprints.
And then there were the endless lawsuits over the "theft" from sampling and remixes.
The medium evolves. Some artists evolve with them, while others are left behind.
Do you have evidence that this is actually what the courts have decided with respect to NNs?
I understand people’s livelihoods are potentially at stake, but what a shame it would be if we find AGI, even consciousness but have to shut it down because of a copyright dispute.
I feel like this is the main distinction.
Oh, one image is enough to apply copyright as if it were a patent, to ban a process that makes original works most of the time?
The article authors say it works as a "collage tool", trying to minimise the composition and layout of the image as unimportant elements, while forgetting that SD is changing textures as well. So it's a collage minus textures and composition?
Is there anything left to complain about? Unless, by a stroke of luck, both layout and textures are very similar to a training image. But ensuring no close duplications are allowed should suffice.
Copyright should apply one by one, not in bulk. Each work they complain about should be judged on its own merits.
To quote "You can't put the genie back in the bottle"
And unless I start to see some evidence that this is theft - an identical piece in a collage being "created" by AI (impossible, I think) - then I have ZERO time to consider how they are being stolen from or to think about compensation ideas.
I say this with an arts degree - it sucks for struggling artists, but I only find it nominally sad that a bunch of people are going to be out of work making art for the McCorpos -
If you are one of those artists then i'm sorry if I sound flippant or cruel about this, but you can adapt or you will die. Not literally, of course.
But real artists know it's the message, not the medium - and they aren't in danger of dying off. They adapt.
The output of stable diffusion isn't possible without first examining millions of copyrighted images.
Then the suit looks a little more solid, because (as you pointed out) it isn't possible for the Stable Diffusion owner to know which of those copyrighted images had clauses that prevent Stable Diffusion training and similar usage.
The whole problem goes away once artists and photographers start using a license that explicitly removes any use of the work as training data for any automated training.
I suspect adding OpenAI as a defendant would make things a tad harder legally.
I think it is likely github will do the same with copilot.
Thus, when we see things, we have already built a relationship map of the parts of an image, not actual pixels. This makes it possible to observe the world and interact with it in real time referencing the pieces and the concepts we label them with, otherwise we'd have to stop and very carefully look around every single time we wanted to take a step.
These networks effectively do the same thing, taking in parts of images and their relationships. It's not uncommon for me to see what is clearly a distorted copy of a Getty Images trademark when I run Stable Diffusion locally. There's an artist who always puts his daughter Nina's name in his work... the network just thinks it's just another style, and I suspect that's the same for the Getty thing.
One thing that is super cool is you can draw a horribly amateur sketch of something, and have Stable Diffusion turn it into something close to the starting drawing in outline, but far better in detail.
A sketch of a flower I did came out as Tulips, Roses and Poppy depending on the prompts used to process it, but it was generally in the same pose and scale.
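For anyone curious, a sketch of that workflow using the diffusers library; the model ID, file names, prompt, and strength value are just illustrative choices:

    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    # img2img: start from a rough drawing instead of pure noise.
    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5")
    sketch = Image.open("flower_sketch.png").convert("RGB").resize((512, 512))

    # Lower `strength` keeps the output closer to the input drawing.
    result = pipe(prompt="a tulip, detailed botanical illustration",
                  image=sketch, strength=0.75).images[0]
    result.save("tulip.png")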
Producing more topsoil would be a huge win here on Earth and make colonizing other planets dramatically more feasible.
Don't underestimate worms!
"But the police, however useless, were by no means idle: several notorious delinquents had been detected; men liable to conviction, on the clearest evidence, of the capital crime of poverty; men, who had been nefariously guilty of lawfully begetting several children, whom, thanks to the times!—they were unable to maintain. Considerable injury has been done to the proprietors of the improved frames. These machines were to them an advantage, inasmuch as they superseded the necessity of employing a number of workmen, who were left in consequence to starve. By the adoption of one species of frame in particular, one man performed the work of many, and the superfluous labourers were thrown out of employment."
..
"The rejected workmen, in the blindness of their ignorance, instead of rejoicing at these improvements in arts so beneficial to mankind, conceived themselves to be sacrificed to improvements in mechanism. In the foolishness of their hearts, they imagined that the maintenance and well doing of the industrious poor, were objects of greater consequence than the enrichment of a few individuals by any improvement in the implements of trade which threw the workmen out of employment, and rendered the labourer unworthy of his hire. And, it must be confessed, that although the adoption of the enlarged machinery, in that state of our commerce which the country once boasted, might have been beneficial to the master without being detrimental to the servant;"
..
"These men never destroyed their looms till they were become useless, worse than useless; till they were become actual impediments to their exertions in obtaining their daily bread. Can you then wonder, that in times like these, when bankruptcy, convicted fraud, and imputed felony, are found in a station not far beneath that of your Lordships, the lowest, though once most useful portion of the people, should forget their duty in their distresses, and become only less guilty than one of their representatives? But while the exalted offender can find means to baffle the law, new capital punishments must be devised, new snares of death must be spread, for the wretched mechanic who is famished into guilt. These men were willing to dig, but the spade was in other hands; they were not ashamed to beg, but there was none to relieve them. Their own means of subsistence were cut off; all other employments pre-occupied; and their excesses, however to be deplored and condemned, can hardly be the subject of surprise."
..
"The present measure will, indeed, pluck it from the sheath; yet had proper meetings been held in the earlier stages of these riots,—had the grievances of these men and their masters (for they also have had their grievances) been fairly weighed and justly examined, I do think that means might have been devised to restore these workmen to their avocations, and tranquillity to the country."
I find this title somewhat unclear. Is it filed by or against DeviantArt?
You can't bring back the training images no matter how hard you try.
No, this is the only fundamentally correct way to view this. Before the existence of the printing press, we didn't need copyright law. Yet all that the printing press did was make transcribing books by hand faster.
Quantitative changes enabled by technology are qualitative changes. And not every form that a qualitative change takes is one that leaves the world better off than we found it.
Automatic machine language translation puts translators out of work and would not be possible without huge amounts of ostensibly unlicensed training data.
um, yes.[1][2] What else would they be trained on?
According to the model card [1], it was trained on this data set (which has hyperlinks to the images, so feel free to peruse):
[1] https://github.com/CompVis/stable-diffusion/blob/main/Stable...
There's very much an arbitrary judgment call here. Bob James probably had a case against Run-D.M.C.'s "Peter Piper" (which samples his "Take Me to the Mardi Gras"), but he's always been really chill about it. They lucked out.
Same thing here. There's this amorphous "creative effort" that creates an abstract distance between the works. Unless the engineers can show an effort to respect and police this distance, I think things might get dicey
You looked at my art, now I can use copyright against the copies in your brain.
The fact that the derivation involves millions of works as opposed to a single one is immaterial for the copyright issue.
I don't know how the law on copyright relates to AI. I can't imagine all of the arguments, and I don't know the case law. I may have some gut feeling, I may want a certain outcome, but I am a nobody.
I think in a common-law system a court room and a test case is needed to clarify this law.
I am sad that the system for getting this into court is so expensive. I am sad that people involved are being threatened with damages, I don't think that is necessary.
Show me one worm — or even a hamster — that can drive a car as well as Tesla's Autopilot.
Show me one single non-human animal of any species that can write functioning Python code or CSS from a natural language description of what it should achieve, even if it only gets to GPT-2 level let alone v3 as used in Copilot or v3.5 as used in ChatGPT.
What we have yet to do with a worm is literally upload its brain, but IIRC that's more because nobody with the money was interested in funding the one person publicly interested in figuring out how to measure synaptic weights in vivo.
We have only limited access to Copilot, and it is impractical for almost anyone else on Earth to train a similar model, while we are 100% sure it is possible to obtain the dataset or to redo the training of SD. From a purely utilitarian point of view, it's much easier to support fighting Copilot than SD.
so, digits of pi anyone?
If I train on one image I can get it right back out. Even two, maybe even a thousand? Not sure what the line would be where it becomes ok vs not but there will have to be some answer.
You can claim intellectual property for comic characters etc, but not photos.
You could get someone to go take a similar photo at the same place, and use tools to enhance it.
- expressing personal feelings about the lawsuit
- honestly sharing one's lack of understanding of the legal or technical issues involved
- making brief, unsubstantiated claims about the lawsuit's merits
- discussing the HN response to this story (as this comment does)
I was really looking forward to educating myself on this topic from the HN comments, but it appears I am out of luck. Why is this (currently #1) story not attracting posts from the many HN users who could offer detailed and insightful analysis? Is it considered a toxic or an overly-politicized topic that most people avoid? Or there's some other explanation I'm not seeing?
Why does it matter how it was trained? The question is: does the generative AI _output_ copyrighted images?
Training is not a right that the copyright holder owns exclusively. Reproducing the works _is_, but if the AI only reproduces a style, but not a copy, then it isn't breaking any copyright.
This seems like it's not an accurate description of what diffusion is doing. A diffusion model is not the same as compression. They're implying that Stable Diffusion is taking the entire dataset, making it smaller, and storing it. Instead, it's just learning patterns about the art and replicating those patterns.
The “compression” they’re referring to is the latent space representation which is how Stable Diffusion avoids having to manipulate large images during computation. I mean you could call it a form of compression, but the actual training images aren’t stored using that latent space in the final model afaik. So it's not compressing every single image and storing it in the model.
This page says there were 5 billion images in the Stable Diffusion training dataset (albeit that may not be true, as I see online it's closer to the 2 billion mark). A Stable Diffusion model is about 5 GB. 5 GB / 5 billion is 1 byte per image. It's impossible to fit an image in 1 byte. Obviously the claim about it storing compressed copies of the training data is not true. The size of the file comes from the weights in it, not because it's storing "compressed copies". In general, it seems this lawsuit is misrepresenting how Stable Diffusion works on a technical level.
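The arithmetic, spelled out:

    images = 5_000_000_000        # claimed training set size
    model_bytes = 5 * 1024**3     # a ~5 GB checkpoint
    print(model_bytes / images)   # ~1.07 bytes per image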
I can't see the artists winning.
There's a world of difference that you are just writing off.
A student can read or watch many works. Under this ruling, a student could no longer learn from copyrighted works.
The existing set of rights granted under copyright does not include this.
HN is not a person, it's a forum with lots of people with different opinions. Depending on dozens of factors (time of day, title of the article, who gets in first) different opinions will dominate.
I've seen threads on Copilot that overwhelmingly come down in favor of Microsoft and threads on Stable Diffusion that come down hard against it. Also, even in a thread that has a lot of one opinion, there are always those who express the opposite view.
> we’ve filed a class-action lawsuit against Stability AI, DeviantArt, and Midjourney for their use of Stable Diffusion, a 21st-century collage tool that remixes the copyrighted works of millions of artists whose work was used as training data.
People will still be creative and make art even without society consuming their output. But society can create incentives to reward people for making art.
No, it doesn't; it means that abstract facts related to this image might be stored.
The data must be encoded with various levels of feature abstraction for this stuff to work at all. Much like humans learning art, albeit devoid of the input that makes human art interesting (life experience).
I think a more promising avenue for litigating AI plagiarism is to identify that the model understands some narrow slice of the solution space that contains copyrighted works, but is much weaker when you try to deviate from it. Then you could argue that the model has probably used that distinct work rather than learned a style or a category.
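A sketch of such a probe, assuming the diffusers and imagehash libraries (the model ID, file name, and prompts are placeholders): if outputs stay close to a known work for the exact prompt but drift sharply under paraphrase, that points to memorization rather than a learned style.

    import imagehash
    from PIL import Image
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    reference = imagehash.phash(Image.open("disputed_work.png"))

    prompts = ["Exact Title by Artist Name",         # the narrow slice
               "a painting of the same subject"]     # a deviation from it
    for p in prompts:
        out = pipe(p).images[0]
        print(p, reference - imagehash.phash(out))   # Hamming distance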
[1] The Mona Lisa (/ˌmoʊnə ˈliːsə/ MOH-nə LEE-sə; Italian: Gioconda [dʒoˈkonda] or Monna Lisa [ˈmɔnna ˈliːza]; French: Joconde [ʒɔkɔ̃d]) is a half-length portrait painting by Italian artist Leonardo da Vinci. Considered an archetypal masterpiece of the Italian Renaissance,[4][5] it has been described as "the best known, the most visited, the most written about, the most sung about, the most parodied work of art in the world".
[2] https://huggingface.co/spaces/stabilityai/stable-diffusion
[3] https://imgur.com/a/L2LDOS4
EDIT: With the Starry Night it worked even better. But it failed to reproduce the Bathing of the Red Horse (that one doesn't have a page on the English wiki, so I took the description from elsewhere).
Interesting that they mention collages. IANAL, but it was my impression that collages are derivative works if they incorporate many different pieces and only small parts of each original. Their compression argument seems more convincing.
All the mentioned models were trained on a large batch of works taken from the Internet - no one is going to dispute that.
None of those models could exist without content created by people.
The models make it possible to mimic the styles of particular artists, making them a perfect tool to kill creativity and artist diversity in the long term.
But artists have been making "in the style of" works for probably millennia. Fan art is a common example.
I suppose the advent of software that makes it easy to make "in the style of" works will force us to get much more clear on what is and isn't a copy. How exciting.
However, I don't see how the software tool is directly at fault, just the person using it.
I wrote an OCR program in college. We split the data set in half: you train it on one half, then test it against the other half.
You can train Stable Diffusion on half the images, but then what? You use the image descriptions of the other half and measure how similar the outputs are? In essence, attempting to reproduce exact replicas. But I guess even then it wouldn't be copyright infringement if those images weren't used in the model. It's more like me describing something vividly to you, asking you to paint it, and then getting angry at you because it's too accurate.
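The split-half protocol itself is simple enough; a sketch with scikit-learn, where load_ocr_dataset is a hypothetical stand-in for the course data:

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_ocr_dataset()   # flattened glyph images and their labels
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print(clf.score(X_test, y_test))   # accuracy on the unseen half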
People who generally make less money than programmers - writers, artists, musicians - should stop their whining and their unfair uses of copyright to control their creative output.
Programmers who are getting shafted by big corporations using their code to build machine learning models need urgent protection and fairness.
It probably doesn't help that almost every copyright argument here gets made with the understanding of copyright prevalent in the American model - that is to say, that copyright exists to promote the progress of science and the useful arts - whereas in many countries copyright is understood as existing because the creator of something has a moral right to control and own whatever they made.
Obviously this lawsuit is in the U.S., but I suppose even if it loses here, other lawsuits in other countries, with a different understanding of copyright, might succeed.
The idea, in stage one, was to split a file into chunks and XOR those with other random chunks (equivalent to a one-time pad); those chunks, as well as the created random chunks, then got shared around the network, with nobody hosting both parts of a pair.
The next stage was that future files inserted into the network would not create new random chunks but would randomly use existing chunks already in the network. The result is a distributed store of chunks, each of which is provably capable of generating any other chunk given the right pair. The correlations are then stored in a separate manifest.
It feels like such a system is some kind of entropy coding system. In the limit the manifest becomes the same size as the original data. At the same time though, you can prove that any given chunk contains no information. I love thinking about how the philosophy of information theory interacts with the law.
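A toy version of the scheme, just to make the point concrete: every stored chunk is uniform random bytes on its own, yet any file is recoverable from the right pairs plus the manifest.

    import os, secrets

    store = {}      # chunk_id -> random-looking bytes
    manifest = []   # (id_a, id_b) pairs, in file order

    def insert(data, size=32):
        for i in range(0, len(data), size):
            block = data[i:i + size].ljust(size, b"\0")
            pad = os.urandom(size)
            mixed = bytes(a ^ b for a, b in zip(block, pad))
            ia, ib = secrets.token_hex(4), secrets.token_hex(4)
            store[ia], store[ib] = pad, mixed
            manifest.append((ia, ib))

    def retrieve():
        return b"".join(bytes(a ^ b for a, b in zip(store[ia], store[ib]))
                        for ia, ib in manifest)

    insert(b"any copyrighted file at all")
    assert retrieve().rstrip(b"\0") == b"any copyrighted file at all"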
The images here are a big part of the model; there is no way around it - that huge blob of numbers doesn't build itself from scratch. But I don't know if these lawsuits have much merit. There is so much money in AI that these companies can forgo scraping and instead send thousands of people to take photos of pretty much every artwork themselves.
Coal mining (if coal hadn't ever been mined, we'd probably have gotten large-scale wind and hydroelectric power much sooner).
Whitney's cotton gin, albeit due to it coming before automated cotton picking.
> Out of One, Many: Using Language Models to Simulate Human Samples
The proof of that is that they even had to lie about how Stable Diffusion works on this website to make it convincing; that's a clear sign that they are in the wrong.
Even they discovered that the truth won't get them very far.
> Obviously this lawsuit is in the U.S but I suppose even if it loses here other lawsuits in other countries, with a different understanding of copyright, might succeed.
The cat is just out of the bag anyways, if machine learning is outlawed in the US, it will flourish somewhere else instead.
Maybe an even worse analogy, but I stopped using Photoshop when CSS got good. No need to make images for rounded corners, no need for me to pay Adobe. But no one is suing CSS.
> Copyright is a prison built for artists by big business, successfully marketed to artists as being a home.
I think (continuing this analogy) that copyright is indeed a home, but very few artists can afford to buy their own home, so they rent from corporate landlords, and the bigger ones are the worst ones to be tenants of.
I didn't say it because I didn't think it would resonate, but it's a whole new world we are quickly entering.
Yes, on a technical level, those chunks are random data. On the legal side, however, those chunks are illegal copyright infringement because that is their intent, and there is a process that allows the intent to happen.
I can't really say it better than this post does, so I highly recommend reading it: https://ansuz.sooke.bc.ca/entry/23
And yet, we have decided that society is better off when authors can make money off their work.
There are two issues here:
- the model needs to be carefully prompted (goaded) into copyright violation; it is instigated to do so by excessive quoting from the original
- the replicated code is usually boilerplate, common approaches, or "famous" examples from books; in other words, examples that appear in multiple places in the training set as opposed to just once
Do generic code, boilerplate, and API calls deserve protection? Maybe the famous examples do, but not every replicated snippet does.
Which of course then arrives at the problem: the original data plainly isn't stored in a byte-exact form, and you can only recover it by providing an astoundingly specific input string (the 512-bit latent space vector). But that's not data which is contained within Stable Diffusion. It's equivalent to trying to sue a compression codec because a specific archive contains a copyrighted image.
If you take that tack, I'll go one step further back in time and ask "Where is your agreement from the original author who owns the copyright that you could use this image in the way you did?"
The fact that there is suddenly a new way to "use an image" (input to a computer algorithm) doesn't mean that copyright magically doesn't also apply to that usage.
A canonical example is the fact that television programs like "WKRP in Cincinnati" can't use the music licenses from the television broadcast if they want to distribute a DVD or streaming version--the music has to be re-licensed.
[1] https://scholarship.law.vanderbilt.edu/vlr/vol58/iss3/13/
For example, facts in the phone book are not copyrighted; the authors have to mix in fake data to be able to claim copyright infringement. Maybe the models could finally learn how many fingers to draw on a hand.
The guidance component is a vector representation of the text that changes where we are in the sample space. A change in the sample space changes the likelihood, so for different prompts the likelihood of the same output image for the same input image will be different.
Since the model is trained to maximise the ELBO, it will produce a change closer to the prompt.
A good way to think about it is this: given a classifier, I can select a target class, compute the derivative of the target class score with respect to the input, and apply that derivative to the input. This puts the input closer to my target class.
Some models (score models) produce the derivative of the density of the samples, so what they do is a bit similar to computing a derivative via a classifier.
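The classifier-gradient step is a few lines of PyTorch; random weights are used here just to show the mechanics, and the class index and step size are arbitrary:

    import torch
    from torchvision.models import resnet18

    model = resnet18(weights=None).eval()
    x = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in image
    target = 207                                        # arbitrary class

    score = model(x)[0, target]
    score.backward()                          # d(score) / d(input)

    with torch.no_grad():
        x_stepped = x + 0.1 * x.grad.sign()   # nudged toward `target`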
The above was concerned with what the NN is doing.
The algorithm applies the operator for a number of steps, progressively improving the image. In some probabilistic models, you can think of this as an inverse of a stochastic gradient descent procedure (meaning a series of steps) that, with some stochasticity, reaches a high-density region.
However, it turns out that learning this operation doesn't have to be grounded in probability theory and graphical models.
As long as the NN learns a sufficiently good recovery operator, diffusion will construct something based on the properties of the dataset that was used.
At no point, however, are there condensed representations of images, since the NN is not learning to produce an image from zero in one step. It merely learns to undo some operation applied to the input.
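A toy version of that recovery-operator training (not the exact DDPM noise schedule, just the shape of the objective):

    import torch
    import torch.nn as nn

    # Stand-in for a real U-Net.
    net = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(64, 3, 3, padding=1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-4)

    x0 = torch.rand(8, 3, 32, 32)           # a batch of training images
    t = torch.rand(8, 1, 1, 1)              # per-image corruption level
    noise = torch.randn_like(x0)
    xt = (1 - t) * x0 + t * noise           # corrupted input

    loss = ((net(xt) - noise) ** 2).mean()  # learn to undo the corruption
    opt.zero_grad(); loss.backward(); opt.step()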
For the probabilistic view, read Denoising Diffusion Probabilistic Models and its references, in particular Langevin dynamics. It includes citations to score models as well.
For the non probabilistic component, read Cold diffusion.
For using the classifier gradient to update an image towards another class, read about adversarial generation via input gradients.
An MPEG codec doesn't contain every movie in the world just because it could represent them if given the right file.
The white light coming off a blank canvas also doesn't contain a copy of the Mona Lisa which will be revealed once someone obscures some of the light.
The automobile revolutionized transportation, but also came with licensing requirements. (And more recently, we are finding to be responsible for a health and climate catastrophe, necessitating new restrictions on fuel economy, leaded gasoline, ICEs, etc.) You didn't need a license to walk or ride a bicycle, or ride a horse, but when we started putting people behind thousands of pounds of steel, all of a sudden we needed to come up with a myriad of new rules and restrictions on how automobiles could be used.
The printing press came with copyright laws. New and more destructive weapons and tools and chemicals came with more restrictions regarding their possession and expected use. The telephone and the computer combined allow robo calling and spam on an industrial level, and those particular uses of those new technologies are forbidden. Radio revolutionized communication, but we don't just let any random asshole blast static into the spectrum. We have narrowly curtailed, permitted and forbidden uses of it.
It would be far easier to name the technologies that net-benefited society, and did not need new rules around them, to prevent their destructive and damaging uses.
This one isn't looking to be one of them.
A license which should be opt-in, not opt-out.
Of course, it’s opt-out because they know, fundamentally, that most artists would not want to opt-in.
And it would be illegal for me to sell or distribute zipped copies of images without the copyright holder’s consent. Similarly there might be an argument for why Diffusion[1] specifically can’t be built with copyrighted images.
[1] which is just one part of something like Stable Diffusion
If you take a bad paper shredder that, say, shreds a photo into large re-usable chunks, run the photo through that, and tape the large re-usable chunks back together, you have a photo with the same copyright as before.
If you tape them together in a new creative arrangement, you might apply enough human creativity to create a new copyrighted work.
If you grind the original to dust, and then have a mechanical process somehow mechanically re-arrange the pieces back into an image without applying creativity, then the new mechanically created arrangement would, I suspect, be a derived work.
Of course, such a process doesn't really exist, so for the "shapeless dust" question it's pretty pointless to think about. However, stable diffusion is grinding images down into neural networks, and then, without a significant amount of human creativity involved, creating images reconstituted from that dust.
Perhaps the prompt counts as human creativity, but that seems fairly unlikely. After all, you can give it a prompt of 'dog' and get reconstituted dust, that hardly seems like it clears a bar.
Perhaps the training process somehow injected human creativity, but that also seems difficult to argue, it's an algorithm.
You can draw Biden yourself if you're talented and it's not considered a derivative of anything.
It can also be used to create works that are probably not copyright violations and yet seem to be unfair because they deliver large economic benefits to the deployers of the AI while relying on the uncompensated creativity of the original artists.
The interesting questions to me here are:
1. Should we attempt to modify copyright law to reflect this sense of unfairness?
2. Is it even possible to do so?
Instead of aiming to reproduce exact replicas, you use a classifier and take the activations of its last hidden layer. Do that for both generated and original inputs, and then measure the differences in the statistics.
Wikipedia has a good article on this.
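For the curious, this is essentially the Fréchet Inception Distance; a rough sketch, assuming real_batch and fake_batch are preprocessed (N, 3, 299, 299) tensors:

    import numpy as np
    import torch
    from scipy import linalg
    from torchvision.models import inception_v3, Inception_V3_Weights

    net = inception_v3(weights=Inception_V3_Weights.IMAGENET1K_V1)
    net.fc = torch.nn.Identity()   # expose 2048-d penultimate features
    net.eval()

    @torch.no_grad()
    def features(batch):
        return net(batch).numpy()

    def frechet_distance(a, b):
        mu_a, mu_b = a.mean(0), b.mean(0)
        cov_a, cov_b = np.cov(a, rowvar=False), np.cov(b, rowvar=False)
        covmean = linalg.sqrtm(cov_a @ cov_b).real
        return ((mu_a - mu_b) ** 2).sum() + np.trace(cov_a + cov_b - 2 * covmean)

    # fid = frechet_distance(features(real_batch), features(fake_batch))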
But that's not what people use Stable Diffusion for: people use Stable Diffusion to create new works which don't previously exist as that combination of colors/bytes/etc.
Artists don't have copyright on their artistic style, process, technique or subject matter - only on the actual artwork they output or reasonable similarities. But "reasonable similarity" covers exactly that intent - an intent to simply recreate the original.
People keep talking about copyright, but no one's trying to rip off actual existing work. They're doing things like "Pixar style, ultra detailed gundam in a flower garden". So you're rocking up in court saying "the intent is to steal my client's work" - but where is the client's line of gundam horticultural representations? It doesn't exist.
You can't copyright artistic style, only actual output. Artists are fearful that the ability to emulate style means commissions will dry up (this is true) but you've never had copyright protection over style, and it's not even remotely clear how that would work (and, IMO, it would be catastrophic if it was - there's exactly one group of megacorps who would now be in a position to sue everyone because try defining "style" in a legal sense).
This is the most salient point in this whole HN thread!
You can’t sue Stable Diffusion or the creators of it! That just seems silly.
But (I don’t know I’m not a lawyer) there might be an argument to sue an instance of Stable Diffusion and the creators of it.
I haven’t picked a side of this debate yet, but it has already become a fun debate to watch.
See https://openai.com/blog/dall-e-2-pre-training-mitigations/ "Preventing Image Regurgitation".
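That post describes de-duplicating near-identical training images, among other mitigations. One simple pass in that spirit (not necessarily their method) is to drop images whose perceptual hashes nearly collide:

    import imagehash
    from PIL import Image

    def deduplicate(paths, max_distance=4):
        kept, hashes = [], []
        for p in paths:
            h = imagehash.phash(Image.open(p))
            if all(h - prev > max_distance for prev in hashes):
                kept.append(p)
                hashes.append(h)
        return kept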
But back to your point - "if you were to take the first sentence from a thousand books and use it in your own book" - then yes, based on my understanding of copyright (I am not a lawyer), you would be in violation of IP laws.
The most common example of this (Greg Rutkowski) is not in Stable Diffusion's training set.
Nothing points to that; in fact, even on this website they had to lie about how Stable Diffusion actually works - maybe a sign that their argument isn't really solid enough.
> [1] https://arxiv.org/pdf/2212.03860.pdf
You realize those are considered defects of the model, right? Sure, this model isn't perfect, and it will be improved.
That's the opposite of this image model's goal. Sure, you might find other types of research models which are meant to do that, but that's not Stable Diffusion and its like.
excellent description, thanks
I dunno if it matters that the opt-in has to be at the legislation level.
After all, once Creative Commons adds that clause to their most popular license, it's game over for training things like Stable Diffusion.
I'm thinking that maybe the most popular software licenses can be extended with a single clause like "usage as training data not allowed".
Of course, we cannot retroactively apply these licenses so the current model will still be able to generate images/code; they just won't be able to easily use any new ones without getting into trouble.
That said it can sometimes be in violation of copyright if it creates a specific image that is “too close to another original” (just like a human would be in violation even if they never previously saw that image).
But the above is just my intuition (and possibly yours); that doesn't mean a lawyer couldn't make the argument that it's a "good enough lossy compression - just like JPEG but smaller" and therefore "contains the images in just 2 bytes".
That lawyer may fail to win the argument, but there is a chance that they do win it! Especially as researchers keep making diffusion and SD models better and better at being compression algorithms (which is a topic people are actively working on).
You can call copying of the input a defect, but why are you simultaneously arguing that it doesn't occur?
It is a tool that allows radical combinations of concepts and styles, I'd say it empowers combinatorial exploration. And the combinatorial space is very sparsely explored.
The only things so far discovered are either a) older public-domain works nearly fully reproduced, b) small fragments of newer works, or c) "likenesses".
What if the model is trained on his own works - works that are also licensed out? Can the artist be sued for infringing by creating in the same style?
Outliers make bad examples. They don't share much similarity with other works.
It's both undesirable and not relevant to this kind of lawsuit.
This is not necessary because the model was trained in Germany and the law there explicitly says you don't need to do that.
If you asked Copilot to reproduce an existing work, then surely that violates copyright - in the same way you can ask SD to reproduce one of its training images (which would violate copyright in the same way).
But both the training, and the usage of these ML models do not violate copyright. Only until someone produces a copyrighted works from it, does that particular _usage_ instance will violate copyright, and it does not invalidate any other usages.
The law can do whatever its writers want. The law is mutable, so the answer to your question is “maybe”.
Maybe SD will get outlawed for copyright reasons on a single image. The law and the courts have done sillier things.
They don't even produce the same image twice from the same description and a different random seed.
In France, for instance, an artist cannot transfer the moral rights over their art but can transfer the commercial rights, and there is no copyright concept (which makes it funny when sites that copy the US have a copyright mention at the bottom).
Everyone makes their own artistic judgments; nobody's ideas are better. If people prefer this https://lexica.art/ (scroll down) then that's their right.
Since SD is trained by gradient updates against several different images at the same time, it of course never copies any image bits straight into itself. Since it's a latent-diffusion model, actual "image"-ness is limited to the image encoder (VAE), so any fractional bits would be in there if you want to look.
The text encoder (LAION OpenCLIP) does have bits from elsewhere copied straight into it to build the tokens list.
https://huggingface.co/stabilityai/stable-diffusion-2-1/raw/...
It's bad news for art websites themselves if that's the case...
For all the non-problematic training images you can use the originals. Some artists might want their names to become popular as style keywords.
But yes, it may be infeasible to index and compare against every image ever uploaded.
Programmers love to share code, but they don't want to share it with corporations who don't give back. We invented copyleft as a way to (ab)use the legal system to open up everything. We hate copyright and "love" "copyleft" as a means to weaken copyright.
It would be like if artists gave away all of their art, except not to corporations who hog their copyrights.
I could always show that there exists some function f that produces said byte sequence when applied to my copyrighted material.
Can I sue Microsoft because the entire Windows 11 codebase is just one "rote mathematical transformation" away from the essay I wrote in elementary school?
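Constructing that "function f" explicitly takes a few lines (the byte strings are obviously stand-ins); for any two files there is a pad turning one into the other:

    essay = b"my elementary school essay"
    codebase = b"stand-in bytes for the Windows 11 source tree"

    n = max(len(essay), len(codebase))
    a, b = essay.ljust(n, b"\0"), codebase.ljust(n, b"\0")
    f = bytes(x ^ y for x, y in zip(a, b))          # the "function", as data

    assert bytes(x ^ y for x, y in zip(a, f)) == b  # essay -> codebase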
It is a multi-front battle they face.
We can create incentives so more people become doctors, but the purpose of incentives at the end of the day must maximize life saving, whether it’s done by an AI or a doctor using an AI.
I imagine there are plenty of people that prefer Autotune voices for some reason too. Doesn't mean everyone needs to agree that it's good in itself or for artists.
Sure, the Windows 11 codebase is in pi somewhere if you go far enough. Sure, pi is a non-copyrightable fact of nature. That doesn't mean the Windows codebase is _actually_ in pi legally, just that it technically is.
The law does not care about weird gotchas like you describe.
I recommended reading this to a sibling comment, and I'll recommend it to you too: https://ansuz.sooke.bc.ca/entry/23
Yes, copyright law has obviously irrational results if you start trying to look at it only from a technical "but information is just 1s and 0s, you can't copyright 1s and 0s" perspective. The law does not care.
Which is why we have to think about the high level legal process that stable diffusion does, not so much the actual small technical details like "can you recover images from the neural net" or such.
It was deemed an original work by the court.
I can’t see how, with such a precedent, they could rule that SD doesn’t produce original works.
https://www.rangefinderonline.com/news-features/industry-new...
Copyright infringement can happen without intending to infringe copyright.
Various music copyright cases start with "Artist X sampled some music from artist Y, thinking it was transformative and fair use". The courts, in some of these cases, have found something the artist _intended_ to be transformative to in fact be copyright infringement.
> You can't copyright artistic style, only actual output
You copyright outputs, and then works that are derived from those outputs potentially infringe. Stable Diffusion's outputs are clearly derived from the training set, basically by definition of what neural networks are.
It's less clear they're definitely copyright-infringing derivative works, but it's far less clearcut than how you're phrasing it.
So how could this situation have been reasonably avoided? Was it meant to be avoided? Even supposing this lawsuit succeeds and Stability AI is dissolved, it's not going to change the fact that people are going to keep using Stable Diffusion anyway, even if people stop themselves from talking about it in the open.
Another thing I've been mulling about is that the negative effects of new technologies could be amplified if those technologies can be completely controlled by an independent person. As an example, 3D-printed guns are considered less dangerous than traditionally manufactured guns, but there was the incident where Shinzo Abe was assassinated using a homemade gun in spite of the extremely strict gun control measures in Japan. I think the conversation about that kind of danger would be intensely revisited if just anyone had the ability to manufacture a more efficient weapon using only things they can buy themselves.
Giving away the 8 gigabytes of Stable Diffusion for free with the means to run it on anyone's computer means that the harms that many people perceive about AI-generated art are going to exist forever. There is no solution to the problem now, only ideological differences to be repeated forever. To those that are against it, art style appropriation is now a mathematically defined fact, published as replicable research.
If the progress in generative art today breaks the spirits of artists, the small hope for me is that at least it was a technology that's unlikely to cause mass deaths or destruction. I have to wonder if humanity will be ready when that time does come for a different breakthrough. AI art at this stage can already cause mass die-offs of motivation and attention spans.
Which runs into some very interesting historical precedents.
((I wonder if there's a split between people who think AI emancipation might happen this century versus people who think that such a thing is silly to contemplate))
And it's not a new phenomenon. The printing press made scribes nearly obsolete, cars led to a mass slaughter of horses (and all related professions), and so on wherever you look. Invention of photography nearly killed whole genres of art (and gave rise to impressionism, abstractionism etc.)
Unfortunately, we can only judge the outcome of these changes decades (or even centuries) after they happen.
Do artists mind a human browsing their work, or buying it? Not at all. Probably even learning from it is fine.
But "thanks for your hard work, my AI has it from here" seems justifiably objectionable for an artist, who would reasonably expect to charge money for this.
I feel for the creators. I'm glad AI is making progress. Not sure how the lawsuit will go.
This is what I think would be the easiest thing to prove.
Other people's work was used to create a product that is sold for a quite a bit of profit. No attribution, no compensation. To me it sounds like a pretty clear copyright violation, even if you don't consider what the product does.
DeviantArt probably violated their own TOS.
It'll be interesting to see how the courts view this.
As the line between artificial neural nets and natural neural nets continue to blur, surely the same rules should apply to both?
Of course the counter-argument "some NNs are somehow different than others based on color [2]" walks you straight into an ongoing ethical minefield in the social sciences and biology. It's certainly going to be interesting times.
[1] https://www.youtube.com/watch?v=IFe9wiDfb0E Tom Scott, how lawyers ruined the singularity
[2] by analogy to https://ansuz.sooke.bc.ca/entry/23 what color are your bits
The resolution is much weirder than that, the court argued that the pose isn't original enough for the photo to deserve copyrights at all, independently of what the plagiarist did with it.
You can’t sue Canon for helping a user take better infringing copies of a painting, nor can you sue Apple or Nikon or Sony or Samsung… you can sue the user making an infringing image, not the tools they used to make the infringing image… the tools have no mens rea.
Which is why this is framed as compression: it implies that fundamentally SD makes copies instead of (re)creating art. Leaving aside the issue of recreating forgeries of existing works, using the training data for the creation of new pieces should be well covered inside the bounds of appropriation. Demanding anything more than filtering the output of SD for 1:1 reproductions of the training data is really pushing it.
edit: Checksums aren't necessarily unique btw. See "Hash collisions".
Obviously some fairly reputable organisations and individuals are moderately confident that there isn't, otherwise they wouldn't have done it.
The answer is of course not, and the same principle applies if someone uses Stable Diffusion to find a latent space encoding for a copyright image (the 231 byte number - had to go double check what the grid size actually is).
Let's stop for a moment and define advancement (or "progress", as it's sometimes called). It's always tacit, and never explicitly defined, and I think it bears examination.
By advancement/progress, I'm taking the argument to mean "betterment". i.e. When we say "advances in science", we're usually referring to things getting better, as a result of more science.
However, science/technology are not good in themselves. They're just tooling. You need to stop and ask which direction you've taken this advancement in the tooling in, because whether you meant it or not, both advancement and progress have direction.
> Similarly, artists exist for their output for society. If these AI models can truly fulfill the needs of society that artists currently output (that is debatable), then that simply raises the bar for what artists are expected to output
I somewhat agree, and would say this is very much like when the camera was invented. Artists lamented that they no longer had a purpose, until they invented one for themselves with surrealism. Art shifted from visual reproduction to meaning and feeling.
Surrealism is then a good example of the direction that the advancement in science (of the creation of the camera) took. What is the direction that AI generated art is taking?
You run into the pigeonhole argument: that level of compression can only work if there are fewer than seventy thousand different images in existence, total.
Certainly there's a deep theoretical equivalence between intelligence and compression, but this scenario isn't what anyone means by "compression" normally.
Regarding your edit: what are the chances of a "hash collision" where the hash is two MP4 files for two different movies? Seems wildly astronomical... impossible, even? That's why this hash method is so special, plus there's the built-in preview feature you can use to validate your hash against the source material, even without access to the original.
If you think that this tech is remotely close to anything resembling general intelligence or a singularity, I've got an image model to sell you.
Discovery will show exactly what the base images for training are. You can argue that the outputs are derivative works.
I don't think the mechanism is going to shield the violation, and frankly it shouldn't.
License your source material for the purpose and do it right. Doesn't everyone know it's wrong to steal?
People are treating this like it's a binary technical decision: either it is or isn't a violation. In reality, things are spectrums, and judges judge. SD will likely be treated like a remix that sampled copyrighted work, but just a tiny bit of each work, and sufficiently transformed it to create a new work.
That's plainly untrue, as Stable Diffusion is not just the algorithm, but the trained model—trained on millions of copyrighted images.
Specifically fair use #3 "the amount and substantiality of the portion used in relation to the copyrighted work as a whole."
A sentence being a copyright violation would make every book review in the world illegal.
https://www.nolo.com/legal-encyclopedia/fair-use-what-transf...
It's a fun twist that many even have "transformer" in the name.
A recreation of a piece of art does not mean a copy, I've personally seen hundreds of recreations of Edvard Munch's 'The Scream', all of them perfectly legal.
Even in a massively overtrained model, it is practically impossible to create a 1:1 copy of a piece of art the model was trained upon.
And of course that would be a pointless exercise to begin with, why would anyone want to generate 1:1 copies (or anything near that) of existing images ?
The whole 'magic' of Stable Diffusion is that you can create new works of art in the combined styles of art, photography etc that it has been trained on.
If my parrot recites your song after hearing my allegedly infringing copy, and I record its performance and post it on YouTube, is that infringement?
Last one: if I use the song from your website to train a song-recognition AI, is that infringement?
AI-focused companies should get together, form a group, and have real experts giving expert testimony under oath.
I also think that the courts will form expert committees consisting of real experts like Bengio, LeCun, etc.
Any grifters should be avoided, but I am not sure if the judges will understand the difference.
Creators mimic styles and elements of others' works all the time. Unless an ML algorithm crosses some literal copying threshold, I fail to see it as doing anything substantially different from what people routinely do.
Since md5 hashes don't share this property, they're not "in that vein".
The software itself is not at issue here. If they had trained the network on public domain images then there’d be no lawsuit. The legal question to settle is whether it’s allowable to train (and use) a model on copyrighted images without permission from the artists.
They may actually be successful at arguing that the outputs are either copies or derived works which would require paying the original artist for licenses.
That’s not how it works. Your collage would be fine if it was the only one since you used magazines you bought. Where you’d get into trouble is if you started printing copies of your collage and distributing them. In that case you’d be producing derived works and be on the hook for paying for licenses from the original authors.
If a person creates a perfect copy of something it shows they have put thousands of hours of practice into training their skills and maybe dozens or even hundreds of hours into the replica.
When a computer generates a replica of something it's what it was designed to do. AI art is trying to replicate the human process, but it will always have the stink of "the computer could do this perfectly but we are telling it not to right now"
Take Chess as an example. We have Chess engines that can beat even the best human Chess players very consistently.
But we also have Chess engines designed to play against beginners, or at all levels of Chess play really.
We still have Human-only tournaments. Why? Why not allow a Chess Engine set to perform like a Grandmaster to compete in tournaments?
Because there would always be the suspicion that if it wins, it's because it cheated by playing above its level when it needed to. Because that's always an option for a computer: to behave like a computer does.
"Training a neural network" is an implementation detail. These companies accessed millions of copyrighted works, encoded them such that the copyright was unenforcable, then sell the output of that transformation.
Pretty sure this is nitpicking about an overused analogy though.
Compression that returns something different from the original most of the time, but still could return the original.
If my parrot recites your song after hearing it, and I record that and upload it to YouTube, I've violated your copyright.
If a big company does the same (runs the song through a non-human process, then sells the output), I believe they're blatantly infringing copyright.
Except with computers, they don't need to eat or sleep, converse or attend stand-ups.
And once you're able to draw that one picture, you could probably draw similar ones. Your own style may emerge too.
Just thinking. Copywriters, students, and scribes used to copy stuff verbatim, sometimes just to "learn" it.
The product of that study could be published works, a synthesis of ideas from elsewhere, and so on. We would say it belonged to the executor, though.
So the AI learned, and what it has created belongs to it. Maybe.
Or, once we acknowledge AI can "see" images, precedent opens the way to citizenship (humanship?)
What does this mean? It doesn't mean you can't recreate the original, because that's been done. It doesn't mean that literally the bits for the image aren't present in the encoded data, because that's true for any compression algorithm.
https://en.wikipedia.org/wiki/Luddite
Good luck stopping the inertia of progress.
If someone finds a way to reverse a hash, I'd also argue that hashing has now become a form of compression.
I think in 5 billion images there are more than enough common image areas to allow the average per-image compression to drop below a single byte. This is a lossy process; it does not need a complete copy of the source data, similar to how an MP3 doesn't contain most of the audio data fed into it.
I think the argument that SD revolves around lossless compression is quite an interesting one, even if the original code authors didn't realise that's what they were doing. It's the first good technical argument I've heard, at least.
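To make that "below a single byte" arithmetic concrete, here's a quick back-of-the-envelope in Python. Both numbers are rough assumptions of mine (about 5 billion training images, a few GiB of weights), not official figures:

    # Rough capacity-per-image arithmetic; both numbers are approximations.
    n_images = 5_000_000_000          # images in the training set (assumed)
    model_bytes = 4 * 1024**3         # ~4 GiB of model weights (assumed)
    print(model_bytes / n_images)     # ~0.86 bytes of weight per training image
    # Even a tiny JPEG thumbnail needs thousands of bytes, so whatever the
    # weights retain, it cannot be per-image copies in the ordinary sense.

Of course, this only bounds the average; it says nothing about how much is retained of any one heavily duplicated image.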
All of those could've been prevented if the model was trained on public domain images instead of random people's copyrighted work. Even if this lawsuit succeeds, I don't think image generation algorithms will be banned. Some AI companies will just have spent a shitton of cash failing to get away with copyright violation, but the technology can still work for art that's either unlicensed or licensed in such a way that AI models can be trained based on it.
> At one time there were more British soldiers fighting the Luddites than there were fighting Napoleon on the Iberian Peninsula.
[1] https://en.wikipedia.org/wiki/Barack_Obama_%22Hope%22_poster
For example, I don't think Sarah Andersen, one of the plaintiffs here, would've reached the popularity she has now if it wasn't for her comics being shared on meme sites and social media, and I don't see any "do not repost" watermarks on her recent work like those used by artists who do object to resharing on other platforms.
I think many artists and copyleft programmers have the same positions. The biggest difference is that drawings are considered "art" and code is generally not, despite the fact that a technical drawing and business logic are both hardly artistic and mostly an expression of skill, whereas the demo scene, the indie gaming scene, and many online artists are very much about expressing themselves within a given set of boundaries.
When Microsoft steals code licensed to Github, people considered that to be a license dispute more than a copyright dispute. I'd argue there is no difference at all between artists and programmers when it comes to their work being absorbed and then reproduced by an AI company.
Just like gzip, training Stable Diffusion certainly removes a lot of data, but without understanding the effect of that transformation on the entropy of the data, it's meaningless to say things like "two bytes per image" because (like gzip) you need the whole encoded dataset to recover the image.
It's compressing many images into 10GB of data, not a single image into two bytes. This is directly analogous to what people usually mean by "compression"
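A toy illustration of that point, using Python's zlib as a stand-in for gzip: a "solid" archive can drive the average bytes per item very low, but only because every item leans on shared structure, and no item can be recovered without the surrounding stream. The corpus here is made up for the sketch:

    import zlib

    # A highly redundant made-up "dataset" of 1,000 small documents.
    docs = [("document number %d, mostly boilerplate. " % i) * 20
            for i in range(1000)]

    solo = sum(len(zlib.compress(d.encode())) for d in docs)   # each compressed alone
    solid = len(zlib.compress("".join(docs).encode()))         # all compressed together

    print(solo, solid)            # the solid archive is far smaller
    print(solid / len(docs))      # tiny average "bytes per document"...
    # ...but you cannot extract document 732 without decompressing the stream.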
It's a tool. Folks get excited about statistics, massive datasets, and computer science is hip again.
Would we not want a push for folks to experience the exacting caress of an unforgiving compiler?
I thought this stuff would be easy!
Hopefully what doesn't happen is a fragmentation of folks into content caverns, where they may gaze into a mirror and see exactly what they wish, day after day. A literal instantiation of Plato's Cave, where scientific progress is frozen and forgotten.
1. No one knows. Real expert IP lawyers I know tend to think it's probably OK with various caveats.
2. Under even simpler scenarios fair use is a pretty fuzzy line even for things that are better understood, and these often end up litigated on a case-by-case basis. Someone else gave various examples from music. I cited the Obama "Hope" poster elsewhere. Very little is absolute at the margins.
AFAIK, downloading and learning from images, even copyrighted images, falls under fair use; this is how practically every artist today learns how to draw.
Stable Diffusion does not create 1:1 copies of artwork it has been trained on, and its purpose is quite the opposite. There may be cases where the transformative aspect of a generated image can be argued as not being transformative enough, but so far I've only seen one such reproducible image, which would be the 'bloodborne box art' prompt, which was also mentioned in this discussion.
Simply appearing on a shared hosting site should not be enough.
Furthermore, copyright infringement doesn't stop being copyright infringement if you do it based on someone else's copyright infringement. Just because someone else decided to rip the contents of a CD and upload it to a website doesn't mean I'm now allowed to download it from that website.
Copyright law does include an originality floor: you can't copyright a letter or a shape (unless you're a billion-dollar startup), in the same way that you can't copyright fizzbuzz or hello world. I don't think that's relevant for many of the algorithms Copilot will generate for you, though.
If simple work doesn't deserve protection, the pop music industry with their generic lyrics and simple tunes may be in big trouble. Disney as well, with their simplistic cartoon characters like Donald Duck and Mickey Mouse.
Personally, I think copyright laws are extremely damaging in their duration and restrictions. IP law in a small number of countries actually allows for patenting algorithms, which is equally silly. International IP law currently gets in the way of society, in my opinion.
However, without short-term copyright neither programmers nor artists will be protected, and I don't think anyone but knock-off companies will be happy with such an arrangement. Five or ten years is long enough for copyright in my book, but within those five or ten years copyright must remain protected.
It does have the Mona Lisa because of overfitting, but that's because there is so much Mona Lisa on the internet.
The artists taking part in this suit won't be able to recreate any of their own work.
Stable Diffusion is not made to decompress the original and actually has no direct mechanism for decompressing any originals. The originals are not present. The only thing present is an embedding of key components of the original in a multi-dimensional latent space that also includes text.
This doesn't mean that the outputs of Stable Diffusion cannot be in violation of a copyright, it just means that the operator is going to have to direct the model towards a part of that text/image latent space that violates copyright in some manner... and that the operator of the model, when given an output that is in violation of copyright, is liable for publishing the image. Remember, it is not a violation of copyright to photocopy an image in your house... it's a violation when you publish that image!
SD might know how to violate copyright but is that enough to sue it? Or can you only sue violations it helps create?
If that software happens to output an image that is in violation of copyright then it is not the fault of the model. Also, if you ran this software in your home and did nothing with the image, then there's no violation of copyright either. It only becomes an issue when you choose to publish the image.
The key part of copyright is when someone publishes an image as their own. That they copy an image doesn't matter at all. It's what they DO with the image that matters!
The courts will most likely make a similar distinction between the model, the outputs of the model, and when an individual publishes the outputs of the model. This would be that the copyright violation occurs when an individual publishes an image.
Now, if tools like Stable Diffusion are constantly putting users at risk of unknowingly violating copyrights then this tool becomes less appealing. In this case it would make commercial sense to help users know when they are in violation of copyright. It would also make sense to update our copyright catalogues to facilitate these kinds of fingerprints.
There are no models I know of with the ability to generate an exact copy of an image from its training set unless it was solely trained on that image to the point it could. In that case I could argue the model’s purpose was to copy that image rather than learn concepts from a broad variety of images to the point it would be almost impossible to generate an exact copy.
I think a lot of the arguments revolving around AI image generators could benefit from the constituent parties reading up on how transformers work. It would at least make the criticisms more pointed and relevant, unlike the criticisms drawn in the linked article.
What I object to is not the AI itself, or even that my code has been used to train it. It's the copyright for me but not for thee way that it's been deployed. Does GitHub/Microsoft's assertion that training sidesteps licensing apply to GitHub/Microsoft's own code? Do they want to allow (a hypothetical) FSFPilot to be trained on their proprietary source? Have they actually trained Copilot on their own source? If not, why not?
I published my source subject to a license, and the force of that license is provided by my copyright. I'm happy to find other ways of doing things, but it has to be equitable. I'm not simply ceding my authorship to the latest commercial content grab.
I'm surprised they couldn't find someone with even a rudimentary understanding of diffusion models to review this.
"The Night Watch, a painting made by Rembrandt in 1642"
It generates a convincing low-res imitation about half the time, but it also has a tendency to make the triband flag into an American flag, or put an old ship in the background, or replace the dark city arch with a sunset...
If you keep refining the prompt, you can get closer, but at that point you're just describing what the painting should look like, rather than asking the model to recall an original work.
Stable Diffusion is about closed to open.
Copilot is about open to closed.
The Stable Diffusion version of Copilot would be something like
"Give me a cart checkout algorithm in the style of Carmack, secure C style."
And that's fine, if the generated code's license were just as open as--or even less restrictive than--the source code's license (relicensing rules permitting).
What could be the issue is the generated source becomes even more closed or proprietary, which defeats the original source intent.
Is that right?
What are you talking about? I've been doing drawing and digital painting as a hobby for a long time and tracing is absolutely not "VERY common". I don't know anybody who has ever done this.
> fan art where they paint trademarked characters (also VERY common)
This is true in the sense that many artists do it (besides confusing trademark law and copyright law: the character designs are copyright-protected, while trademarks protect brand names and logos). However, it is not fair use (as far as I'm aware, at least; I'm not a lawyer). A rightsholder can request that fanart be removed and the artist would have to remove it. Rightsholders almost never do, because fanart doesn't hurt them.
There's also more examples of it reproducing copyright-protected images, I pulled the "bloodborne box art" prompt from this article: https://arxiv.org/pdf/2212.03860.pdf But I agree with you that reproducing images is very much not the intention of Stable Diffusion, and it's already very rare. The way I see it, the cases of Stable Diffusion reproducing images too closely is just a gotcha for establishing a court case.
What do you mean by this in the context of generating images via prompt? “Fractional bits” don’t make sense and it’s more misleading if anything. Regardless, a model violating criteria for being within fair use will always be judged by the outputs it generates rather than its composing bytes (which can be independent)
For lawyers to make money. That is the goal of much litigation.
Special agents from the MPAA sent to assassinate an android that can spew out high-quality art.
tl;dr I think there's a distinction between training on copyrighted but public content and private content.
In fact, this only works because the source images are given as input to the forward process - thus, the details being interpolated are from the inputs not from the model. If you look at Appendix Figure 9 from the same paper (https://arxiv.org/pdf/2006.11239.pdf) it is clear what's going on. Only when you take a smaller number of diffusing (q) steps can you successfully interpolate. When you take a large number of diffusing steps (top row of figure 9), all of the information from the input images is lost, and the "interpolations" are now just novel samples.
It's very hard for me to find a reason to include Figure 8 but not Figure 9 in their lawsuit that isn't either a complete lack of understanding, or intentional deception.
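For anyone who wants to see why a large number of forward (q) steps destroys the input, here's a minimal numpy sketch of the standard DDPM closed-form forward process, x_t = sqrt(abar_t)*x0 + sqrt(1-abar_t)*eps, with a random vector standing in for an image. The linear schedule matches the paper's defaults, but treat this as an illustration, not the authors' code:

    import numpy as np

    rng = np.random.default_rng(0)
    x0 = rng.standard_normal(10_000)       # stand-in for an image

    betas = np.linspace(1e-4, 0.02, 1000)  # linear schedule as in the DDPM paper
    abar = np.cumprod(1.0 - betas)         # cumulative product of (1 - beta_t)

    for t in (10, 100, 500, 999):
        eps = rng.standard_normal(x0.shape)
        xt = np.sqrt(abar[t]) * x0 + np.sqrt(1 - abar[t]) * eps
        print(t, np.corrcoef(x0, xt)[0, 1])
    # Correlation with the original decays toward ~0 at large t: after enough
    # diffusing steps essentially nothing of the input survives, which is why
    # "interpolations" at high t are just novel samples.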
First, there is a legal definition of a "derivative work" and there is an artistic notion of a "derivative work". If the two of us both draw a picture of the Statue of Liberty, artistically we have both derived the drawing based on the original statue. However, neither of these drawings in relation to the original sculpture nor the other drawing is legally considered a derivative work.
Let's think about a cartoonish caricature of Joe Biden. What "makes up" Joe Biden?
https://www.youtube.com/watch?v=QRu0lUxxVF4
To what extent are these "constituent parts" present in every image of Joe Biden? All of them? Is the latent space not something that is instead hidden in all images of Joe Biden? Can an image of Joe Biden be made by anyone that is not derived from these "high order" characteristics of what is recognizable as Joe Biden across a number of different renderings from disparate individuals?
People have posted illegal Windows source code leaks to GitHub. Microsoft doesn't seem to care that much, because these repos stay up for months or even years at a time without Microsoft DMCAing them; if you go looking you'll find some right now. I think it is entirely possible, even likely, that some of those repos were included in Copilot's training data set. So Copilot actually was trained on (some of) Microsoft's proprietary source code, and Microsoft doesn't seem to care.
Is it "the model cannot possibly recreate an image from its training set perfectly" or is it "the model is extremely unlikely to recreate an image from its training set perfectly, but it could in theory"?
Because I am willing to bet it's the latter.
> You’re acting like the “computer” has a will of it’s own. Generating a perfect copy of an image would be a completely separate task from training a model for image generation.
Not my intent, of course I don't think computers have a will of their own. What I meant, obviously, is that it's always possible for a bad actor of a human to make the computer behave in a way that is detrimental to other humans and then justify it by saying "the computer did it, all I did is train the model".
As an example of a plausible scenario where copyright might actually be violated, consider this: an NGO wants images on their website. They type in something like 'afghan girl' or 'struggling child' and unknowingly use the recreations of the famous photographs they get.
Are we looking at the output of the same program? Because all of the output images I look at have eyes looking in different directions and things of horror in place of hands or ears, and they feature glasses melting into people's faces. And that's the good ones; the bad ones have multiple arms contorting out of odd places while bent at unnatural angles.
If licenses don't apply to training, then they don't apply for anyone, anywhere. If they do apply, then Copilot is violating my license.
I think the post you're replying to was confused by the quote above. The person claiming copyright by showing that the claimed file started as their own image has to show that it actually started from their image, not just that the file could have been derived from it. Copyright cares about both the works and the provenance of works.
Stable Diffusion couldn't be flagged under this pretense if a person used a prompt that was their own, nor could they even be sued if they ran an image through it, as long as there is no plausibility that the result was made from a copyrighted work. The only case I can imagine working is one against the actual training process rather than the algorithm itself, for that exact reason.
Maybe it's a mass delusion but that feels like a stretch.
Also your wording makes this sound entirely like a sinister conspiracy or cash grab. Many people think this is simply a worthy pursuit and the right direction to be looking at the moment.
Save a photo on your computer, open it in a browser or photo viewer, you will get that photo. That is the default behavior of computers. That is not in dispute, is it?
All of this machine learning stuff is trying to get them to not do that. To actually create something new that no one actually stored on them.
Hope that clears up the misunderstanding.
The “color of your bits” only applies to the process of creating a work. Stable Diffusion’s training of the algorithm could be seen as violating copyright but that doesn’t spread to the works generated by it.
In the same vein, one can claim copyright on an image generated by stable diffusion even if the creation of the algorithm is safe from copyright violation.
“some representation of the originals exist inside the model+prompt” is also not sufficient for the model to be in violation of copyright of any one art piece. Some latent representation of the concept of an art piece or style isn’t enough.
It's also important to note the distinction that no training data is stored in its original form as part of the model; during training it's simply used to tweak a function whose purpose is translating text to images. Some could say that's like using the color from a picture of a car on the internet. Some might say it's worse, but it's all subjective unless the opposition can draw new ties between the actual technical process and existing precedent.
It seems to me that they're claiming here that Stability has somehow managed to store copies of these images in about 1 byte of space each. That's an incredible compression ratio!
Paintover does not have to mean actual 'tracing'; a LOT of artists use photos as direct references and paint over them in a separate layer, keeping the composition, poses, and colors very close to the original while still changing details and style enough for the result to be considered a 'new work'.
Here are two examples of artist Sam Yang using two still frames from the tv show Squid Game and painting over those, the results which he then sells as prints:
https://www.inprnt.com/gallery/samdoesarts/the-alleyway/ https://www.inprnt.com/gallery/samdoesarts/067/
That said, you could even get away with less transformation and still have it be considered original work, take Andy Warhol's 'Orange Marilyn' and 'Portrait of Mao', those are inked and flat color changes over photographs.
I am not a lawyer but I also assume Microsoft's position, at least in part, is that they can download and use code in GitHub public repos just like anyone else can and developing a public service based on training with that (and a lot of other) code isn't redistributing that code.
My main beef with the approach being taken by a lot of artists towards AI art generators is that using copyright in an attempt to kick the tools in the shin (notably not actually killing them, only perhaps slowing them down a bit) could set legal precedent that makes things worse for individual artists and smaller groups, by putting the most useful and powerful variations of the technology exclusively in the hands of intellectual property hoarders like Disney. As opposed to a more open approach, where the possibility exists for useful generators to exist for free.
I'm not anti artist, I do genuinely think that this outcome would make things worse for them and better for the companies that already exploit them.
Kind of like recreating your image one object at a time. It might not be exact, but close enough.
Can't put the genie back in the bottle, not this one.
Imo something like dall-e or midjourney is much worse.
- Extending the slant roof in the background, it intersects with the left figure at around the height of the nose, but in the painting it intersects with the middle of her neck.
- Similarly the line of the fence on the left is at the height of her hairline, but in the painting it is at the height of the middle of the head, and also more slanted than in the frame.
- On the right side, the white part of the pillar is similarly too low compared to the figure.
- The pole in the background has a lot of things off with regards to size, thickness, or location too.
Essentially, everything is a bit off with regards to location, size and distance. It doesn't really make sense to paint over something and then still do everything differently from the base layer, so it was probably just drawn from reference the normal way -- probably having the picture on another screen and drawing it again from scratch, rather than directly painting over the frame.
I agree with regards to Warhol but that doesn't really establish it as very common amongst painters.
Stable Diffusion has something called an encoder and a decoder. What the encoder does is take an image, find its fundamental characteristics, and convert it into a data point (for the sake of simplicity we will use a vector, even though it doesn't have to be one). Let's say the vector <0.2,0.5,0.6> represents a black dog. If you took a similar vector, you would get another picture of a dog (say, a white dog). These vectors are contained in what's called a latent space, which is just a collection of items where similar concepts are close together.
Stable Diffusion uses this latent space because it's more computationally efficient. It starts with a noisy image which is converted into latent space, then it slowly gets rid of the noise. It does this entire process on the latent-space representation as opposed to the actual image, which is more computationally efficient because it doesn't have to hold an entire pixel image in memory. Once it finishes getting rid of the noise, it uses the decoder to convert the result back into a pixel image. What you'll notice is that throughout this entire process it's not just retrieving a compressed image from its training set and using it. Instead, it's generating the image through de-noising, and this de-noising process is guided by its understanding of different concepts that can be represented in the latent space.
I think where this lawsuit goes wrong is that it implies the latent space is literally storing a copy of every image in the dataset. As far as I am aware, this is not true. Even though the latent-space representations of images are dramatically smaller, they're not small enough to fit the entire dataset in a 5GB file. The only thing Stable Diffusion is storing is the algorithm itself for converting to and from latent space, and that's just for computational efficiency, as mentioned above. I've heard that Stable Diffusion might store some key concepts from the latent space, but I don't know if that's true or not. Either way, it seems unlikely that the entire dataset is being stored in Stable Diffusion. To me, saying Stable Diffusion is storing the images themselves is like saying GZIP's algorithm is storing the compressed version of every file in existence.
Disclaimer: Not an ML expert and this is just based on my own understanding of how it works. So I could be wrong
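A toy numpy sketch of the lossy-encoder idea, with a random linear projection standing in for the learned VAE (the real encoder is a trained network, so this only shows the shape of the argument): squeezing 4096 numbers into 16 and back necessarily throws information away.

    import numpy as np

    rng = np.random.default_rng(0)

    D, d = 4096, 16                                # "pixel" dimension vs latent dimension
    W = rng.standard_normal((d, D)) / np.sqrt(D)   # toy encoder: random projection
    W_dec = np.linalg.pinv(W)                      # toy decoder: pseudo-inverse

    img = rng.standard_normal(D)                   # stand-in for an image
    z = W @ img                                    # a 16-number "latent" code
    rec = W_dec @ z                                # best-effort reconstruction

    # The round trip is lossy by construction: 16 numbers cannot hold 4096.
    print(np.corrcoef(img, rec)[0, 1])             # weak resemblance only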
The way SD model weights work, if you managed to prompt engineer a recreation of one specific work, it would only have been generated as a product of all the information in the entire training set + noise seed + the prompt. And the prompt wouldn't look anything like a reasonable description of any specific work.
Which is to say, it means nothing, because you can equally generate a likeness of works which are known not to be included in the training set (easy: you ask for a latent encoding of the image and it gives you one), which makes it equivalent to a JPEG codec.
I very much doubt that.
>Secondly, putting strangely much effort into a comment on Hacker News
Not sure what you are implying here, could you elaborate? The reason I know about these images is that they've been posted, alongside many other similar examples, in discussions regarding AI art.
>I know this because there are too many mistakes with regards to proportion:
Have you ever used programs like Photoshop, Krita et al.? You can start painting directly over a photo and then easily transform the proportions of all components in the image, and since you draw them in layers, you can adjust them without affecting each other.
Here they are, side by side:
https://imgur.com/a/tIbBkk2 https://imgur.com/a/K1fEPtu
I have no doubt that he started painting these over the reference photos, and then used the 'warp tool' in his painting program of choice to alter the proportions, a very common technique.
And this is PERFECTLY FINE; the resulting artwork is transformative enough to be considered a new work of art, which is true for practically every piece of art I've seen generated by Stable Diffusion. The only one I've seen that I'm doubtful about is the 'bloodborne box art' one, which is THE example that is always brought up because it is such an outlier.
All of my other points remain unchanged by this pedantry.
SD both creates derivative works and also sometimes creates pixel level copies from portions of the trained data.
Best you can do is to mask and keep inpainting the area that looks different until it doesn't.
You can see his actual workflow on his YouTube channel. He shows his painting process there but doesn't show his sketching process, but I hope that you believe that people are able to draw from imagination at least.
https://www.youtube.com/watch?v=7_ZLBKj_UlY
> Not sure what you are implying here, could you elaborate?
I just meant I was probably putting too much effort into an online discussion.
> I have no doubt that he started painting these over the reference photos, and then used the 'warp tool' in his painting program of choice to alter the proportions, a very common technique.
It's simply not a common technique at all. I'm not sure why you're making these statements because it feels like your knowledge of how illustrators work is extremely limited. I've heard of people photobashing -- which is when artists combine photo manipulation and digital painting to more easily produce realistic artworks. It's got mixed opinions about it and many consider it cheating but within the field of concept art it's common because it's quick and easy. However, there's huge amounts of people who can just draw and paint from sight or imagination. There's the hyperrealists who often act as a human photocopier, but artists who do stylized art of any kind are just people who can draw from imagination. I'm not sure why that's something you "very much doubt" to be quite honest. Just looking on YouTube for things like art timelapses, you can find huge amounts of people who draw entirely from imagination. Take Kim Jung Gi as a somewhat well known example. That guy was famous amongst illustrators for drawing complicated scenes directly in pen without any sketches. But there's really plenty of people that can do these things.
You seem to be under the impression that the average artist uses every shortcut available to get a good result, but that is simply not true. Most artists I know refuse to do anything like photobashing because they consider it cheating and because it isn't how they want to work, nevermind directly drawing on top of things. Drawing from sight isn't uncommon as a way to study art, so in case you're wondering why Sam Yang would be able to reproduce the frame so closely, it's because that's how artists study painting.
> Have you ever used programs like Photoshop, Krita et al
Yes, very often. The thing is: Just because it's possible does not mean it actually happens.
You cannot copyright “any image that resembles Joe Biden”.
That’s said, it does raise the question, “should this precedent be extended to humans?”
i.e. Can humans be taught something based on copyrighted materials in the training set/curriculum?
Why? That's not obvious to me at all.
These algorithms take the entire image and feed it into their maw to generate their neural network. That doesn't really sound like "fair use".
If these GPT systems were only doing scholarly work, there might be an argument. However, the moment the outputs are destined somewhere other than scholarly publications that "fair use" also goes right out the window.
If these algorithms took a 1% chunk of the image, like a collage would, and fed it into their algorithm, they'd have a better argument for "fair use". But, then, you don't have crowdsourced labelling that you can harvest for your training set as the cut down image probably doesn't correspond to all the prompts that the large image does.
> Stable Diffusion does not create 1:1 copies of artwork it has been trained on
What people aren't getting is that what the output looks like doesn't matter. This is a "color of your bits" problem--intent matters.
This was covered when colorizing old black and white films: https://chart.copyrightdata.com/Colorization.html "The Office will register as derivative works those color versions that reveal a certain minimum amount of individual creative human authorship." (Edit: And note that they were colorizing public domain films to dodge the question of original copyright.)
The current algorithms ingest entire images with the intent to generate new images from them. There is no "extra thing" being injected by a human--there is a direct correspondence, and the same inputs always produce the same outputs. The output is deterministically derived from the input (input images/text prompt/any internal random number generators).
You don't get to claim a new copyright or fair use just because you bumped a red channel 1%. GPT is a bit more complicated than that, but not very different in spirit.
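The determinism point is easy to demonstrate in miniature. The sketch below is my own toy, not anyone's actual pipeline: all the "randomness" is derived from the prompt and a seed, so the whole generation is a pure function of its inputs.

    import zlib
    import numpy as np

    def generate(prompt, seed):
        # Toy generator: every random number is derived from (prompt, seed),
        # so identical inputs always yield bit-identical outputs.
        rng = np.random.default_rng(seed + zlib.crc32(prompt.encode()))
        return rng.standard_normal(8)

    a = generate("a red apple", seed=42)
    b = generate("a red apple", seed=42)
    print(np.array_equal(a, b))    # True: no human creativity enters anywhere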
LAION-5b is also just an indexer (in terms of images).
To address (b) first: Fair Use has long held that educational purposes are a valid reason for using copyrighted materials without express permission—for instance, showing a whole class a VHS or DVD, which would technically require a separate release otherwise.
For (a): I don't know anything about your background in ML, so pardon if this is all obvious, but at least current neural nets and other ML programs are not "AI" in anything like the kind of sense where "teaching" is an apt word to describe the process of creating the model. Certainly the reasoning behind the Fair Use exception for educating humans does not apply—there is no mind there to better; no person to improve the life, understanding, or skills of.
There are arguments to be made for fair use--I'm just not sure the current crop of GPT falls under any of them.
But the fact that it often generates new content that didn't exist before, or at least doesn't breach the limits of fair use, goes against the argument made in the lawsuit.
I think this is the most relevant line of your argument. Because if you could just ask it like "show me the latest picture of [artist]" then you'll have a hard time convincing me that this is fundamentally different from a database with a fancy query language and lots of copyrighted work in it.
In GPT this is words and phrases: e.g. "Frodo Baggins" has high affinity, while "Frodo Superman" will be negligible. Now consider all words that may link to those words - potentially billions of words (or phrases), but (probably/hopefully) none replicated. The phrases are out of specific context because they cover _all contexts_ in the training data. When you speak to GPT it randomises these words in response to you, typically choosing the words/phrases with the highest affinity to the words you prompted; this almost gives it the appearance of emergent AI, because it is crossing different concepts (texts) in its answers.
Stable Diffusion works similarly, but with colours (words) and patterns/styles (phrases). Now if you ask for a green field in the style of Van Gogh, it could compare Van Gogh's work to a backdrop from Windows XP. You could argue that, depending on the degree of those things it gives you, you are violating copyrights; however, that narrow view doesn't take into account that although you've specifically asked for Van Gogh, and that's where it concentrates, it's also pulling in work from potentially hundreds of other lower-affinity sources. It's this dilution which means you'll never see an untainted original source image.
So in essence, it's the user who is breaching the copyright by specifying concentration on specific terms in the prompt, not the model. The model is simply a set of patterns, and the user is making those patterns breach copyright which IMHO is no different to the user copying a painting with a brush.
The brush isn't the thing you sue.
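For what it's worth, the "affinity" idea can be sketched with nothing more than co-occurrence counts. This is a drastic simplification of what a transformer actually learns (a made-up three-line corpus, counts instead of learned weights), but it shows how "Frodo Baggins" ends up strong and "Frodo Superman" negligible:

    from collections import Counter
    from itertools import combinations

    corpus = [
        "frodo baggins carried the ring",
        "frodo baggins left the shire",
        "superman flew over metropolis",
    ]

    pairs = Counter()
    for line in corpus:
        for a, b in combinations(sorted(set(line.split())), 2):
            pairs[(a, b)] += 1      # count how often two words co-occur

    print(pairs[("baggins", "frodo")])    # 2 -> high affinity
    print(pairs[("frodo", "superman")])   # 0 -> negligible affinity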
So humans can already run afoul of copyright this way, the bar for NNs might end up lower.
Social constructs are not computer programs. Social constructs concern messy, unpredictable computing units called humans.
Precedent and continuity are something that US courts normally try to value. Yes, the rules can be fuzzy, but the courts generally tried to balance the needs of the competing parties. Unfortunately, there will never be a purely "rules based" decision tree on this kind of "fuzzy" thing.
Of course, recent Republican court appointments have torn up the idea of precedent and minimizing disruption in preference to partisan principles, so your concerns aren't unwarranted.
I'm personally not against the idea of "AI as a tool to help." I just think with art, and the way the AI art software works, it's not a helping tool; aside from playing around with it for fun, it's a "quick riches" type of tool, more like faux leather/Pleather/PU leather.
The important question being, why would you pay for these tools? All I've seen are articles around how well they can create art 'in the style of X' as the exciting bit.
- Open Microsoft Paint
- Make a blank 400 x 400 image
- Select a pixel and input an R,G,B value
- Repeat the last two steps
To reproduce a copyrighted work. I'm sure people have done this with e.g. pixel art images of copyrighted IP of Mario or Link. At 400x400, it would take 160,000 pixels to do this. At 1 second per pixel, a human being could do this in about a week.
Because people have the capability of doing this, and in fact we have proof that people have done so using tools such as MS paint, AND because it is unlikely but possible that someone could reproduce protected IP using such a method, should we ban Microsoft Paint, or the paint tool, or the ability to input raw RGB inputs?
In this sense, Stable Diffusion is more analogous to the JPEG algorithm than it is to a specific collection of JPEG files. As it stands, the original training data is not stored, even in a compressed form.
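To push the tool-versus-content point further, the pixel-copying procedure above can even be automated in a few lines of Python with Pillow. This is a sketch; the input filename is hypothetical:

    from PIL import Image

    src = Image.open("copyrighted_sprite.png").convert("RGB")  # hypothetical input
    dst = Image.new("RGB", src.size)

    # The MS Paint procedure, automated: set one pixel at a time.
    for x in range(src.width):
        for y in range(src.height):
            dst.putpixel((x, y), src.getpixel((x, y)))

    dst.save("copy.png")   # a perfect reproduction, made with the dumbest tool imaginable
    # Nobody concludes from this that pixel-setting tools should be banned;
    # liability attaches to what the person does with the output.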
This is an intelligence augmentation tool. It’s effectively like I’m really good at reading billions of lines of code and incorporating the learnings into my own code. If you don’t want people learning from your code, don’t publish it.
It seems almost everyone here in this thread is fine with such a grift on digital artists, but not when it is Copilot or ChatGPT: two years ago it was 'Hardly going to compete against developers', with ChatGPT it became 'Only juniors are affected, not us seniors', and with GPT-4 + Copilot it will be 'Please stop using AI code and sue GitHub now!'
Obviously this wasn't the case with Dance Diffusion (the music version of Stable Diffusion), which was trained on public domain music or music used with the permission of musicians. It is almost as if they knew that if they had trained it on copyrighted music and released it as open source, Stability AI would be out of business before they could counter the lawsuit. [0]
It is indeed a grift, and the legal system will catch up with both Copilot and Stable Diffusion for using copyrighted content in the training sets of their AI models.
[0] https://techcrunch.com/2022/10/07/ai-music-generator-dance-d...
I bet a proper analysis of that toy experiment would conclude that none of the original data points are perfectly recovered: Only the underlying distribution / manifold is recovered, which really doesn't lend well to their argument at all.
At some point the input must be considered part of the work. At the limit you could just describe every pixel, but that certainly wouldn’t mean the model contained the work.
While I doubt that specific case has been tested in court, arguably you could. If you created glitch art (https://en.wikipedia.org/wiki/Glitch_art) via compression artifacts, and your work was sufficiently distinct from the original work, I think you would have a reasonable case for transformative use (https://en.wikipedia.org/wiki/Transformative_use).
Many state-of-the-art compression algorithms are in fact based on generative models. But the thing is, the model weights themselves are not the compressed representation.
The trained model is the compression algorithm (or more technically, a component of it... as it needs to be combined with some kind of entropy coding).
You could use Stable Diffusion to compress and store the training data if you wanted, but nobody is doing that.
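A hedged illustration of that distinction: in model-based compression, the model supplies probabilities and a separate entropy coder turns them into bits; an ideal coder spends -log2(p) bits on a symbol the model assigned probability p. Toy numbers below, nobody's real codec:

    import math

    def ideal_bits(probs):
        # Ideal entropy-coded length: -log2(p) bits per symbol.
        return sum(-math.log2(p) for p in probs)

    # Probabilities two hypothetical models assigned to the actual data:
    good_model = [0.9, 0.8, 0.95]      # knows the data's statistics well
    bad_model  = [0.01, 0.02, 0.05]    # doesn't

    print(ideal_bits(good_model))      # ~0.55 bits: strong model, short code
    print(ideal_bits(bad_model))       # ~16.6 bits: weak model, long code
    # The model is the codec; the compressed representation would be a separate
    # bitstream produced with its help, and nobody ships that bitstream with SD.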
It's not sufficient to just consider whether it can reproduce an image, but also how much user input was required to do so.
For example, if I publish a music remix tool with a massive database of existing music, creators might use it to create collages that are original and fall under fair use. But the tool itself is not, and requires permission from the rights owners.
Copyright, and laws in general, exists to protect the human members of society not some abstract representation of them.
So it would be quite easy to make a trademark laundering operation, in theory.
Not the way it's used in Stable Diffusion models. Compressed data can be decompressed knowing only the decompression algorithm. To recover data from a stable diffusion model, you need to know the algorithm and the prompt.
A critical part of the information _isn't_ in the data you decompress, it has to come from you. (And this isn't that relevant, but it would be lossy, perceptual compression like jpeg or mp3, not lossless compression like Huffman or Arithmetic coding.)
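A miniature of that split, in the spirit of the encryption analogy made elsewhere in the thread (toy strings, my own construction): the stored blob alone is noise, and the user-supplied part carries essential information.

    key  = b"user prompt 0042"     # the part the user must supply
    work = b"secret artwork!!"     # the original (same length, for XOR)

    stored = bytes(a ^ b for a, b in zip(work, key))    # what sits "in the model"
    print(stored)                                       # unreadable on its own

    recovered = bytes(a ^ b for a, b in zip(stored, key))
    print(recovered)               # b'secret artwork!!' only with the key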
The fact that Stability is now creating an opt-out for artists, after lifting and training on copyrighted / watermarked art without permission and creating a paid SaaS solution out of it, shows not only that they willfully trampled on the copyright of artists, but that they have set themselves up with a weak explanation on the 'fair use, transformative' argument, since the LAION-5B model contains the copyrighted images, which the model can output as verbatim / highly similar digital art.
The input from 'real experts like Bengio, LeCun' adds little to no value to the case, as digital art generated by a non-human is uncopyrightable and is public domain by default. [1] What sets the precedent is whether using copyrighted content in a training set without permission from the author, then outputting verbatim or highly similar derived works and commercializing them, is fair use and not infringing.
If Stability drew this line for musicians, then that should be the line drawn for digital art and code, and Copilot, Midjourney, SD and DALL-E should all be trained on public domain content, or on content used with the permission of its author or under licenses that allow AI training.
So far, the 'real' grifters are Stability AI, OpenAI and Midjourney.
[0] https://techcrunch.com/2022/10/07/ai-music-generator-dance-d...
[1] https://www.copyright.gov/rulings-filings/review-board/docs/...
Me having bought the magazines also has nothing to do with anything. It would apply equally if they were gifted, free, or stolen.
As a thought experiment, imagine a variant of something like SD was used for music generation rather than images. It was trained on all the music on Spotify and is marketed as a paid tool for producers and artists. If the model reproduces specific sounds from certain songs, e.g. a specific beat, hook, or melody, it would seem pretty straightforward that the generated content was derivative, even though only a feature of it was precisely reproduced. I could be wrong, but as far as I am aware you need to get permission to use samples. Even if the content is not published, those sounds are being sold by the company as inspiration, and therefore that should violate copyright. The training data is paramount, because if you trained the model on material you generated yourself or on material with an appropriate CC license, the resulting work would not violate copyright, or you could at least argue independent creation.
In the feature space of images and art, SD is doing something very similar, so I can see the argument that it violates copyright even without reproducing the whole training data.
Overall, I think we will ultimately need to decide how we want these technologies used, what restrictions should be on the training data, etc., and then create new laws specifically for the new technology, rather than trying to shoehorn it into existing copyright law.
They are going to have to show that the model copies ALL source images with perfect retention, and they are 100 percent full of shit if they think they can demonstrate that. What you may find is that some models out there are heavily biased on source images and can produce some outputs that are too similar to original works, in that case, there may be an issue.
Good luck with that - DeviantArt doesn't produce work, just hosts it, and what it hosts is mostly indistinguishable from human input; and Midjourney use their own engine trained on different data. Advertising the fact you've not done adequate due diligence before publicly announcing intent to sue doesn't (imho) give me the best impression of your lawyering chops.
A trained model holds relationships between patterns/colours in artwork and their affinity to the other images in the model (ignoring the English tagging of images data within this model for a minute). To this degree, it holds relationships between millions of images and the degree of similarities (i.e. affinity weighting of the patterns within them) in a big blob (the model).
When you ask for a dragon by $ARTIST, it will find within its model an area of data with high affinity to a dragon and to $ARTIST. What has been glossed over in the discussion here is that there are millions of other bits of related images - with lower affinity - from lots of unrelated artwork, which give the generated image uniqueness. Because of this, you can never recreate 1:1 the original image; it's always diluted by the relationships from the huge mass of other training data. E.g. a colour from a dinosaur exhibit in a museum may also be incorporated because it looks like a dragon, along with many other minor traits from millions of other images, chosen at random (and by other seed values).
Another interesting point is that a picture of a smiling dark haired woman would have high affinity with Mona Lisa, but when you prompt for Mona Lisa you may get parts of that back and not the patterns from the Mona Lisa*, even though it looks the same. That arguably (not getting Mona Lisa) is no longer the copyrighted data.
* Nb. this is a contrived example, since in SD the real Mona Lisa weightings will outnumber the individual dark-haired woman's many times over; however, this concept might be (more) appropriate for minor artists whose work is not popular enough to form a significantly large amount of weighting in the training data.
But it's virtually impossible for these models to make an exact replica – a photocopy – of an existing painting, because that would probably break some laws of information theory. It's not a lossless compression engine. Paintings like "Girl with a Pearl Earring" appear so frequently in the datasets that the models tend to overfit on them – which is actually not something you want when designing a model; it tends to create issues for you. But that's why a painting like that can be simulated somewhat accurately. Even then, it's never going to be 100%.
It's like the compression that occurs when I say "Mona Lisa" and you read it, and can know many aspects of that painting.
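The dilution point from upthread can be caricatured with a linear toy. Stable Diffusion does not literally average training images, so treat this numpy sketch purely as intuition for why a single upweighted source still drowns in the mass of low-affinity contributions:

    import numpy as np

    rng = np.random.default_rng(0)

    n, dim = 10_000, 256
    sources = rng.standard_normal((n, dim))   # stand-ins for training images
    weights = np.full(n, 1.0)
    weights[0] = 20.0                         # one "high affinity" source
    weights /= weights.sum()

    out = weights @ sources                   # crude stand-in for a generated image
    print(np.corrcoef(out, sources[0])[0, 1]) # small (~0.2): even the favoured
                                              # source contributes only a trace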
Legislation is driven by people who are, on aggregate, not autistic. So it's entirely appropriate to presume that a person not understanding how that process works is indeed autistic, especially if they suggest machines are subjects of law by analogy with human beings.
It's not that autists are bad people; they are just outliers in the political spectrum, as you can see from the complete disconnect of up-voted AI-related comments on Hacker News, where autistic engineers are clearly over-represented, versus just about any venue where other professionals, such as painters or musicians, congregate. Just try to suggest to them that a corporation has the right to use their work for free and profit from it while leaving them unemployed, because the algorithm the corporation uses to exploit them is in some abstract sense similar to how their brain works. That position is so far out on the spectrum that presuming a personality peculiarity of the emitter is the absolutely most charitable interpretation.
So while it would be possible to create a "Public Diffusion" that took the Stable Diffusion refinements of the ML techniques and created a model built solely out of public-domain art, as it stands, "Stable Diffusion" includes by definition the model that is built from the copyrighted works in question.
Disclaimer: Not an ML expert
I don't deny that this might be a worthy pursuit or the right direction to be looking, or that that's the reason some people are in it. I just question the motivations of a private company valued at $10b which is going to have a lot more control over the direction of the industry than those passionate individuals.