The law doesn't recognize a mathematical computer transformation as creating a new work with original copyright.
If you give me an image, and I encrypt it with a randomly generated password, and then don't write down the password anywhere, the resulting file will be indistinguishable from random noise. No one can possibly derive the original image from it. But it's still copyrighted by the original artist as long as they can show "This started as my image, and a machine made a rote mathematical transformation to it", because machines making rote mathematical transformations cannot create new copyright.
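A minimal sketch of the thought experiment above, assuming a one-time pad (XOR with a fresh random key) as a stand-in for "encrypt it with a randomly generated password". The names `xor_bytes` and `encrypt_and_forget` are hypothetical, and the image is just placeholder bytes.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_and_forget(image: bytes) -> bytes:
    """Encrypt with a fresh random key that is never stored or returned."""
    key = secrets.token_bytes(len(image))  # assumption: one-time pad, not a password cipher
    return xor_bytes(image, key)  # the key goes out of scope here and is lost


image = b"stand-in bytes for a copyrighted image"
ciphertext = encrypt_and_forget(image)
print(len(ciphertext) == len(image))
```

The output has the same length as the input and, without the key, carries no recoverable information about it. The legal claim in the comment rests on the provenance of the file, not on its bytes.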
The argument for stable diffusion would be that even if you cannot point to any image, since only algorithmic changes happened to the inputs, without any human creativity, the output is a derived work which does not have its own unique copyright.
Do you have evidence that this is actually what the courts have decided with respect to NNs?
The idea, in stage one, was to split a file into chunks and XOR those with other random chunks (equivalent to a one-time pad). Those chunks, as well as the created random chunks, then got shared around the network, with nobody hosting both parts of a pair.
The next stage is that future files inserted into the network would not create new random chunks but randomly use existing chunks already in the network. The result is a distributed store of chunks each of which is provably capable of generating any other chunk given the right pair. The correlations are then stored in a separate manifest.
It feels like such a system is some kind of entropy coding system. In the limit the manifest becomes the same size as the original data. At the same time though, you can prove that any given chunk contains no information. I love thinking about how the philosophy of information theory interacts with the law.
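The two stages described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual system: the chunk size, the in-memory `store` list, the manifest layout, and every function name here are hypothetical.

```python
import secrets

CHUNK_SIZE = 16  # hypothetical; a real network would use far larger chunks


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_chunks(data: bytes) -> list[bytes]:
    """Zero-pad data to a multiple of CHUNK_SIZE and split it into chunks."""
    padded = data + b"\x00" * (-len(data) % CHUNK_SIZE)
    return [padded[i:i + CHUNK_SIZE] for i in range(0, len(padded), CHUNK_SIZE)]


def insert_file(store: list[bytes], data: bytes, reuse: bool) -> dict:
    """Add a file to the shared store and return its manifest.

    reuse=False is stage one: a new random pad chunk per data chunk.
    reuse=True is stage two: an existing chunk is picked at random as the pad.
    """
    pairs = []
    for chunk in split_chunks(data):
        if reuse and store:
            pad_index = secrets.randbelow(len(store))
        else:
            store.append(secrets.token_bytes(CHUNK_SIZE))
            pad_index = len(store) - 1
        store.append(xor_bytes(chunk, store[pad_index]))
        pairs.append((pad_index, len(store) - 1))
    return {"length": len(data), "pairs": pairs}


def recover_file(store: list[bytes], manifest: dict) -> bytes:
    """XOR each listed pair of chunks back together and strip the padding."""
    out = b"".join(xor_bytes(store[i], store[j]) for i, j in manifest["pairs"])
    return out[:manifest["length"]]


store: list[bytes] = []
m1 = insert_file(store, b"first file contents", reuse=False)
m2 = insert_file(store, b"second file", reuse=True)
print(recover_file(store, m1), recover_file(store, m2))
```

In this sketch each stored chunk is either a random pad or data XORed with such a pad, and the manifest holds the pairings needed to get a file back out.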
Yes, on a technical level, those chunks are random data. On the legal side, however, those chunks are illegal copyright infringement because that is their intent, and there is a process that allows the intent to happen.
I can't really say it better than this post does, so I highly recommend reading it: https://ansuz.sooke.bc.ca/entry/23
If you take a bad paper shredder that, say, shreds a photo into large re-usable chunks, run the photo through that, and tape the large re-usable chunks back together, you have a photo with the same copyright as before.
If you tape them together in a new creative arrangement, you might apply enough human creativity to create a new copyrighted work.
If you grind the original to dust, and then have a mechanical process somehow mechanically re-arrange the pieces back into an image without applying creativity, then the new mechanically created arrangement would, I suspect, be a derived work.
Of course, such a process doesn't really exist, so for the "shapeless dust" question, it's pretty pointless to think about. However, stable diffusion is grinding images down into neural networks, and then, without a significant amount of human creativity involved, creating images reconstituted from that dust.
Perhaps the prompt counts as human creativity, but that seems fairly unlikely. After all, you can give it a prompt of 'dog' and get reconstituted dust, that hardly seems like it clears a bar.
Perhaps the training process somehow injected human creativity, but that also seems difficult to argue, it's an algorithm.
But that's not what people use Stable Diffusion for: people use Stable Diffusion to create new works which don't previously exist as that combination of colors/bytes/etc.
Artists don't have copyright on their artistic style, process, technique or subject matter - only on the actual artwork they output or reasonable similarities. But "reasonable similarity" covers exactly that intent - an intent to simply recreate the original.
People keep talking about copyright, but no one's trying to rip off actual existing work. They're doing things like "Pixar style, ultra detailed gundam in a flower garden". So you're rocking up in court saying "the intent is to steal my client's work" - but where is the client's line of gundam horticultural representations? It doesn't exist.
You can't copyright artistic style, only actual output. Artists are fearful that the ability to emulate style means commissions will dry up (this is true) but you've never had copyright protection over style, and it's not even remotely clear how that would work (and, IMO, it would be catastrophic if it was - there's exactly one group of megacorps who would now be in a position to sue everyone because try defining "style" in a legal sense).
I could always show that there exists some function f that produces said byte sequence when applied to my copyrighted material.
Can I sue Microsoft because the entire Windows 11 codebase is just one "rote mathematical transformation" away from the essay I wrote in elementary school?
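To make the "there exists some function f" point concrete, here is a small sketch, assuming XOR as the "rote mathematical transformation" and two placeholder byte strings padded to the same length. `make_f` is a hypothetical helper, and the strings are obviously not real data.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_f(mine: bytes, theirs: bytes):
    """Build a function f such that f(mine) == theirs."""
    key = xor_bytes(mine, theirs)  # the key is computed from `theirs`
    return lambda data: xor_bytes(data, key)


# Placeholder inputs, padded to a common length for the XOR.
essay = b"my elementary school essay.".ljust(32)
codebase = b"pretend Windows 11 sources!".ljust(32)

f = make_f(essay, codebase)
print(f(essay) == codebase)
```

Such an f always exists, but it can only be constructed by someone who already has the target bytes: all of the target's information sits in the key, not in the essay.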
Sure, the Windows 11 codebase is presumably in pi somewhere if you go far enough. Sure, pi is a non-copyrightable fact of nature. That doesn't mean the Windows codebase is _actually_ in pi legally, just that it technically is.
The law does not care about weird gotchas like you describe.
I recommended reading this to a sibling comment, and I'll recommend it to you too: https://ansuz.sooke.bc.ca/entry/23
Yes, copyright law has obviously irrational results if you start trying to look at it only from a technical "but information is just 1s and 0s, you can't copyright 1s and 0s" perspective. The law does not care.
Which is why we have to think about the high level legal process that stable diffusion does, not so much the actual small technical details like "can you recover images from the neural net" or such.
Copyright infringement can happen without intending to infringe copyright.
Various music copyright cases start with "Artist X sampled some music from artist Y, thinking it was transformative and fair use". The courts, in some of these cases, have found something the artist _intended_ to be transformative to in fact be copyright infringement.
> You can't copyright artistic style, only actual output
You copyright outputs, and then works that are derived from those outputs are potentially covered by that copyright. Stable Diffusion's outputs are clearly derived from the training set, basically by definition of what neural networks are.
It's less clear they're definitely copyright-infringing derivative works, but it's far less clearcut than how you're phrasing it.
Obviously some fairly reputable organisations and individuals are moderately confident that there isn't, otherwise they wouldn't have done it.
If my parrot recites your song after hearing my alleged infringement, and I record its performance and post it on YouTube, is that infringement?
Last one: if I use the song from your website to train a song-recognition AI, is that infringement?
"Training a neural network" is an implementation detail. These companies accessed millions of copyrighted works, encoded them such that the copyright was unenforceable, and now sell the output of that transformation.
If my parrot recites your song after hearing it, and I record that and upload it to YouTube, I've violated your copyright.
If a big company does the same (runs the song through a non-human process, then sells the output), I believe they're blatantly infringing copyright.
I think the post you’re replying to was confused about the quote above. The person who’s claiming copyright by showing the claimed file started as their own image has to show that it actually started from their own image, and not just that the file could have been derived from the image. Copyright cares about both the works and the provenance of works.
Stable Diffusion couldn’t be flagged on these grounds if a person used a prompt that was their own, nor could they even be sued if they ran an image through it, as long as there is no plausible case that the output was made from a copyrighted work. The only thing I imagine a case working on is the actual training process of the algorithm rather than the algorithm itself, for that exact reason.
Maybe it's a mass delusion but that feels like a stretch.
Also your wording makes this sound entirely like a sinister conspiracy or cash grab. Many people think this is simply a worthy pursuit and the right direction to be looking at the moment.
I don't deny that this might be a worthy pursuit or the right direction to be looking, or that that's the reason some people are in it. I just question the motivations of a private company valued at $10b which is going to have a lot more control over the direction of the industry than those passionate individuals.