zlacker

[parent] [thread] 34 comments
1. TheDon+(OP)[view] [source] 2023-01-14 07:36:40
It doesn't matter if they exist as exact copies in my opinion.

The law doesn't recognize a mathematical computer transformation as creating a new work with original copyright.

If you give me an image, and I encrypt it with a randomly generated password, and then don't write down the password anywhere, the resulting file will be indistinguishable from random noise. No one can possibly derive the original image from it. But, it's still copyrighted by the original artist as long as they can show "This started as my image, and a machine made a rote mathematical transformation to it", because machines making rote mathematical transformations cannot create new copyright.
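
A minimal sketch of that transformation, assuming Python and the third-party "cryptography" package (the random AES-CTR key stands in for the never-written-down password, and the file name is a placeholder):

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    with open("original_image.png", "rb") as f:      # placeholder path
        image = f.read()

    key, nonce = os.urandom(32), os.urandom(16)      # generated, never written down
    encryptor = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    noise = encryptor.update(image) + encryptor.finalize()

    # Without `key`, `noise` is computationally indistinguishable from random
    # bytes, yet it started life as the artist's image, changed only by a rote
    # mathematical transformation.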

The analogous argument for Stable Diffusion would be that even if you cannot point to any single source image, the output is a derived work which does not have its own unique copyright, since only algorithmic changes, without any human creativity, were applied to the inputs.

replies(7): >>dymk+O >>limite+U >>mbgerr+r2 >>hgomer+s4 >>Last5D+Kd >>street+Gy >>michae+3S1
2. dymk+O[view] [source] 2023-01-14 07:48:55
>>TheDon+(OP)
> But, it's still copyrighted by the original artist as long as they can show "This started as my image, and a machine made a rote mathematical transformation to it", because machines making rote mathematical transformations cannot create new copyright.

Do you have evidence that this is actually what the courts have decided with respect to NNs?

3. limite+U[view] [source] 2023-01-14 07:49:38
>>TheDon+(OP)
So what happens if you put a painting into a mechanical grinder? Is the shapeless pile of dust still a copyrighted work? I don't think so.
replies(2): >>jimnot+y1 >>TheDon+06
4. jimnot+y1[view] [source] [discussion] 2023-01-14 07:59:05
>>limite+U
The owner of that Banksy painting certainly thinks so.
replies(1): >>limite+02
5. limite+02[view] [source] [discussion] 2023-01-14 08:03:45
>>jimnot+y1
The painting that has several cuts in about 25% of the surface area? I don't think that constitutes a shapeless pile of dust.
replies(1): >>jimnot+e4
6. mbgerr+r2[view] [source] 2023-01-14 08:08:18
>>TheDon+(OP)
No. Humans decided to include artwork that they did not have any right to use as part of a training data set. This is about holding humans accountable for their actions.
replies(2): >>dymk+v3 >>andyba+e8
7. dymk+v3[view] [source] [discussion] 2023-01-14 08:18:55
>>mbgerr+r2
“Did they have a right to use publicly posted images” is up for the courts to decide
replies(1): >>cudgy+Vf
8. jimnot+e4[view] [source] [discussion] 2023-01-14 08:25:25
>>limite+02
So what % does?
replies(1): >>limite+PD1
9. hgomer+s4[view] [source] 2023-01-14 08:27:45
>>TheDon+(OP)
Some years ago I had an idea for a method of file sharing with strong plausible deniability for the sharer.

The idea, in stage one, was to split a file into chunks and XOR each of those with a freshly generated random chunk (equivalent to a one-time pad). The XORed chunks, as well as the generated random chunks, then got shared around the network, with nobody hosting both parts of a pair.

The next stage is that future files inserted into the network would not create new random chunks but would randomly reuse chunks already in the network. The result is a distributed store of chunks, each of which is provably capable of generating any other chunk given the right partner. The pairings are then stored in a separate manifest.

It feels like such a system is some kind of entropy coding system. In the limit the manifest becomes the same size as the original data. At the same time though, you can prove that any given chunk contains no information. I love thinking about how the philosophy of information theory interacts with the law.
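
A rough sketch of the stage-one split described above, assuming Python, a fixed chunk size, and zero-padding of the final block (all names are placeholders):

    import os

    CHUNK = 64 * 1024  # arbitrary chunk size for the sketch

    def split_file(path):
        """Return (pad_chunks, xored_chunks); neither list alone reveals the file."""
        with open(path, "rb") as f:
            data = f.read()
        pads, xored = [], []
        for i in range(0, len(data), CHUNK):
            block = data[i:i + CHUNK].ljust(CHUNK, b"\0")   # zero-pad the last block
            pad = os.urandom(CHUNK)                         # one-time-pad chunk
            pads.append(pad)
            xored.append(bytes(a ^ b for a, b in zip(block, pad)))
        return pads, xored

    # Each chunk in either list is uniformly random on its own; only the manifest
    # pairing a pad with its partner lets anyone XOR the original block back out.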

replies(1): >>TheDon+75
10. TheDon+75[view] [source] [discussion] 2023-01-14 08:35:09
>>hgomer+s4
I think this touches on the core mismatch between the legal perspective and technical perspective.

Yes, on a technical level, those chunks are random data. On the legal side, however, those chunks are copyright infringement, because that is their intended use and there is a process that allows that intent to be realized.

I can't really say it better than this post does, so I highly recommend reading it: https://ansuz.sooke.bc.ca/entry/23

replies(2): >>XorNot+k6 >>hgomer+JW2
11. TheDon+06[view] [source] [discussion] 2023-01-14 08:43:56
>>limite+U
Maybe?

If you take a bad paper shredder that, say, shreds a photo into large re-usable pieces, run the photo through it, and tape those pieces back together, you have a photo with the same copyright as before.

If you tape them together in a new creative arrangement, you might apply enough human creativity to create a new copyrighted work.

If you grind the original to dust, and then have a mechanical process somehow mechanically re-arrange the pieces back into an image without applying creativity, then the new mechanically created arrangement would, I suspect, be a derived work.

Of course, such a process doesn't really exist, so for the "shapeless dust" question, it's pretty pointless to think about. However, Stable Diffusion is grinding images down into neural networks, and then, without a significant amount of human creativity involved, creating images reconstituted from that dust.

Perhaps the prompt counts as human creativity, but that seems fairly unlikely. After all, you can give it a prompt of 'dog' and get reconstituted dust; that hardly seems like it clears the bar.

Perhaps the training process somehow injected human creativity, but that also seems difficult to argue; it's an algorithm.

12. XorNot+k6[view] [source] [discussion] 2023-01-14 08:47:25
>>TheDon+75
Except you've got a heckin' problem with Stable Diffusion, because you have to argue that the intent is to steal the copyright by copying already existing artworks.

But that's not what people use Stable Diffusion for: people use Stable Diffusion to create new works which don't previously exist as that combination of colors/bytes/etc.

Artists don't have copyright on their artistic style, process, technique or subject matter - only on the actual artwork they output or reasonable similarities. But "reasonable similarity" covers exactly that intent - an intent to simply recreate the original.

People keep talking about copyright, but no one's trying to rip off actual existing work. They're doing things like "Pixar style, ultra detailed gundam in a flower garden". So you're rocking up in court saying "the intent is to steal my client's work" - but where is the client's line of gundam horticultural representations? It doesn't exist.

You can't copyright artistic style, only actual output. Artists are fearful that the ability to emulate style means commissions will dry up (this is true), but you've never had copyright protection over style, and it's not even remotely clear how that would work (and, IMO, it would be catastrophic if it did - there's exactly one group of megacorps who would then be in a position to sue everyone; just try defining "style" in a legal sense).

replies(1): >>TheDon+Kk
13. andyba+e8[view] [source] [discussion] 2023-01-14 09:06:44
>>mbgerr+r2
"right" in the informal sense or in some legal sense?
replies(1): >>mbgerr+nk
14. Last5D+Kd[view] [source] 2023-01-14 10:12:48
>>TheDon+(OP)
This surely can't be the case, right? If it were, what would stop me from taking any possible byte sequence and claiming copyright over it?

I could always show that there exists some function f that produces said byte sequence when applied to my copyrighted material.

Can I sue Microsoft because the entire Windows 11 codebase is just one "rote mathematical transformation" away from the essay I wrote in elementary school?
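
Such an f does trivially exist for any pair of byte strings, which is rather the point. A sketch, assuming Python and placeholder byte strings standing in for the essay and the codebase:

    def xor(a: bytes, b: bytes) -> bytes:
        """Byte-wise XOR, zero-padding the shorter input."""
        n = max(len(a), len(b))
        return bytes(x ^ y for x, y in zip(a.ljust(n, b"\0"), b.ljust(n, b"\0")))

    essay = b"my elementary school essay..."          # placeholder bytes
    windows11 = b"<the entire Windows 11 codebase>"   # placeholder bytes

    pad = xor(essay, windows11)   # one "rote mathematical transformation" away
    assert xor(essay, pad) == windows11

    # f(x) = xor(x, pad) maps the essay onto any byte sequence whatsoever, which
    # is why the mere existence of such a transformation can't settle provenance.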

replies(1): >>TheDon+Ki
15. cudgy+Vf[view] [source] [discussion] 2023-01-14 10:36:12
>>dymk+v3
Pretty sure that's already decided: publicly played movies and music are not free to reuse just because they were made public. Why would the same not apply to publicly posted images?
replies(2): >>dymk+iG >>UncleE+nN
16. TheDon+Ki[view] [source] [discussion] 2023-01-14 11:11:27
>>Last5D+Kd
The law doesn't care about technical tricks. It cares about how you got the bytes and what humans think of them.

Sure, the Windows 11 codebase is in pi somewhere if you go far enough. Sure, pi is a non-copyrightable fact of nature. That doesn't mean the Windows codebase is _actually_ in pi legally, just that it technically is.

The law does not care about weird gotchas like you describe.

I recommended reading this to a sibling comment, and I'll recommend it to you too: https://ansuz.sooke.bc.ca/entry/23

Yes, copyright law has obviously irrational results if you start trying to look at it only from a technical "but information is just 1s and 0s, you can't copyright 1s and 0s" perspective. The law does not care.

Which is why we have to think about the high-level legal process that Stable Diffusion performs, not so much the small technical details like "can you recover images from the neural net" and such.

replies(2): >>derang+Fb1 >>93po+WD1
17. mbgerr+nk[view] [source] [discussion] 2023-01-14 11:27:28
>>andyba+e8
Legal
replies(1): >>andyba+5x
18. TheDon+Kk[view] [source] [discussion] 2023-01-14 11:30:08
>>XorNot+k6
> because you have to argue that the intent is to steal the copyright by copying already existing artworks

Copyright infringement can happen without intending to infringe copyright.

Various music copyright cases start with "Artist X sampled some music from artist Y, thinking it was transformative and fair use". The courts, in some of these cases, have found something the artist _intended_ to be transformative to in fact be copyright infringement.

> You can't copyright artistic style, only actual output

You copyright outputs, and then works that are derived from those outputs are potentially covered by that copyright. Stable Diffusion's outputs are clearly derived from the training set, basically by definition of what neural networks are.

It's less clear that they're definitely copyright-infringing derivative works, but it's far less clear-cut than you're making it sound.

19. andyba+5x[view] [source] [discussion] 2023-01-14 13:31:01
>>mbgerr+nk
Can you clarify? My understanding is that it's very unclear whether there are any legal issues (in most jurisdictions) in scraping for training.

Obviously some fairly reputable organisations and individuals are moderately confident that there aren't, otherwise they wouldn't have done it.

replies(1): >>Xelyne+4T
20. street+Gy[view] [source] 2023-01-14 13:47:21
>>TheDon+(OP)
The only problem is that computers (i.e. most computers) cannot really generate random numbers.
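
A small illustration of the distinction, assuming Python 3.9+: a seeded PRNG is fully deterministic, while secrets/os.urandom draw on whatever entropy the operating system can gather.

    import random
    import secrets

    # A seeded PRNG is fully deterministic: same seed, same "random" bytes.
    assert random.Random(42).randbytes(8) == random.Random(42).randbytes(8)

    # secrets.token_bytes() / os.urandom() draw on the OS entropy pool
    # (interrupt timings, hardware RNGs, ...), which is as close to "really
    # random" as most machines get.
    pad = secrets.token_bytes(32)
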
21. dymk+iG[view] [source] [discussion] 2023-01-14 14:58:21
>>cudgy+Vf
What court case set the precedent that you can't train a neural network on publicly posted movies and audio?
replies(1): >>Xelyne+6S
22. UncleE+nN[view] [source] [discussion] 2023-01-14 15:51:34
>>cudgy+Vf
If you post a song on your website and I listen to it, am I violating your copyright?

If my parrot recites your song after hearing my alleged infringement, and I record its performance and post it on YouTube, is that infringement?

Last one: if I use the song from your website to train a song recognition AI, is that infringement?

replies(1): >>Xelyne+pS
23. Xelyne+6S[view] [source] [discussion] 2023-01-14 16:28:52
>>dymk+iG
I'd assume the precedent would be about sharing encoded data, which would be covered by the BitTorrent cases.

"Training a neural network" is an implementation detail. These companies accessed millions of copyrighted works, encoded them such that the copyright was unenforcable, then sell the output of that transformation.

replies(1): >>dymk+bT
24. Xelyne+pS[view] [source] [discussion] 2023-01-14 16:31:55
>>UncleE+nN
If I host a song I don't have a license to on my website, I'm violating copyright by distributing it to you when you listen on my site.

If my parrot recites your song after hearing it, and I record that and upload it to YouTube, I've violated your copyright.

If a big company does the same (runs the song through a non-human process, then sells the output), I believe they're blatantly infringing copyright.

replies(2): >>dymk+vT >>UncleE+t32
25. Xelyne+4T[view] [source] [discussion] 2023-01-14 16:36:04
>>andyba+5x
"It's very unclear" in legal cases is synonymous with "it hasn't been challenged in court yet". You say they're moderately confident because they're fairly reputable, but remember that Madoff was a "reputable business man" for the 20 years he ran a ponzi scheme. They don't have to be confident in the legality to do it, they just had to be confident in the potential profit. With openai being values at $10B by Microsoft, I'd say they've successfully muddied the legal waters long enough to cash out.
replies(1): >>andyba+6c1
26. dymk+bT[view] [source] [discussion] 2023-01-14 16:37:28
>>Xelyne+6S
Not being able to reproduce the inputs (each image contributes on the order of single bytes to the neural network) is relevant. Torrent files are a means to exactly reproduce their inputs. Diffusion models are trained not to reproduce their inputs, nor do they have the means to.
27. dymk+vT[view] [source] [discussion] 2023-01-14 16:40:00
>>Xelyne+pS
Big Company is not distributing the input images by distributing the neural network. There is no way to extract even a single input image out of a diffusion model.
28. derang+Fb1[view] [source] [discussion] 2023-01-14 18:37:46
>>TheDon+Ki
>But, it's still copyrighted by the original artist as long as they can show "This started as my image, and a machine made a rote mathematical transformation to it"

I think the post you're replying to was confused about the quote above. The person claiming copyright by showing "this started as my image" has to show that the file actually started from their image, and not just that it could have been derived from it. Copyright cares about both the works themselves and their provenance.

Stable Diffusion couldn't be flagged on this basis if a person used a prompt that was their own, nor could they be sued if they ran an image through it, as long as there is no plausible claim that the result was derived from a copyrighted work. For that exact reason, the only thing I can imagine a case targeting is the actual training process rather than the algorithm itself.

29. andyba+6c1[view] [source] [discussion] 2023-01-14 18:40:21
>>Xelyne+4T
That's one company. There are dozens, if not hundreds, of companies, research groups and individuals working under the same assumption.

Maybe it's a mass delusion but that feels like a stretch.

Also your wording makes this sound entirely like a sinister conspiracy or cash grab. Many people think this is simply a worthy pursuit and the right direction to be looking at the moment.

replies(1): >>Xelyne+Iwl
30. limite+PD1[view] [source] [discussion] 2023-01-14 21:43:25
>>jimnot+e4
Not a lawyer, but practically speaking copyright is lost when an item ceases to “exist” itself and can’t be restored. If you cut a painting in half - it’s absolutely still a copyrighted item. If you atomize it and don’t have technology to restore it, then copyright is meaningless. What item exactly is copyrighted?
31. 93po+WD1[view] [source] [discussion] 2023-01-14 21:43:55
>>TheDon+Ki
I find this hard to believe. If I took the famous pointillism painting "A Sunday Afternoon on the Island of La Grande Jatte" (the one-color-per-dot painting of a park) and I rearranged every color point based on an algorithm to create something that looks nothing like the original (and likely just looks like a jumbled mess), surely the copyright on the existing painting (which I doubt exists anymore) wouldn't prevent me from copyrighting my "new" work.
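
Roughly the kind of mechanical rearrangement being described, as a sketch (assuming Python with Pillow and NumPy, and a placeholder scan of the painting):

    import numpy as np
    from PIL import Image

    img = np.asarray(Image.open("la_grande_jatte_scan.png").convert("RGB"))  # placeholder file
    h, w, c = img.shape

    rng = np.random.default_rng(seed=0)    # fixed seed: pure algorithm, no human choices
    order = rng.permutation(h * w)         # shuffle every "colour point"
    jumbled = img.reshape(h * w, c)[order].reshape(h, w, c)

    Image.fromarray(jumbled).save("jumbled_mess.png")
    # Every original dot is still present, just algorithmically rearranged,
    # which is exactly the edge case the copyright question turns on.
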
32. michae+3S1[view] [source] 2023-01-14 23:56:16
>>TheDon+(OP)
I understood it less as transforming the images and more as deriving math formulas from the patterns in the images - closer to creating a bar graph to understand data than to making a copy.
33. UncleE+t32[view] [source] [discussion] 2023-01-15 02:00:38
>>Xelyne+pS
I should have specified that the OP has legal rights to the song and that the end user listening was under the same granted/implied license as a program doing the web harvesting - my bad.
34. hgomer+JW2[view] [source] [discussion] 2023-01-15 14:05:15
>>TheDon+75
That's an interesting essay and I agree it goes to the heart of the question. There's clearly an interesting question, even in the colour domain: is someone infringing copyright if the data they themselves are sharing has a perfectly legitimate colour that is the basis of their sharing? That's the plausible deniability bit that's so important: "Yes your honour, I did share that chunk of random data, but I did so because it's part of this totally legitimately coloured file I was wanting to share. I had no idea that someone added a new colour to the block. Obviously, I'm only sharing the original colour block; prove otherwise". At some point, the court has to decide the colour of the block from the perspective of the accused, which allows a basis for deniability.
35. Xelyne+Iwl[view] [source] [discussion] 2023-01-20 22:20:56
>>andyba+6c1
If I make it sound like a sinister conspiracy or cash grab, that's because that's what it is, as long as it's a private entity and not a public endeavor.

I don't deny that this might be a worthy pursuit or the right direction to be looking in, or that that's the reason some people are in it. I just question the motivations of a private company valued at $10B, which is going to have a lot more control over the direction of the industry than those passionate individuals.

[go to top]