zlacker

[parent] [thread] 37 comments
1. manhol+(OP)[view] [source] 2023-01-14 08:08:51
But they are not original works; they are wholly derived from the training data set. Take that data set away and the algorithm is unable to produce a single original pixel.

The fact that the derivation involves millions of works as opposed to a single one is immaterial for the copyright issue.

replies(8): >>realus+D1 >>forgot+12 >>basch+eE >>willia+8X >>smegge+491 >>rule72+JN1 >>bobbru+3V1 >>rsuelz+212
2. realus+D1[view] [source] 2023-01-14 08:24:42
>>manhol+(OP)
The training data set is indeed mandatory, but that doesn't make the resulting model a derivative in itself. In fact, the training is specifically designed to avoid producing derivatives.
replies(1): >>IncRnd+L2
3. forgot+12[view] [source] 2023-01-14 08:28:07
>>manhol+(OP)
If I were to take the first word from a thousand books and use them to write my own, would I be guilty of copyright violation?
replies(1): >>yazadd+t4
◧◩
4. IncRnd+L2[view] [source] [discussion] 2023-01-14 08:36:12
>>realus+D1
Go to stablediffusionweb.com and enter "a person like biden" into the box. You will see a picture exactly like President Biden. That picture will have been derived from the images of Joe Biden it was trained on. That cannot be in dispute.
replies(3): >>realus+w3 >>willia+A11 >>rsuelz+h12
◧◩◪
5. realus+w3[view] [source] [discussion] 2023-01-14 08:44:00
>>IncRnd+L2
Just because it generates an image that looks like Biden still does not make it a derivative either.

You can draw Biden yourself if you're talented, and it's not considered a derivative of anything.

replies(3): >>IncRnd+84 >>yazadd+l5 >>bluefi+nP
◧◩◪◨
6. IncRnd+84[view] [source] [discussion] 2023-01-14 08:51:13
>>realus+w3
There is no need for rhetorical games. The actual issue is that Stable Diffusion does create derivatives of copyrighted works. In some cases the produced images contain pixel-level details from the originals. [1]

[1] https://arxiv.org/pdf/2212.03860.pdf

replies(1): >>realus+x4
◧◩
7. yazadd+t4[view] [source] [discussion] 2023-01-14 08:54:29
>>forgot+12
Words have a special carve-out in copyright law and precedent. So much so that a whole other category of Intellectual Property, called Trademarks, exists to protect special words.

But back to your point, “if you were to take the first sentence from a thousand books and use it in your own book”: then yes, based on my understanding of copyright (I am not a lawyer), you would be in violation of IP laws.

replies(1): >>basch+PE
◧◩◪◨⬒
8. realus+x4[view] [source] [discussion] 2023-01-14 08:55:14
>>IncRnd+84
> The actual issue is that Stable Diffusion does create derivatives of copyrighted works.

Nothing points to that. In fact, even on this website they had to lie about how Stable Diffusion actually works, maybe a sign that their argument isn't really solid enough.

> [1] https://arxiv.org/pdf/2212.03860.pdf

You realize those are considered defects of the model, right? Sure, this model isn't perfect and will be improved.

replies(1): >>IncRnd+95
◧◩◪◨⬒⬓
9. IncRnd+95[view] [source] [discussion] 2023-01-14 09:01:10
>>realus+x4
> You realize those are considered defects of the model, right? Sure, this model isn't perfect.

You can call copying of the input a defect, but why are you simultaneously arguing that it doesn't occur?

replies(1): >>realus+T5
◧◩◪◨
10. yazadd+l5[view] [source] [discussion] 2023-01-14 09:03:30
>>realus+w3
Correction: if you draw a copy of Biden and it happens to overlap enough with someone’s copyrighted drawing or image of Biden, you did create a derivative (whether you knew it or not).
replies(1): >>realus+28
◧◩◪◨⬒⬓⬔
11. realus+T5[view] [source] [discussion] 2023-01-14 09:08:35
>>IncRnd+95
I wouldn't call these defects copying either, but overfitting artifacts. Usually they are there because there's a massive amount of near-identical images in the training set.

It's both undesirable and not relevant to this kind of lawsuit.
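
A rough sketch of what I mean, assuming the Pillow and imagehash libraries and a hypothetical folder of training images: near-duplicates show up as tiny Hamming distances between perceptual hashes, and that is exactly what deduplication passes try to filter out before training.

    from pathlib import Path
    from PIL import Image
    import imagehash

    # Perceptual hash of every image in a (hypothetical) training folder.
    hashes = {p: imagehash.phash(Image.open(p)) for p in Path("train").glob("*.jpg")}

    # Pairs whose hashes differ by only a few bits are near-identical;
    # leaving many such pairs in the data set is what encourages memorization.
    paths = list(hashes)
    for i, a in enumerate(paths):
        for b in paths[i + 1:]:
            if hashes[a] - hashes[b] <= 4:  # small Hamming distance
                print("near-duplicate:", a, b)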

◧◩◪◨⬒
12. realus+28[view] [source] [discussion] 2023-01-14 09:36:41
>>yazadd+l5
Is that really how copyright law works? Drawing something similar independently is considered a derivative even if there's no link to it?

It's bad news for art websites themselves if that's the case...

replies(1): >>techdr+Pp
◧◩◪◨⬒⬓
13. techdr+Pp[view] [source] [discussion] 2023-01-14 12:48:43
>>realus+28
No, that’s not how it works, at least in many countries. Unlike patents, “parallel creation” is allowed. This was fought out in case law over photography decades ago: photographers would take images of the same subject, then someone else would, and they might incidentally capture a similar image for lots of reasons. Before ubiquitous photography in our pockets, when you had to have expensive equipment or carefully control the lighting in a portraiture studio to get great results, this happened, people sued (as those with money to spare for lawyers are wont to do), and precedent was established for much of this. You don’t see it a lot outside photography, but it’s not a new thing for art copyright law, and I think the necessity of the user to provide their own input and get different outcomes, outside of extremely sophisticated prompt editing, will be a significant fact in their favour.
14. basch+eE[view] [source] 2023-01-14 15:01:35
>>manhol+(OP)
If I take a million copyrighted images from magazines, cut them with scissors, and make a single collage, I would expect the resulting image to be fair use. Fair use is an affirmative defense, like self-defense, where you justify your infringement.

People are treating this like it's a binary technical decision: either it is or isn't a violation. Reality is that things are spectrums and judges judge. SD will likely be treated like a remix that sampled copyrighted work, but just a tiny bit of each work, and sufficiently transformed it to create a new work.

replies(1): >>chongl+mP
◧◩◪
15. basch+PE[view] [source] [discussion] 2023-01-14 15:06:02
>>yazadd+t4
I doubt it would be a violation.

Specifically, fair use factor #3: "the amount and substantiality of the portion used in relation to the copyrighted work as a whole."

A sentence being a copyright violation would make every book review in the world illegal.

◧◩
16. chongl+mP[view] [source] [discussion] 2023-01-14 16:26:25
>>basch+eE
> If I take a million copyrighted images from magazines, cut them with scissors, and make a single collage, I would expect the resulting image to be fair use.

That’s not how it works. Your collage would be fine if it were the only one, since you used magazines you bought. Where you’d get into trouble is if you started printing copies of your collage and distributing them. In that case you’d be producing derived works and would be on the hook for paying for licenses from the original authors.

replies(1): >>basch+mV2
◧◩◪◨
17. bluefi+nP[view] [source] [discussion] 2023-01-14 16:26:27
>>realus+w3
The difference is that computers create perfect copies of images by default; people don't.

If a person creates a perfect copy of something it shows they have put thousands of hours of practice into training their skills and maybe dozens or even hundreds of hours into the replica.

When a computer generates a replica of something it's what it was designed to do. AI art is trying to replicate the human process, but it will always have the stink of "the computer could do this perfectly but we are telling it not to right now"

Take Chess as an example. We have Chess engines that can beat even the best human Chess players very consistently.

But we also have Chess engines designed to play against beginners, or at all levels of Chess play really.

We still have Human-only tournaments. Why? Why not allow a Chess Engine set to perform like a Grandmaster to compete in tournaments?

Because there would always be the suspicion that if it wins, it's because it cheated by playing above its level when it needed to. Because that's always an option for a computer: to behave like a computer does.

replies(2): >>derang+yX >>smegge+M71
18. willia+8X[view] [source] 2023-01-14 17:23:08
>>manhol+(OP)
If I make software that randomly draws pixels on the screen, then we can say for a fact that no copyrighted images were used.

If that software happens to output an image that is in violation of copyright, then it is not the fault of the software. Also, if you ran this software in your home and did nothing with the image, then there's no violation of copyright either. It only becomes an issue when you choose to publish the image.

The key part of copyright is when someone publishes an image as their own. The fact that they copied an image doesn't matter at all. It's what they DO with the image that matters!

The courts will most likely make a similar distinction between the model, the outputs of the model, and when an individual publishes the outputs of the model. This would be that the copyright violation occurs when an individual publishes an image.

Now, if tools like Stable Diffusion are constantly putting users at risk of unknowingly violating copyrights, then this tool becomes less appealing. In this case it would make commercial sense to help users know when they are in violation of copyright. It would also make sense to update our copyright catalogues to support this kind of fingerprinting.

◧◩◪◨⬒
19. derang+yX[view] [source] [discussion] 2023-01-14 17:26:45
>>bluefi+nP
You’re acting like the “computer” has a will of its own. Generating a perfect copy of an image would be a completely separate task from training a model for image generation.

There are no models I know of that can generate an exact copy of an image from their training set, unless the model was trained solely on that image to the point that it could. In that case I could argue the model’s purpose was to copy that image, rather than to learn concepts from a broad enough variety of images that generating an exact copy would be almost impossible.

I think a lot of the arguments revolving around AI image generators could benefit from the parties involved reading up on how transformers work. It would at least make the criticisms more pointed and relevant, unlike the criticisms made in the linked article.

replies(1): >>bluefi+F31
◧◩◪
20. willia+A11[view] [source] [discussion] 2023-01-14 17:55:33
>>IncRnd+L2
You've made some errors in reasoning.

First, there is a legal definition of a "derivative work" and there is an artistic notion of a "derivative work". If the two of us both draw a picture of the Statue of Liberty, artistically we have both derived the drawing from the original statue. However, neither drawing is legally considered a derivative work, either of the original sculpture or of the other drawing.

Let's think about a cartoonish caricature of Joe Biden. What "makes up" Joe Biden?

https://www.youtube.com/watch?v=QRu0lUxxVF4

To what extent are these "constituent parts" present in every image of Joe Biden? All of them? Is the latent space not something that is instead hidden in all images of Joe Biden? Can an image of Joe Biden be made by anyone that is not derived from these "high order" characteristics of what is recognizable as Joe Biden across a number of different renderings from disparate individuals?

replies(1): >>IncRnd+sD1
◧◩◪◨⬒⬓
21. bluefi+F31[view] [source] [discussion] 2023-01-14 18:06:36
>>derang+yX
> There are no models I know of with the ability to generate an exact copy of an image from its training set

Is it "the model cannot possibly recreate an image from its training set perfectly" or is it "the model is extremely unlikely to recreate an image from its training set perfectly, but it could in theory"?

Because I am willing to bet it's the latter.

> You’re acting like the “computer” has a will of its own. Generating a perfect copy of an image would be a completely separate task from training a model for image generation.

Not my intent; of course I don't think computers have a will of their own. What I meant, obviously, is that it's always possible for a human bad actor to make the computer behave in a way that is detrimental to other humans and then justify it by saying "the computer did it, all I did is train the model".

replies(1): >>mlsu+rb2
◧◩◪◨⬒
22. smegge+M71[view] [source] [discussion] 2023-01-14 18:29:38
>>bluefi+nP
> The difference is that computers create perfect copies of images by default

Are we looking at the output of the same program? Because all of the output images I look at have eyes looking in different directions and things of horror in place of hands or ears, and they feature glasses melting into people's faces. And those are the good ones; the bad ones have multiple arms contorting out of odd places while bent at unnatural angles.

replies(1): >>bluefi+Cb1
23. smegge+491[view] [source] 2023-01-14 18:37:21
>>manhol+(OP)
How is that any different from a new human artist who studies other artists' work to learn a style or technique? In fact, it used to be that the preferred way for painters to learn was to repeatedly copy the paintings of masters.
replies(1): >>manhol+3w2
◧◩◪◨⬒⬓
24. bluefi+Cb1[view] [source] [discussion] 2023-01-14 18:53:59
>>smegge+M71
Storing and retrieving photos, files, and music, exactly identical to how they were before, is what computers do.

Save a photo on your computer and open it in a browser or photo viewer, and you will get that same photo back. That is the default behavior of computers. That is not in dispute, is it?
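
A trivial sketch of that default behavior (the file names are made up): an ordinary copy is bit-for-bit identical, no effort required.

    import hashlib, shutil

    shutil.copy("photo.jpg", "photo_copy.jpg")  # an ordinary copy, no ML involved

    def sha256(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    # The copy hashes identically to the original, i.e. it is a perfect copy.
    print(sha256("photo.jpg") == sha256("photo_copy.jpg"))  # True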

All of this machine learning stuff is trying to get them to not do that: to create something new that no one actually stored on them.

Hope that clears up the misunderstanding.

◧◩◪◨
25. IncRnd+sD1[view] [source] [discussion] 2023-01-14 22:03:09
>>willia+A11
I can draw Biden, yes, but SD can only draw Biden by deriving its output from the images on which it was trained. This is a simple tautology, because SD cannot draw Biden without having been trained on that data.

SD both creates derivative works and also sometimes creates pixel-level copies of portions of the trained data.

replies(2): >>willia+HF1 >>bobbru+QT1
◧◩◪◨⬒
26. willia+HF1[view] [source] [discussion] 2023-01-14 22:22:47
>>IncRnd+sD1
Yes, and we are now using the artistic definition of “derived” and not the legal definition.

You cannot copyright “any image that resembles Joe Biden”.

replies(1): >>IncRnd+6N3
27. rule72+JN1[view] [source] 2023-01-14 23:37:44
>>manhol+(OP)
This argument's pedantic and problematic for artists; take away a human's "dataset" and processes and they are also unable to produce a single original "pixel".
◧◩◪◨⬒
28. bobbru+QT1[view] [source] [discussion] 2023-01-15 00:42:38
>>IncRnd+sD1
Can you draw Biden without ever having seen him or a picture of him? So why is it that you are not deriving but SD is?
29. bobbru+3V1[view] [source] 2023-01-15 00:55:31
>>manhol+(OP)
That is not true. The dataset is needed, the same way that examples are used by a person learning to draw. But the model is capable of producing images that are not derived from any part of the dataset (and there are many examples of SD results that seem, so far, to be wholly original), so you can’t reduce Stable Diffusion to being only a derivative of the dataset. It may “remember” and generate parts of images in the dataset, but that is a bug, not a feature. With enough prompt tweaking, it may even generate a fairly good copy of a pre-existing work, but that is what the prompt requested, so responsibility should lie with the prompt writer, not with SD.

But the fact that it often generates new content that didn’t exist before, or at least doesn’t breach the limits of fair use, goes against the argument made in the lawsuit.

replies(1): >>manhol+Ov2
30. rsuelz+212[view] [source] 2023-01-15 02:01:10
>>manhol+(OP)
So, is any sort of creation that relies upon copyrighted or patented works copyright infringement? Is any academic research or art that references brands or other creations illegal? This is such a clear case of fair use that it could be a textbook example.
◧◩◪
31. rsuelz+h12[view] [source] [discussion] 2023-01-15 02:02:53
>>IncRnd+L2
So is your mental image of Joe Biden, unless you know him personally.
◧◩◪◨⬒⬓⬔
32. mlsu+rb2[view] [source] [discussion] 2023-01-15 04:21:36
>>bluefi+F31
In theory, you can:

- Open Microsoft Paint

- Make a blank 400 x 400 image

- Select a pixel and input an R,G,B value

- Repeat the last two steps

To reproduce a copyrighted work. I'm sure people have done this with e.g. pixel art images of copyrighted IP of Mario or Link. At 400x400, it would take 160,000 pixels to do this. At 1 second per pixel, a human being could do this in about a week.
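
A quick back-of-the-envelope check of those numbers (the 8-hour workday is just my own assumption):

    pixels = 400 * 400            # 160,000 pixels
    seconds = pixels * 1          # 1 second per pixel
    hours = seconds / 3600        # ~44.4 hours of clicking
    workdays = hours / 8          # ~5.6 eight-hour days, i.e. about a week
    print(pixels, round(hours, 1), round(workdays, 1))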

Because people have the capability of doing this, and in fact we have proof that people have done so using tools such as MS Paint, AND because it is unlikely but possible that someone could reproduce protected IP using such a method, should we ban Microsoft Paint, or the paint tool, or the ability to input raw RGB values?

◧◩
33. manhol+Ov2[view] [source] [discussion] 2023-01-15 09:01:43
>>bobbru+3V1
The model can generate original images, yes, and those images might be fair use. But it can also generate near-verbatim copies of the source works or substantial parts thereof, so the model itself is not fair use; it's a wholly derivative work.

For example, if I publish a music remix tool with a massive database of existing music, creators might use it to create collages that are original and fall under fair use. But the tool itself is not, and it requires permission from the rights owners.

◧◩
34. manhol+3w2[view] [source] [discussion] 2023-01-15 09:05:56
>>smegge+491
What you and many others in the thread seem to be oblivious about is that algorithms are not people. Yes, it may come as a shock to autistic engineers, but the fact that a machine can do something similar to what a person does does not warrant it equal protection under the law.

Copyright, and laws in general, exist to protect the human members of society, not some abstract representation of them.

replies(1): >>weknow+Mm4
◧◩◪
35. basch+mV2[view] [source] [discussion] 2023-01-15 14:15:59
>>chongl+mP
That’s not how fair use works. It’s not a binary switch where commercial derivatives automatically require licensing. Such a collage would be ruled transformative and non-competitive.

My having bought the magazines also has nothing to do with anything. It would apply equally if they were gifted, free, or stolen.

◧◩◪◨⬒⬓
36. IncRnd+6N3[view] [source] [discussion] 2023-01-15 20:17:16
>>willia+HF1
This isn't about what can be copyrighted; it's that copyrighted images are being used without following the legal requirements.
◧◩◪
37. weknow+Mm4[view] [source] [discussion] 2023-01-16 00:04:33
>>manhol+3w2
It seems like you're using "autistic" as an insult here. If that's not your intention, you might want to edit this comment to use different verbiage.
replies(1): >>manhol+485
◧◩◪◨
38. manhol+485[view] [source] [discussion] 2023-01-16 07:47:59
>>weknow+Mm4
What do you mean? Autism is well established as a personality trait that diminishes empathy and the ability to understand other people's desires and emotions, while conferring a strong affinity for things, for example machines and algorithms.

Legislation is driven by people who are, on aggregate, not autistic. So it's entirely appropriate to presume that a person not understanding how that process works is indeed autistic, especially if they suggest machines are subjects of law by analogy with human beings.

It's not that autists are bad people, they are just outliers in the political spectrum, as you can see from the complete disconnect between up-voted AI-related comments on Hacker News, where autistic engineers are clearly over-represented, and just about any venue where other professionals, such as painters or musicians, congregate. Just try to suggest to them that a corporation has the right to use their work for free and profit from it while leaving them unemployed, because the algorithm the corporation uses to exploit them is in some abstract sense similar to how their brain works. That position is so far out on the spectrum that presuming a personality peculiarity of the speaker is absolutely the most charitable interpretation.

[go to top]