Now, as for training "AI" models, who knows. You can argue it's the same thing a human does, or you could argue it's something new and qualitatively different that should fall under different rules. Regardless, the current copyright laws were written before "AI" models were in widespread use, so whatever is allowed or not is more of a historical accident.
So the discussion needs to be about the intention of copyright laws and what they SHOULD be.
Copying a work itself can be copyright infringement if it's so close to the original that people may think they're the same work.
We might be able to argue that a computer program taking art as input and automatically generating art as output is doing exactly what an artist does, but only once general intelligence is reached. Until then, it's still a machine transformation and should be treated as such.
AI shouldn't be a legal avenue for copyright laundering.
And practically speaking, putting aside whether a government should even be able to legislate such things, enforcing such a law would be near impossible without wild privacy violations.
1) The artist is not literally copying the copyrighted pixel data into their "system" for training.
2) An individual artist is not a multi-billion-dollar company with a computer system that rapidly spits out art using copyrighted pixel data. A categorical difference.
> automatically generating art as output
The user is navigating the latent space to obtain said output. I don't know if that's transformative or not, but it is an important distinction.
If the program were wholly automated, as in it had a random number/word generator added to it and no navigation of the latent space by users happened, then yeah, I would agree. But that's not the case, at least as far as ML models like Midjourney or Stable Diffusion are concerned.
On 1, human artists are copying copyrighted pixel data into their system for training. That system is the brain. It's organic RAM.
On 2, money shouldn't make a difference. Jim Carrey should still be allowed to paint even though he's rich.
If Jim uses Photoshop instead of brushes, he can spit out the style ideas he's copied and transformed in his brain more rapidly - but he should still be allowed to do it.
(That's as opposed to a large language model, which does memorize text.)
Also, you can train it to imitate an artist's style just by showing it textual descriptions of the style. It doesn't have to see any images.
No, it would just legislate which images may and may not be in the training data to be parsed. Artists want a copyright that makes their images unusable for machine-learning derivative works.
The trick here is that eventually the algorithms will get good enough that it won't be necessary for those images to be in the training data in the first place. But we can imagine that artists would be OK with that.
They shouldn't be OK with that and they probably aren't. That's a much worse problem for them!
Their complaining about copyright is most likely cope: this is what they're actually concerned about.
You can, however, disallow Google from indexing your content using robots.txt, a meta tag in the HTML, or an HTTP header.
Or you can ask Google to remove it from their indexes.
Your content will disappear from then on.
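For reference, the standard opt-out mechanisms look like this (all three are documented crawler controls; which one fits depends on your setup):

```
# robots.txt at the site root: stop Googlebot from crawling
User-agent: Googlebot
Disallow: /

# per-page HTML meta tag: crawlable, but kept out of the index
<meta name="robots" content="noindex">

# or the equivalent HTTP response header:
X-Robots-Tag: noindex
```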
You can't un-train what's already been trained.
You can't disallow scraping for training.
The damage is already done and it's irreversible.
It's like trying to unbomb Hiroshima.
Going from painting > raw photo (derivative work), raw photo > jpg (derivative work), jpg > model (derivative work), model > image (derivative work). At best you can make a fair use argument at that last step, but that falls apart if the resulting images harm the market for the original work.
you have rights.
AIs don't.
Because they don't have will.
It's like arresting a gun for killing people.
So, as a human, the individual(s) training the AI or using the AI to reproduce copyrighted material are responsible for the copyright infringement, unless explicitly authorized by the author(s).
It's quite possible to apply the same kind of protections to generative models. (I hope this does not happen, but it is fully possible.)
A tool that catalogues attributed links can't really be evaluated the same way as a pastiche machine.
You'd be much closer using the example of Google's first-page answer snippets, which are pulled out of a site's content with minimal attribution.
That might be a good way to go about it
They probably aren't doing that. Studying the production methods and WIPs is more useful for a human. (ML models basically guess how to make images until they produce one that "looks like" something you show it.)
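A toy sketch of that guess-and-compare loop, to make the parenthetical concrete. This is plain gradient descent on pixel error, not how any real image model is actually built; all names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

target = rng.random((8, 8))   # the image we "show" the model
weights = np.zeros((8, 8))    # the model's current guess

for step in range(1000):
    error = weights - target  # how far the guess is from "looking like" the target
    weights -= 0.1 * error    # nudge the guess toward the target

print(np.abs(weights - target).max())  # ~0: the guess now matches the target
```

Real training does this across millions of images at once, so what gets learned is shared statistics rather than any single target.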
If you have views on whether they'll win, the prediction market is currently at 49%: https://manifold.markets/JeffKaufman/will-the-github-copilot...
Automated transformation is not guaranteed to remove the original copyright, and for simple transformations it won't, but it's an open question (no legal precedent, different lawyers interpreting the law differently) whether what these models are doing is so transformative that their output (when used normally, not trying to reproduce a specific input image) passes the fair use criteria.
Mind you, this is not talking about the usage rights of images generated from such a model, that's a completely different story and a legal one.
hear hear...
> Passively training a model on an artwork does not change the art in the slightest
Copyright holders, and I mean individual authors, the people who actually produced the content being used, disagree.
They say that to them, AI is like a bulldozer destroying the park.
Which is technically true: it's a machine that someone (some interested party, maybe?) is trying to disguise as a human doing human stuff.
But it's not.
> passive, non-destructive
Passive, non-destructive, in this context means
- passive: people send the images to you; you don't go looking for them
- non-destructive: people authorized you; otherwise it's destructive of their rights.
Can probably do all that well enough (it probably doesn't need to be perfect) by leaning on FAANG, with or without legislation.
But: opt-in by default, or opt-out by default?
But currently: first, there is a reasonable argument that the model weights may not be copyrightable at all. They don't really fit the criteria of what copyright law protects (no creativity was used in making them, etc.), in which case they can't be a derivative work and are effectively outside the scope of copyright law. Second, there is a reasonable argument that the model is a collection of facts about copyrighted works, equivalent to the early (pre-computer) statistical n-gram language models of copyrighted books used in e.g. lexicography; a minimal sketch of such a model follows below. For those we have solid old legal precedent that such models are not derivative works (again, a collection of facts isn't copyrightable) and thus can be made against the wishes of the authors.
Fair use criteria come into play as conditions under which it is permissible to violate the exclusive rights of the authors. However, if the model is not legally considered a derivative work according to copyright law criteria, then the fair use conditions don't matter, because in that case copyright law does not assert that making it is restricted in any way.
Note that in this case the resulting image might still be considered a derivative work of an original image, even if the "tool-in-the-middle" is not a derivative work.
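For concreteness, that kind of "collection of facts" can be as simple as a table of word-pair frequencies. A minimal Python sketch, assuming word bigrams (the function name is mine):

```python
from collections import Counter

def bigram_counts(text: str) -> Counter:
    """Build a word-bigram frequency table: pure statistics
    about the text; no passage of the original is stored."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))

# The "model" is just counts of adjacent word pairs.
counts = bigram_counts("the cat sat on the mat and the cat slept")
print(counts[("the", "cat")])  # 2
```

Whether the far richer statistics encoded in modern model weights still count as "just facts" is exactly the unsettled question.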
Say it with me: Computer algorithms are NOT people. They should NOT have the same rights as people.
And the weights. The weights it has learned come originally from the images.
Also, a jpg seemingly fits your definition, since "no creativity was used in making them, etc," but it clearly embodies the original work's creativity. Similarly, a model can't be trained on random data; it needs to extract information from its training data to be useful.
The specific choice of algorithm used to extract the information doesn't change whether something is derivative.
No they won't. If AI art were just as good as it is today but hadn't used copyrighted images in the training set, people would absolutely still be finding some other thing to complain about.
Artists just don't want the tech to exist entirely.