zlacker

Everything you wrote ignores the fact that the content taken from websites is not just parked there to be used as “competitive intelligence”.

It becomes an integral part of a business product. That is the problematic part.

Going into a store and taking pictures of some art to use as reference material is not an issue.

But if you take those pictures and use them to make a program that then spits out new art that is just a mix of those images patched together, then, imo, that’s an issue.

replies(1): >>safety+32

>>manuel+(OP)
It sounds to me like we agree. With respect, people have a lot more rights than they realize when it comes to taking photos of stuff in public (or semi-public) places, which is the scenario in your analogy. But this has questionable bearing on whether an AI can scoop up Internet content and do something with it.

I think it's almost a guarantee that courts will start finding exact AI reproductions of copyrighted work to be infringement.

Where the analogy might come into play is that if you take a photo of a copyrighted work, there are limitations on what you can do with your photo without infringing on that copyright. I have no idea if the courts will apply that stuff to AI. For instance, there's actually a fair bit of leeway if you take a photo which contains only a portion of a copyrighted work and then want to sell or redistribute that photo. One might argue that this legal principle applies to AI as well... lawyers are already having a field day with this stuff, I'm sure.

replies(1): >>Spivak+Lz

>>safety+32
> I think it's almost a guarantee that courts will start finding exact AI reproductions of copyrighted work to be infringement.

That was never not true. The difference is that AI can't violate copyright, only humans can. The legal not-so-gray area is whether "spat out by an AI after prompting" is a performance of the work and, if so, which human is responsible for the copying.

replies(1): >>Anthon+qQ

>>Spivak+Lz
Except that these models almost never do exact reproductions of a work. If you were trying to do it on purpose, you'd have to do some significant prompt engineering to get it to even come close, because the nature of it is to smush together thousands of different things, not photocopy one in particular.

The exceptions will be things like pictures of a specific city's skyline. Not because it's copying a particular image, but because that's what that city's skyline looks like, so that's how it looks in an arbitrary picture of it. But those are the pictures that lack original creativity to begin with -- which is why the pictures in the training data are all the same and so is the output.

And people seem to make a lot of the fact that it will often reproduce watermarks, but the reason it does that isn't that it's copying a specific image. It's that there are a large number of images of that subject with that watermark. So even though it's not copying any of them in particular, it's been trained that pictures of that subject tend to have that watermark.

Obviously lawyers are going to have a field day with this, because this is at the center of an existing problem with copyright law. The traditional way you show copying is similarity (and access). That no longer really means anything, because you now have databases of billions of works, which are public (so everyone has access), and computers that can efficiently process them all to find the existing work which is most similar to any new one. And if you put those two works next to each other, they're going to look similar to a human, because it's the 99.9999999th percentile nearest match from a database of a billion images, regardless of whether the new one was actually generated from the existing one. It's the same reason YouTube Content ID has false positives -- except that its database only includes major Hollywood productions. A large image database would have orders of magnitude more.
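
To make the nearest-match point concrete, here's a rough Python sketch. It's a toy under stated assumptions, not a model of any real system: "works" are random points in a tiny 4-dimensional feature space (real image features are far higher dimensional), and the names `nearest_distance` and `nearest_by_size` are made up for illustration. The new work is generated independently of everything in the database, yet its closest match keeps getting closer as the database grows.

```python
import math
import random


def nearest_distance(query, database):
    """Euclidean distance from query to its closest entry in database."""
    return min(math.dist(query, item) for item in database)


def nearest_by_size(sizes, dim=4, seed=0):
    """Map each database size to the nearest-match distance for one new work.

    Assumption: every work is an independent random point in the unit cube,
    so the new work was never derived from anything in the database.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]  # the "new" work
    database = [[rng.random() for _ in range(dim)] for _ in range(max(sizes))]
    # Each smaller database is a prefix of the larger ones.
    return {n: nearest_distance(query, database[:n]) for n in sizes}


results = nearest_by_size([10, 100, 1000, 10000])
for n, d in results.items():
    print(f"{n:>6} unrelated works -> nearest match at distance {d:.4f}")
```

Since each smaller database is a prefix of the bigger one, the nearest distance can only shrink or stay the same as works are added. A close nearest match is what you'd expect from database size alone, so by itself it says nothing about copying.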