Should you get to decide if people can take pictures of your store?
Especially since they're letting stores pay money to be the first recommended store.
Robots.txt exists because shop photographers want to be allowed to take pictures rather than be blocked tout court.
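For reference, robots.txt is nothing more than a plain-text signal that cooperative crawlers choose to honor. A minimal example that opts out of one AI-training crawler while leaving the site open to everything else might look like this (GPTBot is OpenAI's crawler; the layout follows the Robots Exclusion Protocol, RFC 9309):

```
# Block a specific AI-training crawler from the whole site
User-agent: GPTBot
Disallow: /

# Everyone else may crawl everything
User-agent: *
Disallow:
```

Note there is no enforcement here: a crawler that ignores the file is not blocked by it, which is exactly the "cooperate rather than be blocked outright" bargain described above.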
I see this argument made over and over again here on HN and it’s puzzling that people always stop at the first part.
Companies won’t stop at the “look at your content” phase. They will use the knowledge gathered by looking at your content to do something else. That’s the problematic part.
(Edit: How is a factual, on-topic statement, providing a source-link for its claim, downvoted? You may not favor these regulations, but they still do exist.)
Retail companies research what other retail companies are doing and copy them all the time... was the answer supposed to be no here?
I find this debate very aligned to copyright debates.
The value of a store is the ability to buy products from it, and you taking a photo of it doesn't impact that transaction of value at all. The value of content online is the very act of reading it/consuming it.
A scraper is getting a free lunch, that is clear. They are trading nothing for something, and as the owner of the something that is not the deal I have chosen to make.
The business has the right to ask you to leave if you violate their policies. In fact, they can ask you to leave for (almost) any reason at all. They may have some limited right to remove you using a reasonable amount of force, depending on the jurisdiction.
Once you've left or been removed from their property, you still have the legal right to take photos of it from the public place you're now standing in. If you can see the products they're selling through their window, you can keep taking photos of them.
They don't have the right to confiscate your camera or the pictures you took. Your rights in terms of what you can do with those photos may have limitations (e.g. redistribution, reproduction), particularly if you photographed copyrighted works.
This is why the parent's comment confused me so much. In most of the world you live in a society where yeah you have the freedom to take photos of stuff, or copy it down on a clipboard or whatever, and use it as competitive intelligence to improve your own business. And thousands of businesses are doing it every day.
It becomes an integral part of a business's product. That is the problematic part.
You going into a store and taking pictures of some art to use as reference material is not an issue.
But if you take those pictures and use them to build a program that then spits out new art that is just a mix of those images patched together, then, imo, that's an issue.
Of course it's OK to take note of what stock is on a store's shelf, go back to your own business, and sell the same stock. It's also ubiquitous. It is de facto practiced globally by everyone, it's generally legal, and it's morally fine. Broadly speaking we call this competitive intelligence or market intelligence.
The source content is part of the AI product. There is no AI product without the source content.
This is not you going to a store, seeing what they sell, and adjusting your offering. You have no offering without the original store's content.
I think it's almost a guarantee that courts will start finding exact AI reproductions of copyrighted work to be infringement.
Where the analogy might come into play is that if you take a photo of a copyrighted work there are limitations on what you can do with your photo, without infringing on that copyright. I have no idea if the courts will apply that stuff to AI, for instance there's actually a fair bit of leeway if you take a photo which contains only a portion of a copyrighted work and then you want to sell or redistribute that photo. One might argue that this legal principle applies to AI as well... lawyers are already having a field day with this stuff I'm sure.
they aren't copying the content. They are learning from the content and producing more like it, but not a copy.
but when people do that, it is allowed, isn't it? So what is special about AI, other than the scale?
AI is software; it doesn't "learn" as a human does, and even if it did, it would still be bound by the same rules as any other piece of software and any human.
exactly, so there's zero reason to prevent anyone from using a piece of software (which slurps a lot of information off the internet) to produce new works that do not infringe currently copyrighted content.
That was never not true. The difference is that AI can't violate copyright, only humans can. The legal not-so-gray area is whether "spat out by an AI after prompting" is a performance of the work and if so, what human is responsible for the copying.
The exceptions will be like, pictures of a specific city's skyline. Not because it's copying a particular image, but because that's what that city's skyline looks like, so that's how it looks in an arbitrary picture of it. But those are the pictures that lack original creativity to begin with -- which is why the pictures in the training data are all the same and so is the output.
And people seem to make a lot of the fact that it will often reproduce watermarks, but the reason it does that isn't that it's copying a specific image. It's that there are a large number of images of that subject with that watermark. So even though it's not copying any of them in particular, it's been trained that pictures of that subject tend to have that watermark.
Obviously lawyers are going to have a field day with this, because this is at the center of an existing problem with copyright law. The traditional way you show copying is similarity (and access). Which no longer really means anything because you now have databases of billions of works, which are public (so everyone has access), and computers that can efficiently process them all to find the existing work which is most similar to any new one. And if you put those two works next to each other they're going to look similar to a human because it's the 99.9999999th percentile nearest match from a database of a billion images, regardless of whether the new one was actually generated from the existing one. It's the same reason YouTube Content ID has false positives -- except that its database only includes major Hollywood productions. A large image database would have orders of magnitude more.
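A toy numerical sketch of that last point (mine, not from the thread; the dimensions and counts are arbitrary): model each existing work as a random unit vector in a small "feature space", then generate a new query vector completely independently of every item in the database. Despite zero copying, the nearest neighbour among many works still scores a high cosine similarity:

```python
# Toy model: nearest-match similarity grows with database size,
# even when the "new work" is independent of every stored work.
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 200_000                 # small feature space, many existing works

db = rng.standard_normal((n, d))
db /= np.linalg.norm(db, axis=1, keepdims=True)   # normalise rows to unit length

query = rng.standard_normal(d)                     # generated independently
query /= np.linalg.norm(query)

sims = db @ query                 # cosine similarity against every stored work
print(f"typical pair: {sims.mean():+.3f}  nearest of {n:,}: {sims.max():.3f}")
```

The average similarity across the database sits near zero, while the single best match comes out far higher, purely because we searched a large database for it. That is the false-positive mechanism: "most similar of a billion" looks like copying to a human reviewer even when none occurred.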
Gone will be revenue sharing, gone will be users visiting other sites.
The goal is for Google to keep ALL the revenue, for content written by others.
Hope that works out for them. I have already taken down over 300 articles I wrote on networking, Linux, FreeBSD, WireGuard, DSP, and software-defined radios. I am not feeding a machine that steals my writing, even though I never explicitly expected payment from readers.
Nowadays most blog posts in the SERPs are full of spam and unnecessary filler text. I stopped clicking on random blogs because of how awful they've become. I'm currently using Bing Chat (which uses GPT-4 under the hood) and it saves me a lot of time.
> The issue is using copyrighted content without consent
the consent is given implicitly if the content is available to the public for viewing. The copyright isn't violated by an AI training model, as the work isn't copied. The information contained within the works is not what's copyrighted - it's the expression.
If the AI training algorithm is capable of extracting the information from the works and using it in another environment as part of some other work, you cannot claim copyright over that information.
This applies to style, patterns and other abstract information that could be extracted from works. It's as if a chef, upon reading many recipe books, produces a new recipe book (that contains information extracted from them) - the original creators of those recipe books cannot claim said chef had violated any copyright.