Honestly, I get this feeling about these lawsuits about using content to train LLMs.
Think of it this way: in growing up, learning to read, and getting an education, you read any number of books, articles, Web pages, magazines, etc. You viewed any number of artworks, buildings, cars, other vehicles, furniture, etc., many of which might have design patents. We have such silliness as it being illegal to commercially distribute photos of the Eiffel Tower at night [2].
What's the difference between training a model on text and images and educating a person with text and images, really? If I read too many NYT articles, am I going to get sued for using too much "training data"?
Currently we need copious quantities of training data for LLMs. I believe this is because we're in the early days of this tech. I mean, no person has read millions of articles or books. At some point models will get better with substantially smaller training sets. And then, how many articles is too many as far as these suits go?
[1]: https://en.wikipedia.org/wiki/Wright_brothers_patent_war
[2]: https://www.travelandleisure.com/photography/illegal-to-take...
"Photographing the Eiffel Tower at night is not illegal at all. Any individual can take photos and share them on social networks. But the situation is different for professionals. The Eiffel Tower's lighting and sparkling lights are protected by copyright, so professional use of images of the Eiffel Tower at night requires prior authorization and may be subject to a fee."
I happen to agree on that one. What is the benefit of copyrighting the Eiffel Tower? The purpose of copyright is not to say you can always make money off of what you created. It is to incentivize the creation of new things by allowing you to exclusively make money off of them for a while before their benefits can go to broader society.
So what is the purpose of copyrighting the Eiffel Tower? Would it not have been made if copyright wasn't in place? (Obviously it would have, because it was made before the law was in place.) Second, the claim is that the copyright is on the "lighting design" visible at night. Is the lighting design of the tower so unique that no one else could come up with it? Or is the lighting design necessitated by the structure of the tower itself?
I'd say that, given the structure of the tower, which restricts where the lights can go, there is nothing remotely unique or different enough to warrant copyright of the lighting design. Almost any design on that tower would look about the same.
So how is society benefiting from copyrighting that lighting design?
Exclusivity deals are almost always a net loss for society, which is why whenever you see one you should question whether it should be in place. Exclusive contracts are anti-free-market. Now, there are absolutely valid places where they are justified and should be in place - but they should be questioned by default.