> Alsup ruled that Anthropic's use of copyrighted books to train its AI models was "exceedingly transformative" and qualified as fair use
> "All Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies"
It was always somewhat obvious that pirating a library would be copyright infringement. The interesting findings here are that scanning and digitizing a library for internal use is OK, and using it to train models is fair use.
> But Alsup drew a firm line when it came to piracy.
> "Anthropic had no entitlement to use pirated copies for its central library," Alsup wrote. "Creating a permanent, general-purpose library was not itself a fair use excusing Anthropic's piracy."
That is, he ruled that:
- buying books, physically cutting them up, digitizing them, and using them for training is fair use
- pirating books for their digital library is not fair use.
Found it: https://www.nbcnews.com/tech/tech-news/federal-judge-rules-c...
> “That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft,” [Judge] Alsup wrote, “but it may affect the extent of statutory damages.”
No, it's not. And have you ever heard of a publishing house? They don't need to negotiate with every single author individually. That's preposterous.
It's not the only reason fair use exists, but it's the thing that allows e.g. search engines to exist, and that seems pretty important.
> And have you ever heard of a publishing house? They don't need to negotiate with every single author individually. That's preposterous.
There are thousands of publishing houses and millions of self-published authors on top of that. Many books are also out of print or have unclear rights ownership.
No, it kinda isn't. Show me anything that supports this idea beyond your own immediate conjecture right now.
> It's not the only reason fair use exists, but it's the thing that allows e.g. search engines to exist, and that seems pretty important.
No, that's the transformative element of what a search engine provides. Search engines are not legal because they can't contact each licensor, they are legal because they are considered hugely transformative features.
> There are thousands of publishing houses and millions of self-published authors on top of that. Many books are also out of print or have unclear rights ownership.
Okay, and? How many customers does Microsoft bill on a monthly basis?
It's inherent in the nature of the test. The most important fair use factor is the effect on the market for the work. If the use would be uneconomical without fair use, then the effect on the market is negligible, because the alternative is that the use doesn't happen at all, not that the author gets paid for it.
> No, that's the transformative element of what a search engine provides. Search engines are not legal because they can't contact each licensor, they are legal because they are considered hugely transformative features.
To make a search engine you have to do two things. One is to download a copy of the whole internet; the other is to create a search index. I'm talking about the first one; you're talking about the second.
> Okay, and? How many customers does Microsoft bill on a monthly basis?
Microsoft does this with an automated system. There is no single automated system where you can get every book ever written, and separately interfacing with all of the many systems you would need instead is the source of the overhead.