> Alsup ruled that Anthropic's use of copyrighted books to train its AI models was "exceedingly transformative" and qualified as fair use
> "All Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies"
It was always somewhat obvious that pirating a library would be copyright infringement. The interesting findings here are that scanning and digitizing a library for internal use is OK, and using it to train models is fair use.
> But Alsup drew a firm line when it came to piracy.
> "Anthropic had no entitlement to use pirated copies for its central library," Alsup wrote. "Creating a permanent, general-purpose library was not itself a fair use excusing Anthropic's piracy."
That is, he ruled that:
- buying books, physically cutting them up, digitizing them, and using them for training is fair use
- pirating the books for their digital library is not fair use
Found it: https://www.nbcnews.com/tech/tech-news/federal-judge-rules-c...
> “That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft,” [Judge] Alsup wrote, “but it may affect the extent of statutory damages.”
No it's not. And you ever heard of a publishing house? They don't need to negotiate with every single author individually. That's preposterous.
It's not the only reason fair use exists, but it's the thing that allows e.g. search engines to exist, and that seems pretty important.
> And you ever heard of a publishing house? They don't need to negotiate with every single author individually. That's preposterous.
There are thousands of publishing houses and millions of self-published authors on top of that. Many books are also out of print or have unclear rights ownership.
No, it kinda isn't. Show me anything that supports this idea beyond your own immediate conjecture right now.
> It's not the only reason fair use exists, but it's the thing that allows e.g. search engines to exist, and that seems pretty important.
No, that's the transformative element of what a search engine provides. Search engines are not legal because they can't contact each licensor, they are legal because they are considered hugely transformative features.
> There are thousands of publishing houses and millions of self-published authors on top of that. Many books are also out of print or have unclear rights ownership.
Okay, and? How many customers does Microsoft bill on a monthly basis?
It's inherent in the nature of the test. The most important fair use factor is the effect on the market for the work, so if the use would be uneconomical without fair use, then the effect on the market is negligible: the alternative would be that the use doesn't happen, not that the author gets paid for it.
> No, that's the transformative element of what a search engine provides. Search engines are not legal because they can't contact each licensor, they are legal because they are considered hugely transformative features.
To make a search engine you have to do two things. One is to download a copy of the whole internet, the other is to create a search index. I'm talking about the first one, you're talking about the second one.
> Okay, and? How many customers does Microsoft bill on a monthly basis?
Microsoft does this with an automated system. There is no single automated system where you can get every book ever written, and separately interfacing with all of the many systems needed in order to do it is the source of the overhead.
No, that's not the most important factor. The transformative factor is the most important. Effect on market for the work doesn't even support your argument anyway. Your argument is about the cost of making the end product, which is totally distinct from the market effects on the copyright holder when the infringer makes and releases the infringing product.
> To make a search engine you have to do two things. One is to download a copy of the whole internet, the other is to create a search index. I'm talking about the first one, you're talking about the second one.
So? That doesn't make you right. Go read the opinions, dude. This isn't something that's actually up for debate. Search engines are fair uses because of their transformative effect, not because they are really expensive otherwise. Your argument doesn't even make sense. By that logic, anything that's expensive becomes a fair use. It's facially ridiculous. Them being expensive is neither sufficient nor necessary for them to be a fair use. Their transformative nature is both sufficient and necessary to be found a fair use. Full stop.
> Microsoft does this with an automated system. There is no single automated system where you can get every book ever written, and separately interfacing with all of the many systems needed in order to do it is the source of the overhead.
Okay, and? They don't need to get every single book ever written. The libraries they pirated do not consist of "every single book ever written". It's hard to take this argument in good faith because you're being so ridiculous.
It's a four factor test because all of the factors are relevant, but if the use has negligible effect on the market for the work then it's pretty hard to get anywhere with the others. For example, for cases like classroom use, even making verbatim copies of the entire work is often still fair use. Buying a separate copy for each student to use for only a few minutes would make that use uneconomical.
> Effect on market for the work doesn't even support your argument anyway. Your argument is about the cost of making the end product, which is totally distinct from the market effects on the copyright holder when the infringer makes and releases the infringing product.
We're talking about the temporary copies they make during training. Those aren't being distributed to anyone else.
> So? That doesn't make you right.
Making a copy of everything on the internet is a prerequisite to making a search engine. It's something you have to do as a step to making the index, which is the transformative step. Are you suggesting that doing the first step is illegal or what do you propose justifies it?
> By that logic, anything that's expensive becomes a fair use. It's facially ridiculous.
Anything with unreasonably high transaction costs. Why is that ridiculous? It doesn't exempt any of the normal stuff like an individual person buying an individual book.
> They don't need to get every single book ever written.
They need to get as many books as possible, with the platonic ideal being every book. Whether or not the ideal is feasible in practice, the question is whether it's socially beneficial to impose a situation with excessively high transaction costs in order to require something with only trivial benefit to authors (potentially selling one extra copy).