This is OK and fair use: Training LLMs on copyrighted work, since it's transformative.
This is not OK and not fair use: pirating data, or creating a big repository of pirated data that isn't necessarily for AI training.
Overall seems like a pretty reasonable ruling?
If you train an LLM on Harry Potter and ask it to generate a story that isn't Harry Potter, then the output isn't a replacement for the original.
However, if you train a model on stock imagery and use it to generate stock imagery, then I think you'll run into an issue from the Warhol case.
I wouldn't call it that. Goldsmith took a photograph of Prince, which Warhol used as a reference to generate an illustration. Vanity Fair then chose to buy a license to Warhol's print instead of Goldsmith's photograph.
So, despite the artwork being visually transformative (silkscreen vs. photograph), the actual use was not transformed.