This is OK and fair use: Training LLMs on copyrighted work, since it's transformative.
This is not OK and not fair use: pirating data, or creating a big repository of pirated data that isn't necessarily for AI training.
Overall seems like a pretty reasonable ruling?
If you train an LLM on Harry Potter and ask it to generate a story that isn't Harry Potter, then the output isn't a replacement for the original.
However, if you train a model on stock imagery and use it to generate stock imagery, then I think you'll run into an issue from the Warhol case (Andy Warhol Foundation v. Goldsmith), since the output serves the same purpose as the originals.
So if I or an LLM simply doesn’t allow said extraction to occur, memorization and copying are not against the law.