This is OK and fair use: Training LLMs on copyrighted work, since it's transformative.
This is not OK and not fair use: pirating data, or creating a big repository of pirated data that isn't necessarily for AI training.
Overall seems like a pretty reasonable ruling?
Even if LLMs were actual human-level AI (they are not - by far), a small bunch of rich people could use them to make enormous amounts of money without putting in the enormous amounts of work humans would have to.
All the while "training" (= precomputing transformations which among other things make plagiarism detection difficult) on work which took enormous amounts of human labor without compensating those workers.