This is OK and fair use: Training LLMs on copyrighted work, since it's transformative.
This is not OK and not fair use: pirating data, or creating a big repository of pirated data that isn't necessarily for AI training.
Overall seems like a pretty reasonable ruling?
I tend to think copyright should be extremely limited compared to what it is now, but to me the logic of this ruling is illogical other than "it's ok for a corporation to use lots of works without permission but not for an individual to use a single work without permission." Maybe if they suddenly loosened copyright enforcement for everyone I might feel differently.
"Kill one man, and you are a murderer. Kill millions of men, and you are a conqueror." (An admittedly hyperbolic comparison, but similar idea.)
I'm allowed to hear a copyrighted tune, and even whistle it later for my own enjoyment, but I can't perform it for others without license.
People need to stop anthropomorphizing neural networks. It's a software and a software is a tool and a tool is used by a human.
What makes far more sense is saying that someone, a human being, took copyrighted data and fed it into a program that produces variations of the data it was fed. This is no different from a photoshop filter, and nobody would ever need to argue in court that a photoshop filter is not a human being.