> If you scraped New York Times for your own LLM that you used internally and didn't distribute the results, there would be no copyright infringement.
Why?
As far as I understand, the copyright owner has control over all copying, regardless of whether it is done internally or externally. Distributing it externally would be a more serious violation, though.