If I encode a movie with H.264, there is no way to get the decoder to output "exactly what was in the training data" (the source footage), and I could just as well argue that "like humans extract the important information from large dumps of data, the algorithm does the same".
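To make that premise concrete, here is a minimal sketch (a JPEG round-trip stands in for H.264 since the principle is the same, and it assumes Pillow and NumPy are available): the decoded output is never bit-identical to the original, yet it is obviously derived from it.

```python
# Minimal sketch, not a claim about any specific codec: a lossy round-trip
# (JPEG here, standing in for H.264) does not give back the original bytes,
# yet the result is plainly derived from them.
import io

import numpy as np
from PIL import Image

# Synthetic 256x256 RGB "frame": gradients plus a sine pattern for texture.
yy, xx = np.mgrid[0:256, 0:256]
frame = np.stack([
    xx,
    yy,
    127 + 80 * np.sin(xx / 7.0) * np.sin(yy / 5.0),
], axis=-1).astype(np.uint8)
original = Image.fromarray(frame, mode="RGB")

# Encode lossily, then decode again.
buf = io.BytesIO()
original.save(buf, format="JPEG", quality=75)
decoded = Image.open(io.BytesIO(buf.getvalue())).convert("RGB")

a = np.asarray(original, dtype=np.int16)
b = np.asarray(decoded, dtype=np.int16)

print("bit-identical:", np.array_equal(a, b))          # expected: False
print("mean abs pixel error:", np.abs(a - b).mean())   # expected: small relative to 0-255
```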
I have no reservations about calling an H.264-encoded video redistributed with the wrong attribution "plagiarism", so I don't see what makes Large X Models so different that they deserve a special pass.