For this to work at all, the data has to be encoded at various levels of feature abstraction, much like a human learning art, just without the input that makes human art interesting: life experience.
I think a more promising avenue for litigating AI plagiarism is to show that the model is strong on a narrow slice of the solution space containing a copyrighted work, but much weaker once you deviate from it. Then you could argue that the model has probably memorized that specific work rather than learned a style or a category.
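To sketch what such a probe might look like (just an illustration, not a litigation-ready test; "gpt2", the placeholder passages, and the helper `avg_log_likelihood` are all mine): score how confident the model is on the suspect passage versus a light paraphrase of it. If the likelihood craters the moment you deviate, that points at memorization of the work rather than mastery of the genre.

```python
# Rough sketch: compare how confidently a causal language model predicts
# a suspected-memorized passage versus a slight deviation from it.
# A large confidence gap suggests the model reproduces that specific
# work rather than a general style. Model and passages are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def avg_log_likelihood(text: str) -> float:
    """Average per-token log-likelihood the model assigns to `text`."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # With labels == input_ids, the returned loss is the mean
        # negative log-likelihood over the sequence.
        out = model(**inputs, labels=inputs["input_ids"])
    return -out.loss.item()

original = "verbatim passage from the suspected work"   # placeholder
deviation = "the same passage, lightly paraphrased"     # placeholder

gap = avg_log_likelihood(original) - avg_log_likelihood(deviation)
print(f"log-likelihood gap: {gap:.3f}")
```

In practice you would calibrate that gap against passages the model definitely never saw, since any model scores fluent original text higher than a clumsy paraphrase; it's the size of the drop relative to that baseline that would carry the argument.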