If I train on one image I can get it right back out. Even two? Maybe even a thousand? I'm not sure where the line is between OK and not OK, but there has to be some answer.
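Toy sketch of that first point (plain numpy, nothing from SD's actual training code): a model with enough capacity, fit on a single example, converges to that example exactly. The 16-float vector just stands in for one training image.

```python
import numpy as np

rng = np.random.default_rng(0)
target = rng.random(16)        # the lone "training image"
w = np.zeros(16)               # model parameters

for _ in range(1000):
    w -= 0.1 * (w - target)    # gradient step on 0.5*||w - target||^2

print(np.allclose(w, target))  # True: the training data comes right back out
```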
Which is why this gets framed as compression: it implies that SD fundamentally makes copies instead of (re)creating art. Leaving aside the issue of recreating forgeries of existing works, using the training data to create new pieces should fall well within the bounds of appropriation. Demanding anything more than filtering SD's output for 1:1 reproductions of the training data is really pushing it.
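Rough sketch of what that output filter could look like, just to show the shape of the idea. Nothing here is from any real SD pipeline; the 8x8 average hash and the threshold of 5 are assumptions, and a real system would use a proper perceptual hash or embedding lookup over the whole training set:

```python
from PIL import Image

def average_hash(img: Image.Image) -> int:
    """Crude 64-bit perceptual hash: 8x8 grayscale, bit = pixel > mean."""
    small = img.convert("L").resize((8, 8), Image.LANCZOS)
    pixels = list(small.getdata())
    avg = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | int(p > avg)
    return bits

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

def looks_like_copy(generated: Image.Image, training: Image.Image,
                    threshold: int = 5) -> bool:
    """Flag a generated image as a near-1:1 reproduction of a training image."""
    return hamming(average_hash(generated), average_hash(training)) <= threshold
```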
edit: Checksums aren't necessarily unique, btw. See "hash collisions".
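Toy demonstration of that edit, with a deliberately weak checksum (a byte sum) so the collision is easy to show. Real hashes like SHA-256 just make collisions astronomically rare, not impossible, since any fixed-size output has fewer values than possible inputs (pigeonhole principle):

```python
def byte_sum_checksum(data: bytes) -> int:
    # Deliberately weak: order-insensitive, so reorderings collide.
    return sum(data) % 65536

a = b"stable diffusion"
b = b"diffusion stable"
print(a == b)                                        # False: different data
print(byte_sum_checksum(a) == byte_sum_checksum(b))  # True: same checksum
```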
Regarding your edit: what are the chances of a "hash collision" where the "hashes" are two MP4 files for two different movies? Seems wildly astronomical... impossible, even? That's why this hash method is so special. Plus there's the built-in preview feature you can use to validate your hash against the source material, even without access to the original.
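Taking the collision question at face value for a real fixed-size hash (rather than a "hash" that is the whole MP4): a back-of-the-envelope birthday bound for SHA-256, where the trillion-file corpus size is just an assumed number for scale:

```python
# Birthday bound for a 256-bit hash: P(any collision among n inputs)
# is roughly n^2 / 2^257.
n = 10**12                 # assume a trillion distinct files
p = n * n / 2**257
print(f"{p:.1e}")          # ~4.3e-54: "wildly astronomical" is fair
```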
Pretty sure this is nitpicking about an overused analogy though.
So it would be quite easy to make a trademark laundering operation, in theory.