Which is why this is framed as compression, it implies that fundamentally SD makes copies instead of (re)creating art. Leaving out the issue of recreating forgeries of existing works, using the training data for the creation of new pieces should be well covered inside the bounds of appropriation. Demanding anything more then filtering the output of SD for 1:1 reproductions of the training data is really pushing it.
edit: Checksums arent necessarily unique btw. See "Hash collisions".
Regarding your edit, what are the chances of a "hash collision" where the hash is two MP4 files for two different movies? Seems wildly astronomical.. impossible even? That's why this hash method is so special, plus the built in preview feature you can use to validate your hash against the source material, even without access to the original.
Pretty sure this is nitpicking about an overused analogy though.