zlacker

[parent] [thread] 12 comments
1. Boppre+(OP)[view] [source] 2023-09-29 23:52:17
That's exactly why I've been keeping all "duplicates" in my photo collections.

They do take up a lot of space, and just today I asked in photo.stackexchange for backup compression techniques that can exploit inter-image similarities: https://photo.stackexchange.com/questions/132609/backup-comp...

replies(4): >>syntax+i7 >>randyr+6e >>bick_n+iq >>RockRo+jq
2. syntax+i7[view] [source] 2023-09-30 01:09:08
>>Boppre+(OP)
Suggestion: stack the images vertically or horizontally. Frequency spectrum compression schemes like JPG will see the similarity in the fine details.
replies(2): >>bayesi+Aj >>bondar+201
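[The stacking idea can be sanity-checked at the byte level — a rough analogy only, since JPEG codes frequency coefficients rather than raw bytes: if two near-identical blobs sit in one stream, a dictionary coder can reference the first copy when it reaches the second. A minimal stdlib sketch, using random bytes as stand-ins for image data:]

```python
import os
import zlib

# Two "near-duplicate photos", modeled as byte blobs: b is a with a
# few bytes changed (stand-ins for real image data).
a = os.urandom(20_000)
b = bytearray(a)
b[5_000:5_010] = os.urandom(10)
b = bytes(b)

# Compressed separately, the shared content is paid for twice.
separate = len(zlib.compress(a, 9)) + len(zlib.compress(b, 9))

# "Stacked" into one stream, the coder emits back-references into the
# first copy when it encounters the second, so the duplicate is cheap.
stacked = len(zlib.compress(a + b, 9))

print(separate, stacked)
```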
3. randyr+6e[view] [source] 2023-09-30 02:44:03
>>Boppre+(OP)
most duplicates are from the same vantage point. these are not. i.e. you don't need to keep them all.
replies(1): >>beagle+xv
4. bayesi+Aj[view] [source] [discussion] 2023-09-30 04:09:09
>>syntax+i7
I got really good compression using this technique with JPEG XL. I'm sure there's a good reason why it works so well, but it's been a long time and I don't remember it.
5. bick_n+iq[view] [source] 2023-09-30 06:06:10
>>Boppre+(OP)
Tiled/stacked approach as others mention is good, and probably the best approach. Could also try an uncompressed format (even just uncompressed .png) or something simple like RLE, then 7zip them together, since 7zip is the only archive format that does inter-file (as opposed to intra-file) compression as far as I am aware.

Unfortunately, lossless video compression won't help here, as it will compress frames individually in lossless mode.

replies(1): >>adrian+Qr
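[The "7zip them together" effect is what 7-Zip calls solid mode: one compression stream over all files, versus zip-style per-file compression. A sketch of the difference using Python's stdlib `lzma` (the same LZMA family 7z uses) on three synthetic near-identical files:]

```python
import lzma
import os

# Three "similar files": variants of the same base content.
base = os.urandom(50_000)
files = []
for i in range(3):
    f = bytearray(base)
    f[i * 100 : i * 100 + 50] = os.urandom(50)  # small local edits
    files.append(bytes(f))

# zip-style: each file compressed on its own, no inter-file sharing.
per_file = sum(len(lzma.compress(f)) for f in files)

# 7z-style "solid" mode: one stream over all files, so later files
# can be coded largely as references into earlier ones.
solid = len(lzma.compress(b"".join(files)))

print(per_file, solid)
```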
6. RockRo+jq[view] [source] 2023-09-30 06:06:54
>>Boppre+(OP)
Stupid question. Would a block based deduplicating file system solve this?
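[Not stupid — but fixed-block dedup only collapses byte-identical, aligned blocks, so it helps with exact copies, not with separately encoded shots of a similar scene. A toy sketch of the mechanism (hash each block, store unique blocks once):]

```python
import hashlib
import os

BLOCK = 4096

def store(data, block_store):
    """Split data into fixed-size blocks; keep each unique block once."""
    refs = []
    for i in range(0, len(data), BLOCK):
        block = data[i : i + BLOCK]
        digest = hashlib.sha256(block).hexdigest()
        block_store.setdefault(digest, block)  # dedup: store once
        refs.append(digest)
    return refs  # the "file" is just a list of block references

blocks_on_disk = {}
a = os.urandom(10 * BLOCK)
b = a[:5 * BLOCK] + os.urandom(BLOCK) + a[6 * BLOCK:]  # one block differs
store(a, blocks_on_disk)
store(b, blocks_on_disk)

# 20 blocks referenced, but only 11 unique blocks actually stored.
print(len(blocks_on_disk))
```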
7. adrian+Qr[view] [source] [discussion] 2023-09-30 06:29:05
>>bick_n+iq
Inter-file compression has been solved ever since tar|gz
replies(3): >>daniel+is >>tehsau+nu >>beagle+pv
8. daniel+is[view] [source] [discussion] 2023-09-30 06:39:38
>>adrian+Qr
Not even remotely an efficient scheme for images or video.
9. tehsau+nu[view] [source] [discussion] 2023-09-30 07:17:15
>>adrian+Qr
That’s for lossless compression. I think there are special opportunities for multi-image lossy compression.
10. beagle+pv[view] [source] [discussion] 2023-09-30 07:40:37
>>adrian+Qr
Not so. Gzip’s window is very small - 32K in the original gzip iirc - which meant even identical copies of a 33KB file would not help each other.

Iirc it was Bzip2 that bumped that up to 1MB, and there are now compressors with larger windows - but files have also grown, it’s not a solved problem for compression utilities.

It is solved for backup - borg, restic, and a few others will do that across a backup set with no “window size” limit.

…. And all of that is only true for lossless, which does not include images or video.
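[The 33KB example above is easy to reproduce with the stdlib: DEFLATE's 32K window can never reach back to the first copy, while LZMA's multi-megabyte dictionary spans both.]

```python
import lzma
import os
import zlib

# Two identical 33 KB copies, back to back.
copy = os.urandom(33 * 1024)
data = copy + copy

# DEFLATE's 32 KB window: inside the second copy, the matching bytes in
# the first copy are always 33 KB behind - out of reach - so the
# duplicate is coded again from scratch.
deflate_size = len(zlib.compress(data, 9))

# LZMA's default dictionary (megabytes) easily covers both copies, so
# the second one collapses to back-references.
lzma_size = len(lzma.compress(data))

print(deflate_size, lzma_size)
```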

11. beagle+xv[view] [source] [discussion] 2023-09-30 07:42:41
>>randyr+6e
Those have been used for denouncing and super resolution for 30 years now - they are not useless. And storage is cheap, just keep them all.
replies(1): >>beagle+oI
12. beagle+oI[view] [source] [discussion] 2023-09-30 10:54:29
>>beagle+xv
That was supposed to be denoising, not denouncing, DYAC. Just noticed, too late to edit now.
13. bondar+201[view] [source] [discussion] 2023-09-30 13:40:35
>>syntax+i7
>in the fine details

Could it be possible that jpg also exploits the repetition at the wavelength of the width of a single picture, so to say? E.g. 4 pictures side-by-side with the same black dot in the center, can all 4 dots be encoded with a single sine wave (simplifying a lot here..) that has peaks at each dot?
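[Baseline JPEG can't, because its DCT runs on independent 8×8 blocks — repetition at picture-width scale is invisible to a blockwise transform. A 1-D numpy sketch of the difference (FFT used in place of the DCT for simplicity): a dot repeated every 16 samples needs few coefficients under a global transform, but each block pays for its dot separately under a blockwise one.]

```python
import numpy as np

# 1-D stand-in for four pictures side by side: a "dot" every 16 samples.
n, period = 64, 16
signal = np.zeros(n)
signal[::period] = 1.0

# Global transform: the periodic dots collapse into a few bins - one set
# of coefficients covers all four dots at once.
spectrum = np.fft.fft(signal)
nonzero_global = int(np.sum(np.abs(spectrum) > 1e-9))

# Blockwise transform (JPEG-style independent 8-sample blocks): each
# block is coded on its own, so the cross-block repetition is unseen.
blocks = signal.reshape(-1, 8)
block_spectra = np.fft.fft(blocks, axis=1)
nonzero_blockwise = int(np.sum(np.abs(block_spectra) > 1e-9))

print(nonzero_global, nonzero_blockwise)  # global needs fewer coefficients
```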
