zlacker

1. ants_e+(OP) 2024-02-01 14:02:18
> 64x64x4

I'm curious why 4? Is this just what works in practice, or do the 4 channels have known interpretations?

replies(1): >>Sharli+T5
2. Sharli+T5 2024-02-01 14:32:59
>>ants_e+(OP)
I'm not sure why 4 was chosen (maybe just because it's a power of two?), but a while ago an SD user found that RGB can be approximated fairly well with a simple linear combination of the latents: https://discuss.huggingface.co/t/decoding-latents-to-rgb-wit...
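
To make "linear combination of the latents" concrete, here is a rough NumPy sketch. It assumes the latents come as a (4, H, W) float array and that you have an RGB image already downscaled to the same H x W to fit against. The function names are made up for illustration, and the 4x3 weight matrix is fitted by least squares here rather than copied from the linked post, so treat it as one way to get such a matrix, not the method that post used.

```python
import numpy as np


def fit_weights(latents, rgb):
    """Least-squares fit of a 4x3 matrix mapping latent channels to RGB.

    latents: (4, H, W) float array.
    rgb:     (H, W, 3) float array at the same spatial resolution (assumption).
    """
    X = latents.reshape(4, -1).T   # one row per pixel, 4 latent values each
    Y = rgb.reshape(-1, 3)         # one row per pixel, 3 colour values each
    weights, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return weights                 # shape (4, 3)


def latents_to_rgb(latents, weights):
    """Project (4, H, W) latents to an (H, W, 3) uint8 preview image."""
    # rgb[h, w, k] = sum over c of latents[c, h, w] * weights[c, k]
    rgb = np.einsum("chw,ck->hwk", latents, weights)
    # Min/max rescale to 0..255; a display choice made for this sketch only.
    lo, hi = rgb.min(), rgb.max()
    scaled = (rgb - lo) / (hi - lo + 1e-8)
    return (scaled * 255).astype(np.uint8)
```

With a 64x64x4 latent this gives a 64x64 RGB thumbnail from a single matrix multiply per pixel, which is why it is handy as a cheap preview without running the full decoder.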