zlacker

1. GaggiX+(OP) 2024-02-01 13:56:48
>I honestly have no idea at all what the OP has found or what it means, but it doesn't seem that surprising that modifying the latent results in global changes in the output.

It only happens in one specific spot: https://i.imgur.com/8DSJYPP.png and https://i.imgur.com/WJsWG78.png. The fact that a single spot in the latent has such a huge impact on the whole image is a problem, because the diffusion model treats that area the same as the rest of the latent, without giving it any extra importance. The diffusion loss is applied at the latent level, not the pixel level (so that the VAE decoder's gradient doesn't have to be propagated during diffusion training), which means the diffusion model is unaware of how important that spot is to the resulting image.
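A toy numpy sketch of the distinction (purely illustrative, not the real SD VAE): a "local" decoder where each output pixel depends only on its own latent entry, versus a "global" decoder that also mixes in a latent-wide statistic, loosely mimicking how attention layers in the real decoder can give one latent spot a global receptive field. The decoder functions and the latent shape here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def decode_local(z):
    # Elementwise: each output pixel sees only its own latent entry.
    return np.tanh(z)

def decode_global(z):
    # Adds a global statistic: every output pixel sees all of z.
    return np.tanh(z) + 0.5 * z.mean()

z = rng.normal(size=(8, 8))
z_perturbed = z.copy()
z_perturbed[3, 3] += 5.0  # poke a single latent "spot"

local_delta = np.abs(decode_local(z_perturbed) - decode_local(z))
global_delta = np.abs(decode_global(z_perturbed) - decode_global(z))

print((local_delta > 1e-9).sum())   # only the poked pixel changes (1)
print((global_delta > 1e-9).sum())  # every pixel changes (64)

# Meanwhile, a latent-level MSE loss (as in diffusion training) weights
# all latent positions equally -- nothing tells it that (3, 3) is special:
eps = rng.normal(size=z.shape)
eps_hat = rng.normal(size=z.shape)
latent_loss = ((eps - eps_hat) ** 2).mean()
```

Under this toy setup, one latent entry moves every output pixel of the global decoder, yet contributes only 1/64 of the latent loss, which is the mismatch described above.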

replies(1): >>wokwok+83
2. wokwok+83 2024-02-01 14:13:50
>>GaggiX+(OP)
Not arguing that; I'm just saying I don't know that the KL divergence term is responsible for this, and I haven't seen any compelling argument that increasing the KL weight would fix it.

There's no question the OP found a legit issue. The questions are more like:

1) What caused it?

2) How do you fix it?

3) What result would fixing it actually have?
