zlacker

[return to "Imagen, a text-to-image diffusion model"]
1. benwik+L6 2022-05-23 21:29:19
>>kevema+(OP)
Would be fascinated to see the DALL-E output for the same prompts as the ones used in this paper. If you've got DALL-E access and can try a few, please put links as replies!
◧◩
2. qclibr+X9 2022-05-23 21:46:29
>>benwik+L6
See the paper here: https://gweb-research-imagen.appspot.com/paper.pdf, Section E: "Comparison to GLIDE and DALL-E 2"
◧◩◪
3. thorum+Gq 2022-05-23 23:40:32
>>qclibr+X9
Imagen seems better at capturing details/nuance from the prompt, but subjectively the DALL-E 2 images feel more “real” to me. Not sure why. Something about the lighting?
◧◩◪◨
4. ravi-d+RJ 2022-05-24 02:35:48
>>thorum+Gq
That feels about right. Imagen has a better text-processing model, so it can tease apart the prompt, but DALL-E has a rocking image part.