[return to "Google Imagen 2"]
1. apsec1+H8 2023-12-13 15:39:28
>>geox+(OP)
This would have been an epic release two years ago, but there are now many well-established models in this area (DALL-E, Midjourney, Stable Diffusion). It would be great to see some comparisons or benchmarks to show Imagen 2 is a better alternative. As it stands, it's hard for me to tell if this is worth switching to.
2. Mashim+qb 2023-12-13 15:49:52
>>apsec1+H8
> it's hard for me to tell

I can only compare it to Stable Diffusion, but Imagen 2 seems significantly more advanced.

Try to do anything with text in SDXL. It's not easy, and it often messes up. I don't think you can get a clean logo with multiple text areas in SDXL.

Look at the prompt and image of the robin. That is mighty impressive.

3. averev+8d 2023-12-13 15:56:48
>>Mashim+qb
Yeah, Stable Diffusion has a very limited understanding of composition instructions. You can reliably get things drawn, but it's super hard to get a specific thing in a specific place (e.g. "a man with blonde hair near a girl with black hair" is going to assign hair color more or less randomly, and there's no guarantee of how many people will be in the picture). Regional prompting and ControlNet help somewhat, but regional prompting is very unreliable and ControlNet is, well, not text-to-image.

DALL-E 3 gets things right most of the time.
