zlacker

[return to "Google Imagen 2"]
1. boh+VY 2023-12-13 18:45:48
>>geox+(OP)
I think the competition for text-to-image services is over, and open source (Stable Diffusion) won. It doesn't matter how detailed (or whatever counts as "better") corporate text-to-image products get; Stable Diffusion is good enough, and good enough really is good enough. Unlike the corporate offerings, open source txt2img doesn't have random restrictions (no, it's not just porn at this point) and actually allows for additional scripts/tooling/models. If you're attempting to do anything on a professional level or produce an image with specific details via txt2img, you likely have a workflow with txt2img being only step one.

Why bother using a product from a company that is notorious for failing to commit to most of their services, when you can run something which produces output that is pretty close (and maybe better) and is free to run and change and train?

2. karmas+C81 2023-12-13 19:22:42
>>boh+VY
Why did Stable Diffusion win? DALL-E 3 and this are miles ahead in understanding a scene and putting the correct text in the right place.

This makes the image much more usable without editing.

3. simonw+Ld1 2023-12-13 19:39:23
>>karmas+C81
DALL-E 3 doesn't have Stable Diffusion's killer feature, which is the ability to use an image as input and influence that image with the prompt.

(DALL-E pretends to do that, but it's actually just using GPT-4 Vision to create a description of the image and then prompting based on that.)

Live editing tools like https://drawfast.tldraw.com/ are increasingly being built on top of Stable Diffusion, and are far and away the most interesting way to interact with image generation models. You can't build that on DALL-E 3.

4. karmas+wn1 2023-12-13 20:28:10
>>simonw+Ld1
Saying SD is losing or not useful isn't my position.

But it clearly didn't win in many scenarios, especially those that require text to be precise, which happens to matter more in commercial settings. Cleaning up the gibberish text generated by OSS Stable Diffusion seems tiring by itself.
