Why bother using a product from a company that is notorious for failing to commit to most of its services, when you can run something that produces output that is pretty close (and maybe better) and is free to run, change, and train?
This makes the image much more usable without editing.
I guess that turns out not to be as important for end users as you'd think.
Anyway, DeepFloyd/IF has great comprehension. It is straightforward to improve that for Stable Diffusion; I cannot tell you exactly why they haven't tried this.