Instead they use this as the prompt:
>The robin flew from his swinging spray of ivy on to the top of the wall and he opened his beak and sang a loud, lovely trill, merely to show off. Nothing in the world is quite as adorably lovely as a robin when he shows off - and they are nearly always doing it.
And they show off the result, which is a photograph of a robin. Cool. SDXL[0] can do the exact same thing given the same prompt; in fact, even SD1.5 could do it easily[1].
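For reference, here is a minimal sketch of that comparison using the diffusers library. The model ID is the standard SDXL base checkpoint; the prompt wording is my own paraphrase of the robin passage, not the original poster's exact setup.

```python
# Minimal sketch: generating the "robin showing off" image with SDXL via
# diffusers. Prompt wording is an assumption, not the original prompt.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

prompt = (
    "a robin perched on top of a stone wall beside a spray of ivy, "
    "beak open, singing, photograph"
)
image = pipe(prompt=prompt, num_inference_steps=30).images[0]
image.save("robin.png")
```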
"A flying squirrel gliding between trees": It won't be able to do it. Just telling it "flying squirrel" will often generate squirrels with bat wings coming off their backs.
Ahh, but that's just a tiny, specific thing missing from the data set! Surely that'll get fixed eventually as they add more training data...
"A fox girl hugging a bunny girl hugging a cat girl": The only way to make this work is with fancy stuff like Segment Anything (SAM) working with Stable Diffusion. Alternative prompts of the same thing:
"A fox girl and a bunny girl and a cat girl all hugging each other"
It's such a simple thing; generative AI can make three people hugging each other no problem. However, trying to get it to generate three different types of people in the same scene is really, really hard and largely dependent on luck.
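To make the SAM workaround concrete, here is a rough, untested sketch: generate a base image of three generic figures hugging, segment each figure with SAM, then inpaint each mask with a different character prompt. The checkpoint path, model IDs, and click coordinates are placeholders for illustration only.

```python
# Rough sketch of the SAM + Stable Diffusion workaround. Checkpoint path,
# model IDs, and click coordinates are placeholders, not a tested recipe.
import numpy as np
import torch
from PIL import Image
from diffusers import StableDiffusionPipeline, StableDiffusionInpaintPipeline
from segment_anything import sam_model_registry, SamPredictor

device = "cuda"

# 1. Base image: three generic figures hugging.
base_pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to(device)
base = base_pipe("three girls hugging each other, full body").images[0]

# 2. Segment each figure with SAM (one foreground click per figure).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth").to(device)
predictor = SamPredictor(sam)
predictor.set_image(np.array(base))

clicks = [(130, 260), (256, 250), (380, 260)]  # placeholder coordinates
character_prompts = ["a fox girl", "a bunny girl", "a cat girl"]

# 3. Inpaint each segmented figure with its own character prompt.
inpaint_pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to(device)

result = base
for (x, y), prompt in zip(clicks, character_prompts):
    masks, _, _ = predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),
        multimask_output=False,
    )
    mask = Image.fromarray((masks[0] * 255).astype(np.uint8))
    result = inpaint_pipe(
        prompt=prompt + ", hugging, detailed",
        image=result,
        mask_image=mask,
    ).images[0]

result.save("fox_bunny_cat_hug.png")
```

Even this only works as well as the segmentation and the inpainting cooperate; it is exactly the kind of multi-step scaffolding a single prompt can't replace today.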
It would be a lot easier if AfterDetailer could handle dynamic prompts.