Imagen conditions on text embeddings, while the OpenAI model (DALL-E 2) conditions on image embeddings instead; that's the reason. There are other models that can render text in images: latent diffusion trained on LAION-400M, GLIDE, and DALL-E (1).
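Rough sketch of the difference (toy shapes and stand-in modules, not the real architectures): an Imagen-style model sees one embedding per caption token, while an unCLIP-style decoder sees a single pooled vector for the whole image, which has already compressed away exact spelling.

    import torch

    caption_tokens = torch.randint(0, 32000, (1, 16))  # 16 token ids for a caption
    text_encoder = torch.nn.Embedding(32000, 512)      # stand-in for a real text encoder (e.g. T5)
    text_seq = text_encoder(caption_tokens)            # (1, 16, 512): one vector PER TOKEN

    image_embedding = torch.randn(1, 512)              # (1, 512): ONE vector for the whole image

    print(text_seq.shape)         # torch.Size([1, 16, 512]) - spelling info survives per token
    print(image_embedding.shape)  # torch.Size([1, 512])     - spelling pooled away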
>>GaggiX+(OP)
My understanding of the terms text and image embeddings is that they are ways of representing text or images as vectors. But I don't understand how that would help with the process of actually drawing the symbols for those letters.
>>ALittl+6b
If the model takes text embeddings/tokens as input, it can learn a connection between the caption and the text that appears in the image (in the training data the two are often nearly identical). An image embedding, by contrast, has already thrown away the exact spelling, so there is nothing for the decoder to copy the letters from.
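Toy illustration (simplified, assumed shapes) of why the per-token sequence matters: cross-attention lets each spatial position of the image being generated attend to individual caption tokens, including the ones spelling the word it has to draw. A single pooled vector offers no such sequence to attend over.

    import torch
    import torch.nn.functional as F

    d = 512
    image_latents = torch.randn(1, 64, d)  # 64 spatial positions being denoised (queries)
    text_seq = torch.randn(1, 16, d)       # 16 caption-token embeddings (keys/values)

    # Standard attention: each spatial position "looks at" individual caption tokens.
    weights = F.softmax(image_latents @ text_seq.transpose(1, 2) / d**0.5, dim=-1)  # (1, 64, 16)
    out = weights @ text_seq                                                        # (1, 64, 512)
    print(out.shape)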