Frinkiac – 3M "The Simpsons" Screencaps

>>GlumWo+(OP)
Seems like the search is based only on the transcript/dialogue - not an image embedding. Would be super cool to actually use some CLIP/embedding search on these for a more effective fuzzy lookup.

>>Zee2+Oqb
How would someone go about doing this, just curious?

>>adzm+nxb
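You’d run every picture through CLIP’s image encoder, which turns each frame into an embedding vector. It isn’t quite an image generator run backwards, and it doesn’t generate words describing the input image (been a while since I’ve done this). Text-to-image tools like Stable Diffusion use a CLIP-style text encoder to interpret the prompt; CLIP itself maps both images and text into the same vector space. So you embed the search text with the text encoder and return the frames whose image embeddings sit closest to it, for example by cosine similarity.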

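I’d guess famous characters like Bart and Marge and other Simpsons characters showed up often enough in the training data that the model knows them (that knowledge lives in the model, not the tokenizer), so it would likely recognize them and you could search for them by name.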

Feel free to correct me on small details if anyone has this more fresh in their mind but I’m roughly correct here.
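
A minimal sketch of the lookup step, assuming each frame has already been turned into an embedding vector by a CLIP-style model. The three-dimensional vectors, the frame IDs, and the `cosine` and `search` helpers are all made up for illustration; real CLIP embeddings have hundreds of dimensions, and the query vector would come from the model's text encoder rather than being typed in by hand.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_k=2):
    """Return the top_k (frame_id, score) pairs, best match first."""
    scored = [(frame_id, cosine(query_vec, vec)) for frame_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors standing in for precomputed image embeddings (hypothetical values).
frame_index = {
    "frame_a": [0.9, 0.1, 0.0],
    "frame_b": [0.1, 0.9, 0.1],
    "frame_c": [0.7, 0.3, 0.1],
}

# Stand-in for the embedding of the user's search text (hypothetical values).
query = [1.0, 0.0, 0.0]

results = search(query, frame_index)
print([frame_id for frame_id, _ in results])
```

With 3M frames you would probably not scan every vector per query like this; an approximate nearest-neighbor index is the usual choice, but the ranking idea is the same.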
