zlacker

1. eloisi+(OP) 2025-12-06 07:25:57
I also doubt that’s what’s happening, but it’s not beyond the pale. They could be feeding both the input video and the audio/transcript into their transformer, and it may have learned that “when the audio is talking about lips, the person is usually puckering their lips for the camera,” so it regurgitates that.