AFAICT this is how it is done (edit: I am wrong, it uses Wasm):
- The frames of the video are simply stored as glyphs in the font
- There is a ligature mapping from sequences of dots to glyphs (for example "." is mapped to glyph 1, ".." is mapped to glyph 2, "..." is mapped to glyph 3, etc.)
- If you use the font in an editable part of the browser and hold the "." key down, autorepeat keeps adding dots, so a growing sequence of dots is inserted. The font's ligature mapping converts this sequence into successive animation frame glyphs, thus showing the animation.
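To make the idea concrete, here is a rough Python sketch of that mapping. It is only a simulation of what I described, not how the actual font works: the names `build_ligatures`, `shape` and `frameN` are made up, and it assumes the shaper prefers the longest matching dot run (in a real font that depends on the order in which the ligatures are listed).

```python
def build_ligatures(frame_count):
    """Map a run of k dots to a (hypothetical) glyph name for frame k."""
    return {"." * k: f"frame{k}" for k in range(1, frame_count + 1)}


def shape(text, ligatures):
    """Greedy longest-match substitution over the input text.

    Assumption: longer ligatures win over shorter ones, which is what a
    font would get by listing the longer sequences first.
    """
    longest = max(len(seq) for seq in ligatures)
    glyphs = []
    i = 0
    while i < len(text):
        # Try the longest candidate run first, then shorter ones.
        for n in range(min(longest, len(text) - i), 0, -1):
            glyph = ligatures.get(text[i:i + n])
            if glyph is not None:
                glyphs.append(glyph)
                i += n
                break
        else:
            # No ligature starts here: pass the character through unchanged.
            glyphs.append(text[i])
            i += 1
    return glyphs


if __name__ == "__main__":
    ligatures = build_ligatures(5)
    # Simulate holding "." down: each extra dot selects the next frame.
    for held in range(1, 6):
        print(held, shape("." * held, ligatures))
```

Each additional dot swaps the single displayed glyph for the next frame, which is all the "playback" amounts to in this scheme.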
I have no idea why Wasm and HarfBuzz are needed (plain ligatures should work in any modern browser without them), but it looks like a fun little experiment.
I wondered about just using "simple" ligatures myself, but I don't know whether it's feasible to statically store several thousand ligature definitions in a font, most of which substitute a run of several thousand characters. But maybe? OpenType has mysterious depths.
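A back-of-the-envelope estimate, as a sketch only: assuming one GSUB ligature record per frame, where frame k is triggered by k dots, and counting just the record itself (2 bytes for the ligature glyph, 2 for the component count, 2 per component after the first). It ignores coverage tables and offset arrays, and the frame counts are made-up examples. The helper name `naive_ligature_bytes` is hypothetical.

```python
def naive_ligature_bytes(frame_count: int) -> int:
    """Rough size of frame_count ligature records, where record k has k components."""
    total = 0
    for k in range(1, frame_count + 1):
        # ligatureGlyph (2 bytes) + componentCount (2 bytes)
        # + (k - 1) component glyph IDs at 2 bytes each
        # (the first component is implied by the coverage table).
        total += 4 + 2 * (k - 1)
    return total


for frames in (100, 1000, 3000):
    print(frames, naive_ligature_bytes(frames))
```

The size grows quadratically, so a few thousand frames lands in the megabytes for the ligature data alone. If I read the spec right, the offsets inside a ligature set are also only 16 bits wide, so the naive layout probably would not fit without some trickery.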