Llama.ttf: A font which is also an LLM

>>fugled+(OP)
Very much inspired by this earlier Hacker News post, which put Tetris into a font, today we put an LLM and an inference engine into a font so you can chat with your font, or write stuff with your font without having to write stuff with your font.

>>40737961

>>xrd+Vd
This reminds me of Posy. His channel is so fun, weird, and captivating. https://youtube.com/@posymusic

>>hsfzxj+pc
https://github.com/fuglede/llama.ttf

>>wiradi+Ef
Since it only alters the presentation of the text, not the text/data itself, maybe using a type of image-to-text tool like this could work: https://www.imagetotext.info/

I guess that’s the closest you get to copying.

>>binwie+zk
That's LLaMa-3-70B. The demo he gives at 6:09 is tinystories-15m, which is 30.4MB, so you'd only have to add the font to that (~80KB?)

https://huggingface.co/nickypro/tinyllama-15M/tree/main

>>xrd+Vd
On the esoteric software engineering side, Tom7 is the channel you're looking for! https://www.youtube.com/@tom7

>>xg15+Qh
Considering the actual complexity of rendering e.g. Urdu in a decent, native-looking way you presumably do want some Turing-complete capabilities at least in some cases, cf "One handwritten Urdu newspaper, The Musalman, is still published daily in Chennai.[232] InPage, a widely used desktop publishing tool for Urdu, has over 20,000 ligatures in its Nastaʿliq computer fonts." (https://en.wikipedia.org/wiki/Urdu#Writing_system)

Edit: the OP uses this exact use case, Urdu typesetting, to justify WASM in HarfBuzz (video around 6:00); seems like Urdu has really become the poster child for typographic complexity these days

>>electr+4e
(Sadly) this is nothing new. Years ago I wrangled a (modified) bug in the font rendering of Firefox [1, 2016] into an exploit (for a research paper). Short version: the Graphite2 font rendering engine in FF had/has? a stack machine that can be used to execute simple programs during font rendering. It sounded insane to me back then, but I dug into it a bit. Turns out while rendering Roman based scripts is relatively straightforward [2], there are scripts that need heavy use of ligatures etc. to reproduce correctly [3]. Using a basic scripting (heh) engine for that does make some sense.

Whether this is good or bad, I have no opinion on. It is "just" another layer of complexity and attack surface at this point. We have programmable shaders, rowhammer, speculative execution bugs, data timing side channels, kernel level BPF scripting, prompt injection and much more. Throwing WASM based font rendering into the mix is just balancing more on top of the pile. After some years in the IT security area, I think there are so many easier ways to compromise systems than these arcane approaches. Grab the data you need from a public AWS bucket or social engineer your access, far easier and cheaper.

For what it's worth, I think embedded WASM is a better idea than rolling your own ecosystem for scripting capabilities.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1248876

[2] I know, there are so many edge cases. I put this in the same do not touch bucket as time and names.

[3] https://scripts.sil.org/cms/scripts/page.php?id=cmplxrndexam...
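To make the "stack machine for ligatures" idea above concrete, here is a minimal sketch. The instruction set (`push`, `glyph`, `eq`, `and`) and the helper names are invented for illustration and are not Graphite2's actual opcodes or API. It only shows the general shape: a tiny program runs on a stack during shaping and decides whether a substitution applies.

```python
def run(program, glyphs):
    """Run a toy stack program against a glyph window; return the final stack."""
    stack = []
    for op, *args in program:
        if op == "push":          # push a literal operand
            stack.append(args[0])
        elif op == "glyph":       # pop an index, push the glyph at that index
            stack.append(glyphs[stack.pop()])
        elif op == "eq":          # pop two values, push 1 if they are equal
            b, a = stack.pop(), stack.pop()
            stack.append(int(a == b))
        elif op == "and":         # pop two flags, push 1 if both are set
            b, a = stack.pop(), stack.pop()
            stack.append(int(bool(a) and bool(b)))
        else:
            raise ValueError(f"unknown op: {op}")
    return stack


# Hypothetical rule: the window matches when glyph 0 is "f" and glyph 1 is "i".
FI_RULE = [
    ("push", 0), ("glyph",), ("push", "f"), ("eq",),
    ("push", 1), ("glyph",), ("push", "i"), ("eq",),
    ("and",),
]


def apply_fi_ligature(text):
    """Replace each 'fi' pair with the single ligature glyph U+FB01."""
    out = []
    i = 0
    while i < len(text):
        window = text[i:i + 2]
        if len(window) == 2 and run(FI_RULE, window) == [1]:
            out.append("\ufb01")  # one ligature glyph replaces two characters
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

A real engine has far more opcodes and operates on glyph IDs and positions rather than characters, but the attack-surface point is the same: the font supplies the program, and the renderer executes it.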

>>simonw+Ui
Your comment reminded me of this great talk [1] (humor ofc). While it talks about asm.js, WASM is in many ways, IMO, the continuation of asm.js

[1] https://www.destroyallsoftware.com/talks/the-birth-and-death...

>>xrd+Vd
Adult Swim's Off the Air

https://www.adultswim.com/videos/off-the-air

or on YouTube https://www.youtube.com/playlist?list=PLQl8zBB7bPvLWfGCVicg_...

>>tcsenp+BJ
Another font connoisseur put together a script here that might be helpful: https://github.com/hsfzxjy/Harfbuzz-WASM-Fantasy/blob/master...

>>erk__+7H
And thank goodness it’s disabled, or we could have another JBIG2 https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-i...

>>fugled+(OP)
The page links to https://www.coderelay.io/fontemon.html which is a game embedded into a font. Playable in the browser.

>>fugled+(OP)
After your help and troubleshooting, I am happy to notify you that your work has been archived (https://archive.tunnelsenpai.win/archive/1719179042.512455/i... and in the Internet Archive). Thanks!

>>btown+Kn
Also Paralogical https://youtube.com/@paralogical-dev

>>Xlythe+4j
"animated fonts" - not really; all meaningful applications not only calculate shaping once, they also aggressively cache the result (mentioned in https://robert.ocallahan.org/2024/06/browser-engine.html)

But things like this might be possible (for now): https://gwern.net/dropcap
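A minimal sketch of why caching rules out animation, assuming a hypothetical `shape` function standing in for a real shaper (this is not HarfBuzz's API, and the cache size is arbitrary): once the result for a given font and string is cached, repaints never re-enter the font's code, so the font gets no chance to produce a different frame.

```python
from functools import lru_cache

shape_calls = 0  # counts how often the (hypothetical) shaper actually runs


@lru_cache(maxsize=1024)
def shape(font_name: str, text: str) -> tuple:
    """Hypothetical stand-in for a shaper: maps text to a tuple of glyph IDs."""
    global shape_calls
    shape_calls += 1
    # Toy "shaping": one glyph ID per code point.
    return tuple(ord(ch) for ch in text)


def render_frames(font_name: str, text: str, frames: int) -> list:
    """Repaint the same text several times, as an application redrawing would."""
    return [shape(font_name, text) for _ in range(frames)]


frames = render_frames("llama.ttf", "hello", 60)
```

Here 60 repaints trigger a single call into the shaper; every later frame is served from the cache and is identical to the first.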

>>yjftsj+sw1
> If you're on Windows you should try the powertools OCR tool.

Which is open source (MIT-licensed), the source code is here: https://github.com/microsoft/PowerToys/tree/main/src/modules...

It is written in C#, and uses the Windows.Media.Ocr UWP API to do the actual OCR part: https://learn.microsoft.com/en-us/uwp/api/windows.media.ocr?... – so if your app runs on Windows it can potentially call the same API and get OCR for free

Apple provides OCR through the VisionKit ImageAnalyzer API – https://developer.apple.com/documentation/visionkit/imageana... – albeit that is only officially supported to call from Swift (although apparently you can expose it to Objective-C if you write a "proxy Swift framework", i.e. a custom Swift framework that wraps the original and adds @objc everywhere; I assume such a proxy framework could be autogenerated using reflection, but I'm not sure if anyone has written a tool that actually does that). There is also the older VNRecognizeTextRequest API, which is supported by Objective-C, but its OCR quality is inferior.

I'm not sure what the best answer for Linux or Android is. I guess https://github.com/tesseract-ocr/tesseract ?

>>xrd+Vd
Depending on your kind of weird, this might interest you: This Exists on YouTube https://youtube.com/@thisexists/videos

>>UncleO+6X2
https://webassembly.org/docs/security/
