zlacker

[return to "Which programming languages are most token-efficient?"]
1. johnis+RM 2026-01-12 08:46:34
>>tehnub+(OP)
Concatenative languages like Factor and Forth are close to optimal for raw lexical density: no parentheses, no commas, no argument delimiters, just whitespace-separated words. Stack shuffling can add overhead for complex data flow, though Factor's locals mitigate that.

C is surprisingly efficient as well. Minimal keywords, terse syntax, single-character operators. Not much boilerplate, and the core logic is dense.

I think the least token-efficient are Java and C# (boilerplate, long identifiers) and Rust (lifetime annotations, verbose generics).

In my opinion: C or Go for imperative code; Factor or Forth if the model knows them well.

2. Smaug1+JO 2026-01-12 09:01:31
>>johnis+RM
Is that claim about C based on anything in particular? C ranked 18th of all the languages in the article's chart (dead last), which I'd guess was due to its minimal standard library.
3. johnis+OV 2026-01-12 10:00:03
>>Smaug1+JO
Fair point. There's a distinction between syntactic efficiency (C is terse) and task-completion efficiency (what the benchmark likely measured). If the tasks involved string manipulation, hash maps, JSON, etc., then C pays a massive token tax because you're implementing what other languages provide in their standard libraries. Python has dict and json.loads(); C has malloc and strcmp.

So: C tokenizes efficiently for equivalent logic, but stdlib poverty makes it expensive for typical benchmark tasks. The same applies to Factor and Forth, arguably even more so.
