I'm not necessarily against the approach shown here, reducing token counts for more efficient LLM generation; but if it catches on, humans will end up reading and writing it, building debuggers and tooling for it, and so on. It will definitely not remain a perfectly hidden layer underneath.
But for code-generation models, why not just select tokens that map concisely onto existing programming languages? Wouldn't that be just as effective?
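
For a sense of how far current vocabularies already go in that direction, here is a minimal sketch (assuming the tiktoken library and its cl100k_base encoding; the snippet is just an arbitrary example) that prints how a plain Python loop gets segmented into tokens:

    # Sketch: inspect how an existing BPE vocabulary tokenizes ordinary Python.
    # Assumes the tiktoken library; "cl100k_base" is one of its real encodings.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    snippet = "for i in range(10):\n    print(i)"
    token_ids = enc.encode(snippet)

    # Show each token ID next to the text it covers; multi-character pieces
    # (keywords, common punctuation runs) often collapse into a single token.
    for tid in token_ids:
        piece = enc.decode_single_token_bytes(tid).decode("utf-8", "replace")
        print(tid, repr(piece))

    print(f"{len(token_ids)} tokens for {len(snippet)} characters")

If common constructs of an existing language already compress to a handful of tokens, the marginal gain from a new compact syntax may be smaller than it first appears.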