That's a fair point about tokenizer variance - vocabularies do differ. But the symbols GlyphLang uses are ASCII characters that tokenize as single tokens across the GPT-4, Claude, and Gemini tokenizers. The optimization isn't model-specific; it targets the common case of "ASCII char = 1 token". I could definitely reword my post though - looking at it more closely, it does read more as "fix-all" rather than "fix-most".
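If anyone wants to spot-check the GPT-4 side of that claim, something like this is how I'd do it with tiktoken - the symbol list below is just a placeholder, not the actual GlyphLang set, and it obviously says nothing about Claude's or Gemini's tokenizers:

```python
# Quick sanity check on "ASCII char = 1 token", GPT-4 side only.
# Placeholder symbols, not GlyphLang's real vocabulary.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # GPT-4 / GPT-3.5-turbo encoding

for sym in ["@", "#", "|", "{", "}", ";", "~", "^"]:
    ids = enc.encode(sym)
    print(f"{sym!r} -> {len(ids)} token(s): {ids}")
```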
Regardless, I'd genuinely be interested in seeing failure cases - it would be really useful to know whether there are specific patterns where symbol density hurts comprehension.
Way back in the GPT-3.5 days I could never get the model to parse even the simplest grammar until I replaced the one-letter production rules with one-word production rules, e.g. S vs Start. A bit like how they couldn't figure out the number of r's in strawberry.
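Roughly the kind of swap I mean (toy grammar, not the one I was actually using):

```python
# Same productions, only the nonterminal names change.
terse_grammar = """\
S -> N V
N -> 'dog' | 'cat'
V -> 'runs' | 'sleeps'
"""

worded_grammar = """\
Start -> Noun Verb
Noun -> 'dog' | 'cat'
Verb -> 'runs' | 'sleeps'
"""

# In my experience only the second form got followed reliably.
print(terse_grammar)
print(worded_grammar)
```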
You want to be as state-free as possible. Your tokenizer should match your vocab and be unambiguous. I think your goal is sound, but you're golfing for the wrong metric.
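By unambiguous I mean something you can actually test, e.g. that adjacent symbols don't merge into different tokens than they'd get on their own. A rough sketch, with a placeholder symbol set and cl100k_base standing in for whatever tokenizer you're targeting:

```python
# If a pair of adjacent symbols encodes differently than the symbols
# encoded separately, the per-character token budget breaks in context.
import itertools
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
symbols = ["@", "#", "|", "{", "}", ";"]  # placeholder vocab

for a, b in itertools.product(symbols, repeat=2):
    merged = enc.encode(a + b)
    separate = enc.encode(a) + enc.encode(b)
    if merged != separate:
        print(f"merges in context: {a + b!r} -> {merged} vs {separate}")
```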