zlacker

1. wrs (OP) 2026-01-01 04:02:27
Now that we know code is a killer app for LLMs, why would we keep tokenizing code as if it were human language? I would expect that someone is already fixing their tokenizer to densify existing code patterns for upcoming training runs (and to make the resulting tokens more semantically aligned).
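To make "densify" concrete, here is a toy sketch, not how any real tokenizer is built. It assumes a greedy longest-match tokenizer and two made-up vocabularies (`PROSE_VOCAB` and `CODE_VOCAB` are hypothetical names, and so is everything in them); the only point is that the same snippet costs far fewer tokens once common code patterns get their own entries.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization; unknown text falls back to single characters."""
    max_len = max(len(tok) for tok in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical vocabularies, invented for illustration only.
PROSE_VOCAB = {"for", "in", "range", "return", " "}
CODE_VOCAB = PROSE_VOCAB | {"for i in range(", "):\n", "    return "}

snippet = "for i in range(n):\n    return i"

prose_tokens = tokenize(snippet, PROSE_VOCAB)
code_tokens = tokenize(snippet, CODE_VOCAB)

print(len(prose_tokens), prose_tokens)
print(len(code_tokens), code_tokens)
```

Real tokenizers learn their merges from data rather than from a hand-written list, so presumably the practical version of this is a training-corpus and pre-tokenization question more than a vocabulary-editing one.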