My understanding/experience is that LLM performance in a language scales with how well the language is represented in the training data.
From that assumption, we might expect LLMs to actually do better with an existing language for which more training code is available, even if that language is more complex and seems like it should be “harder” to understand.
(Which is true - it's easy to prompt your LLM with the language grammar, have it generate code and then RL on that)
Easy in the sense of "it is only having enough GPUs to RL a coding capable LLM" anyway.
The point is that your statement about the ability to do RL is wrong.
Additionally your response to the Deepseek paper in the other subthread shows profound and deliberate ignorance.
GPU poor here though...
To quote someone (you...) on the internet:
> More generally, don't ask random people on the internet to do work for you for free.