And those people are the ones who develop the body of material that later people (and now LLMs) learn from.
A language doesn't have to be unique to still have a particular taste associated with its patterns and idioms, and it would be unfortunate if LLM influence had the effect of suppressing that new style's ability to develop.
The advantage of strongly typed languages for LLMs is rather that compilers catch errors early and give the model automated feedback, so you don't have to.
With weakly typed (and typically interpreted) languages they need to actually run the code, which may be quite slow or simply not realistic.
Simply put, agentic coding loops prefer stronger static analysis capabilities.
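To make the loop concrete, here's a minimal sketch in Haskell (purely illustrative; askModel is a hypothetical stand-in for whatever LLM call you use): typecheck the model's output with GHC and pipe the diagnostics straight back, no execution needed.

    -- Sketch of an agentic feedback loop driven by the typechecker.
    import System.Exit (ExitCode (..))
    import System.Process (readProcessWithExitCode)

    -- Typecheck a file without generating code; errors come back as text.
    typecheck :: FilePath -> IO (Either String ())
    typecheck file = do
      (code, _out, err) <- readProcessWithExitCode "ghc" ["-fno-code", file] ""
      pure $ case code of
        ExitSuccess -> Right ()
        _           -> Left err

    -- Hand compiler diagnostics back to the model until the file
    -- typechecks or we run out of attempts. askModel is hypothetical.
    refine :: (String -> IO String) -> FilePath -> Int -> IO Bool
    refine askModel file attempts
      | attempts <= 0 = pure False
      | otherwise = do
          result <- typecheck file
          case result of
            Right () -> pure True
            Left err -> do
              fixed <- askModel ("Fix these type errors:\n" ++ err)
              writeFile file fixed
              refine askModel file (attempts - 1)

The point is the shape, not the details: every iteration gets fast, precise feedback without ever running (or even fully compiling) the code.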
Contrast with the likes of Swift - it's been around for years, but it's so bloated and obscure that coding agents (not just humans) have problems using it fully.
The only problem I’ve ever had: on maybe three occasions total it’s added a return statement, I assume because of the syntax similarity with Ruby.
also, some non-static languages have a habit of least surprise in their codebases -- it's often possible to effectively guess the types flowing through at the call site. needing zero rounds of refactoring feedback is better than needing even one.
But those are exactly the same mistakes most humans make when writing bash scripts, which makes them inherently flaky.
Ask it to write code in a language with strict types, a “logical” syntax with no tricky gotchas, and a compiler which enforces those rules, and while LLMs struggle to begin with, they eventually produce code which is nearly clean and bug-free. It works much better if there is an existing codebase where they can observe and learn from existing patterns.
On the other hand, ask them to write JavaScript or Python and sure, they fly, but they confidently implement code full of hidden bugs.
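For a flavor of what “hidden bug” means: in Python, a lookup that returns None flows silently until it blows up somewhere far away, whereas a strict type system surfaces the missing-value case at the call site. A toy Haskell sketch (the function and data are made up for illustration):

    import qualified Data.Map as Map

    -- Map.lookup returns Maybe String, not a String, so using the
    -- result without handling the missing case is a type error, not
    -- a runtime surprise three calls later.
    userEmail :: Map.Map String String -> String -> String
    userEmail users name =
      case Map.lookup name users of
        Just email -> email
        Nothing    -> "<no email on file>"

Trying to use the Maybe result as a plain String simply won't compile, so that whole class of bug never ships.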
The whole “amount of training data” argument is completely overblown. I’ve seen LLMs do well even with my own made-up DSL. If the rules are logical and you explain them to it and show it existing patterns, they can mostly do alright. Conversely, there is so much bad JavaScript and Python code in their training data that I struggle to get them to produce code in my style in those languages.
I have similar concerns to you - how well a language works with LLMs is indeed an issue we have to consider. But why do you assume that it's the volume of training data that drives this advantage? Another assumption, equally if not more valid IMO, is that languages which have fewer, well-defined, simpler constructs are easier for LLMs to generate.
Languages with sprawling complexity, where edge cases dominate dev time, all but require PBs of training data to be feasible.
Languages that are simple (objectively), with a solid, unwavering mental model, can match LLMs' strengths - and completely leapfrog the competition in accurate code gen.
it seems semi-intuitive to me that a typesafe, functional programming language with referential transparency would be ideal if you could decompose a program into small components and code those.
once you have a referentially transparent function with input/output tests you can spin on that forever until it's solved and then be sure that it works.
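roughly what I mean, sketched in Haskell with QuickCheck (encode/decode are toy examples): the function is pure, the property is the spec, and an agent can loop on generate, run quickCheck, read the counterexample, retry, until it passes.

    import Data.List (group)
    import Test.QuickCheck

    -- A pure run-length encoder: same input, same output, every time,
    -- so any failing test is exactly reproducible.
    encode :: String -> [(Int, Char)]
    encode = map (\g -> (length g, head g)) . group

    decode :: [(Int, Char)] -> String
    decode = concatMap (\(n, c) -> replicate n c)

    -- the input/output spec to spin on: decoding an encoding is identity.
    prop_roundtrip :: String -> Bool
    prop_roundtrip s = decode (encode s) == s

    main :: IO ()
    main = quickCheck prop_roundtrip

because encode is referentially transparent, a green quickCheck run actually means something: no hidden state, no flaky reruns.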