The LLM has one job, to make code that looks plausible. That's it. There's no logic gone into writing that bit of code. So the bugs often won't be like those a programmer makes. Instead, they can introduce a whole new class of bug that's way harder to debug.
https://news.ycombinator.com/item?id=44163194
https://news.ycombinator.com/item?id=44068943
It doesn't optimize "good programs". It interprets "humans interpretation of good programs." More accurately, "it optimizes what low paid over worked humans believe are good programs." Are you hiring your best and brightest to code review the LLMs?Even if you do, it still optimizes tricking them. It will also optimize writing good programs, but you act like that's a well defined and measurable thing.
> If the model can write code that passes tests
You think tests make code good? Oh my sweet summer child. TDD has been tried many times and each time it failed worse than the last.I'm not saying don't make tests. But I am saying you're not omniscient. Until you are, your tests are going to be incomplete. They are helpful guides, but they should not drive development. If you really think you can test for every bug then I suggest you apply to be Secretary for health.
https://hackernoon.com/test-driven-development-is-fundamenta...
https://geometrian.com/projects/blog/test_driven_development...
Are you saying you're better than that? If you think you're next to perfect then I understand why you're so against the idea that an imperfect LLM could still generate pretty good code. But also you're wrong if you think you're next to perfect.
If you're not being super haughty, then I don't understand your complaints against LLMs. You seem to be arguing they're not useful because they make mistakes. But humans make mistakes while being useful. If the rate is below some line, isn't the output still good?