>>utopia+(OP)
A million times, this. Sometimes they luck into the intent, but much more frequently they end up in a ball of mud that just happens to pass the tests.
"8 unit tests? Great, I'll code up 8 branches so all your tests pass!" Of course that neglects the fact that there's now actually 2^8 paths through your code.
>>utopia+(OP)
if you can steer an LLM to write an application based on what you want, you can steer an LLM to write the tests you want. Some people will be better at getting the LLM to write tests, but it's only going to get easier and easier
>>utopia+(OP)
What makes you think the next generation models won't be explicitly trained to prevent this, or any other pitfall or best practice as the low hanging fruit fall one by one?