zlacker

Doubt it. Code will be generated to pass the tests, not to satisfy the intent behind the tests.

replies(4): >>daxfoh+sn >>krashi+Ky >>andrew+yN >>Art968+qR1

>>utopia+(OP)
A million times, this. Sometimes they luck into the intent, but much more frequently they end up in a ball of mud that just happens to pass the tests.

"8 unit tests? Great, I'll code up 8 branches so all your tests pass!" Of course that neglects the fact that there are now actually 2^8 paths through your code.

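To put rough numbers on the 8 tests vs. 2^8 paths point, here is a minimal sketch. The two functions and the "each enabled option adds its own weight" intent are made up for illustration, not taken from any real codebase. Each of the 8 tests turns on exactly one option, and the overfit version passes all of them while diverging from the assumed intent on most combinations.

```python
from itertools import product

N = 8  # hypothetical: eight independent options, one unit test per option


def intended_total(flags):
    # Assumed intent: every enabled option contributes its own weight (index + 1).
    return sum(i + 1 for i, on in enumerate(flags) if on)


def overfit_total(flags):
    # One branch per test: returns as soon as the first enabled option is seen,
    # which is all the single-option tests ever exercise.
    for i, on in enumerate(flags):
        if on:
            return i + 1
    return 0


# The eight unit tests: exactly one option enabled in each.
tested = [tuple(i == j for j in range(N)) for i in range(N)]

# Every on/off combination of the eight options is a distinct path.
all_paths = list(product([False, True], repeat=N))

passing = all(overfit_total(f) == intended_total(f) for f in tested)
mismatches = [f for f in all_paths if overfit_total(f) != intended_total(f)]

print(f"tests pass: {passing}")
print(f"paths covered by tests: {len(tested)} of {len(all_paths)}")
print(f"paths that diverge from the intent: {len(mismatches)}")
```

Every input with two or more options enabled goes wrong here, and none of the 8 tests can see it.
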
>>utopia+(OP)
If you can steer an LLM to write an application based on what you want, you can steer an LLM to write the tests you want. Some people will be better at getting the LLM to write tests, but it's only going to get easier and easier.

>>utopia+(OP)
I think we agree: getting LLMs to understand your intent is the hard part. At the very least you need well-specified tests.

Perhaps more advanced LLMs + specifications + better tests.
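
As one guess at what "better tests" could look like, here is a sketch of a property-style check that states the intent as rules over many inputs instead of a handful of examples. The `dedupe` function, the competing `sorted_unique` implementation, and the helper name `satisfies_intent` are all hypothetical, and the random generation uses only the standard library rather than a dedicated property-testing tool.

```python
import random


def dedupe(items):
    # Assumed intent: drop repeats, keep the first occurrence, preserve order.
    seen = set()
    out = []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out


def sorted_unique(items):
    # Passes an example test like [1, 1, 2] -> [1, 2], but ignores input order.
    return sorted(set(items))


def satisfies_intent(fn, trials=200, seed=0):
    rng = random.Random(seed)
    # One fixed case plus random lists of small integers.
    cases = [[2, 1, 2]]
    cases += [[rng.randint(0, 9) for _ in range(rng.randint(0, 20))]
              for _ in range(trials)]
    for items in cases:
        result = fn(items)
        # Property: unique elements in first-occurrence order.
        expected = sorted(set(items), key=items.index)
        if result != expected:
            return False
    return True


print(satisfies_intent(dedupe))
print(satisfies_intent(sorted_unique))
```

Both implementations agree on `[1, 1, 2]`, so a single example test cannot tell them apart; the property check can.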

>>utopia+(OP)
What makes you think the next-generation models won't be explicitly trained to avoid this and every other pitfall, and to follow best practices, as the low-hanging fruit falls one by one?