zlacker

[parent] [thread] 1 comments
1. mindwo+(OP)[view] [source] 2025-06-03 05:21:53
Not necessarily. If the RL objective is passing tests then in the context of LLMs it means "correct", or at least "correct based on the tests".
replies(1): >>otabde+o
2. otabde+o[view] [source] 2025-06-03 05:25:58
>>mindwo+(OP)
Unfortunately that doesn't solve the problem in any way. We don't have an Oracle machine for testing software.

If we did, we could autogenerate code even without an LLM.

[go to top]