zlacker
1 comments
1. mindwo+(OP)
2025-06-03 05:21:53
Not necessarily. If the RL objective is passing tests, then for the LLM that is what "correct" means, or at least "correct based on the tests".
replies(1):
>>otabde+o
2. otabde+o
2025-06-03 05:25:58
>>mindwo+(OP)
Unfortunately, that doesn't solve the problem in any way. We don't have an oracle machine for testing software.
If we did, we could autogenerate code even without an LLM.