zlacker
1. energy+(OP)
2025-05-06 15:53:40
It probably increases scores in RL training, since it's a kind of locally specific reasoning that would reduce bugs.
That means if you try to force it to stop, the code quality will drop.