zlacker

[return to "Qwen3-Coder-Next"]
1. zamada+a5 2026-02-03 16:22:56
>>daniel+(OP)
Can anyone help me understand the "Number of Agent Turns" vs "SWE-Bench Pro (%)" figure? I.e., what does the spread of Qwen3-Coder-Next from ~50 to ~280 agent turns represent for a fixed score of 44.3%? Is it that the model sometimes takes anywhere in that range of agent turns to reach that same fixed score?
2. edude0+S6 2026-02-03 16:29:07
>>zamada+a5
Essentially, the more turns you have, the more likely the agent is to fail, since the error compounds per turn. Agentic models are tuned for “long horizon tasks”, i.e. being able to go many, many turns on the same problem without failing.
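To make the compounding point concrete, here is a rough sketch of the arithmetic. It assumes, purely for illustration, that every turn succeeds independently with the same probability, which real agents almost certainly do not satisfy. The function name and the 0.99 per-turn figure are hypothetical, not taken from the Qwen3-Coder-Next report.

```python
def task_success_probability(per_turn_success: float, turns: int) -> float:
    """Chance that every one of `turns` turns succeeds.

    Assumption (not from the benchmark): turns are independent and share
    the same success probability, so the chances multiply.
    """
    if not 0.0 <= per_turn_success <= 1.0:
        raise ValueError("per_turn_success must be between 0 and 1")
    if turns < 0:
        raise ValueError("turns must be non-negative")
    return per_turn_success ** turns


# Hypothetical per-turn success rate, evaluated at both ends of the
# ~50 to ~280 turn spread mentioned in the question.
for turns in (50, 280):
    chance = task_success_probability(0.99, turns)
    print(f"{turns} turns at 0.99 per turn: {chance:.3f}")
```

Under this toy model even a small per-turn error rate adds up quickly over hundreds of turns, which is one way to read why staying reliable over long horizons is something models get tuned for.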