zlacker

[return to "The Illusion of Thinking: Strengths and limitations of reasoning models [pdf]"]
1. avstee+Sy3 2025-06-08 14:36:02
>>amrrs+(OP)
People are drawing erroneous conclusions from this.

My read is that the paper demonstrates that, for a particular model (and the problems examined with it), giving more thought tokens does not help on problems above a certain complexity. It does not say anything about the capabilities of future, larger models to handle more complex tasks. (NB: humans trend similarly)

My concern is that people are extrapolating from this to conclusions about LLMs generally, and that is not warranted.

The only part of this I find even surprising is the abstract's conclusion (1): that 'thinking' can lead to worse outcomes on certain simple problems. (Again though, maybe you can say humans are the same here. You can overthink things.)

2. lukev+Dz3 2025-06-08 14:42:37
>>avstee+Sy3
You can absolutely extrapolate the results, because what this shows is that even when "reasoning," these models are still fundamentally repeating in-sample patterns, and that they collapse when faced with novel reasoning tasks above a small complexity threshold.

That is not a model-specific claim; it's a claim about the nature of LLMs.

For your argument to hold, there would need to be a qualitative difference, in which some models possess "true reasoning" capability and some don't, and this test only happened to look at the latter.

3. avstee+vB3 2025-06-08 15:01:09
>>lukev+Dz3
The authors don't say anything like this that I can see. Their conclusion specifically identifies these as weaknesses of current frontier models.

Furthermore, we have clearly seen increases in reasoning ability from previous frontier models to current frontier models.

If the authors could/did show that both previous-generation and current-generation frontier models hit a wall at a similar complexity, that would be something; AFAIK they do not.
