zlacker

[return to "The Illusion of Thinking: Strengths and limitations of reasoning models [pdf]"]
1. antics+4Q[view] [source] 2025-06-07 01:56:26
>>amrrs+(OP)
I think the intuition the authors are trying to capture is that they believe the models are omniscient, but also dim-witted. And the question they are collectively trying to ask is whether this will continue forever.

I've never seen this question quantified in a really compelling way, and while interesting, I'm not sure this PDF succeeds, at least not well enough to silence dissent. I think AI maximalists will continue to think that the models are in fact getting less dim-witted, while the AI skeptics will continue to think these apparent gains are in fact entirely a byproduct of "increasing" "omniscience." The razor will have to be a lot sharper before people start moving between these groups.

But, anyway, it's still an important question to ask, because omniscient-yet-dim-witted models terminate at "superhumanly assistive" rather than "Artificial Superintelligence", which in turn economically means "another bite at the SaaS apple" instead of "phase shift in the economy." So I hope the authors will eventually succeed.

2. imiric+N41[view] [source] 2025-06-07 05:54:41
>>antics+4Q
> I think the intuition the authors are trying to capture is that they believe the models are omniscient, but also dim-witted.

We keep assigning adjectives to this technology that anthropomorphize the neat tricks we've invented. There's nothing "omniscient" or "dim-witted" about these tools. They have no wit. They do not think or reason.

All Large "Reasoning" Models do is generate data that they use as context to generate the final answer. I.e. they do real-time tuning based on synthetic data.

This is a neat trick, but it doesn't solve the underlying problems that plague these models, like hallucination. If the "reasoning" process contains garbage, gets stuck in loops, etc., the final answer will also be garbage. I've seen sessions where the model approximates the correct answer in the first "reasoning" step, but then sabotages it with senseless "But wait!" follow-up steps. The final answer ends up being a mangled mess of all the garbage it generated in the "reasoning" phase.
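
To make that concrete, here is a minimal sketch (in Python) of what such a loop boils down to. The generate() completion function and the "ANSWER READY" sentinel are hypothetical placeholders, not any vendor's actual API; the point is only that the "reasoning" is ordinary generated text fed back in as context:

    from typing import Callable

    def answer_with_reasoning(
        question: str,
        generate: Callable[[str], str],   # hypothetical stand-in for one model call
        max_steps: int = 5,
    ) -> str:
        # "Reasoning" phase: the model's own output is appended back into the
        # prompt, so each step is conditioned on self-generated (synthetic) text.
        context = "Question: " + question + "\n"
        for _ in range(max_steps):
            step = generate(context + "Next reasoning step:")
            context += "Reasoning: " + step + "\n"
            if "ANSWER READY" in step:    # made-up sentinel: the model says it is done
                break
        # The final answer is conditioned on everything above, including any
        # "But wait!" detours or garbage produced along the way.
        return generate(context + "Final answer:")

If an early step goes off the rails, that bad text sits in the context for every subsequent step, which is exactly the failure mode described above.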

The only reason we keep anthropomorphizing these tools is that it makes us feel good. It's wishful thinking that markets well, gets investors buzzing, and grows the hype further. In reality, we're as close to artificial intelligence as we were a decade ago. What we do have are very good pattern matchers and probabilistic data generators that can leverage the enormous amount of compute we can throw at the problem. That isn't to say this can't be very useful, but ascribing human qualities to it only muddies the discussion.

3. Kon5ol+Ob1[view] [source] 2025-06-07 07:52:49
>>imiric+N41
>They have no wit. They do not think or reason.

Computers can't think and submarines can't swim.

4. intend+Ho3[view] [source] 2025-06-08 12:37:48
>>Kon5ol+Ob1
There are two things going on here.

Output orientation - Is the output similar to what a human would create if they were to think?

Process orientation - Is the machine actually thinking when we say it’s thinking?

I met someone who once drew a circuit diagram from memory. However, they didn’t draw it from inputs, through operations, to outputs. They started drawing from the upper left corner and continued to the lower right, adding lines, triangles and rectangles as needed.

Rote learning can help you pass exams. At some point, the difference between “knowing” how engineering works and being able to apply methods and produce a result becomes meaningless, at least in terms of utility.

This is very much the confusion at play here, so both points are true.

1) These tools do not “think” in any way that counts as human thinking.

2) The output is often the same as what a thinking human would create.

If you are concerned only with the product, then what’s the difference? If you care about the process, then this isn’t thought.

To put it in a different context: if you are a consumer, do you care whether the output was handcrafted by an artisan, or do you just need something that works?

If you are a producer in competition with others, you care whether your competition is selling knockoffs at a lower price.

5. imiric+wI3[view] [source] 2025-06-08 16:15:32
>>intend+Ho3
> If you are concerned only with the product, then what’s the difference?

The difference is substantial. If the machine were actually thinking and understood the meaning of its training data, it would be able to generate correct output based on logic, deduction, and association. We wouldn't need to feed it endless permutations of tokens so that it doesn't trip up when the input data changes slightly. This is the difference between a system with _actual_ knowledge and a pattern-matching system.

The same can somewhat be applied to humans as well. We can all either memorize the answers to specific questions so that we pass an exam, or we can actually do the hard work, study, build out the complex semantic web of ideas in our mind, and acquire actual knowledge. Passing the exam is simply a test of a particular permutation of that knowledge, but the real test is when we apply our thought process to that knowledge and generate results in the real world.

Modern machine learning optimizes for this memorization-like approach, simply because it's relatively easy to implement, and because we now have enough data and compute to produce remarkable results that can fool us into thinking we're dealing with artificial intelligence. We still don't know how to model semantic knowledge in a way that doesn't require extraordinary amounts of resources. I believe classical AI research in the 20th century leaned more in this direction (knowledge-based / expert systems, etc.), but I'm not well versed in the history.

6. intend+245[view] [source] 2025-06-09 10:38:38
>>imiric+wI3
That sentence is from the perspective of someone who cares only about the output.

The people who care about the process have a different take, which I have also explained.
