Given that the model is probabilistic and does many things in parallel, its output can be understood as a mixture: e.g. 30% trash, 60% rehashed training material, 10% reasoning.
People probe the model in different ways, see different results, and draw different conclusions.
E.g. somebody who assumes AI should have impeccable logic will find the "trash" content (e.g. an incorrectly retrieved memory) and declare that the whole AI thing is overhyped bullshit.
Other people might call the model a "stochastic parrot", as they recognize it basically just interpolates between parts of the training material.
Finally, people who want to probe reasoning capabilities might find them among the trash. E.g. people found that LLMs can evaluate non-trivial Python code as long as the code prints intermediate results to output (see the sketch below): https://x.com/GrantSlatton/status/1600388425651453953
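
To make that concrete, here's a toy probe in the same spirit (my own sketch, not the exact snippet from the linked thread): paste a small program like this into the chat and ask the model to act as the Python interpreter, predicting each line of output. The explicit prints of intermediate state are what keep it on track.

    # Illustrative probe: ask the model to predict every line this prints.
    def collatz_steps(n):
        steps = 0
        while n != 1:
            # Apply the Collatz rule: 3n+1 for odd n, n/2 for even n.
            n = 3 * n + 1 if n % 2 else n // 2
            print("intermediate:", n)  # intermediate results anchor the model's "trace"
            steps += 1
        return steps

    print("total steps:", collatz_steps(7))

If the model reproduces the trace correctly for inputs it has almost certainly never seen verbatim, "pure memorization" stops being a convincing explanation.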
I interpret "feel the AGI" (Ilya Sutskever slogan, now repeated by Jan Leike) as a focus on these capabilities, rather than on mistakes it makes. E.g. if we go from 0.1% reasoning to 1% reasoning it's a 10x gain in capabilities, while to an outsider it might look like "it's 99% trash".
In any case, I'd rather trust the intuition of people like Ilya Sutskever and Jan Leike. They aren't trying to sell something, and overhyping the tech is not in their interest.
Regarding "missing something really critical", it's obvious that human learning is much more efficient than NN learning. So there's some algorithm people are missing. But is it really required for AGI?
And regarding "It cannot reason" - I've seen LLMs doing rather complex stuff which is almost certainly not in the training set, what is it if not reasoning? It's hard to take "it cannot reason" seriously from people