Sam claims LLMs aren't sufficient for AGI (rightfully so).
Ilya claims the transformer architecture, with some modification for efficiency, is actually sufficient for AGI.
Obviously transformers are the core component of LLMs today, and the devil is in the details (a future model may resemble the transformers of today, while also being dynamic in terms of training data/experience), but the jury is still out.
In either case, publicly disagreeing on the future direction of OpenAI may be indicative of deeper problems internally.
How the hell can people be so confident about this? You describe two smart people reasonably disagreeing about a complicated topic.
Given that AGI means reaching "any intellectual task that human beings can perform", we need a system that can go beyond lexical reasoning and actually contribute (on its own) to advancing our total knowledge. Anything less isn't AGI.
Ilya may be right that a super-scaled transformer model (with additional mechanics beyond today's LLMs) will achieve AGI, or he may be wrong.
Therefore something more than an LLM is needed to reach AGI; what that is, we don't yet know!
Without persistence outside of the context window, LLMs can't even maintain a dynamic, stable higher-level goal.
Whether you can bolt something small to these architectures for persistence and do some small things and get AGI is an open question, but what we have is clearly insufficient by design.
I expect it's something in between: our current approaches are fertile ground for improving towards AGI, but it's also not a trivial further step to get there.
I mean, can't you say the same for people? We are easily confused and manipulated, for the most part.
I can reason about something and then combine it with something I reasoned about at a different time.
I can learn new tasks.
I can pick a goal of my own choosing and then still be working towards it intermittently weeks later.
The GPT-style LLMs we have now cannot do these things. Adding those abilities may be a small change, or may not be tractable for these architectures at all... but it's probably in between: hard, but something that can be "tacked on."
I am most probably anthropomorphizing completely wrong. But the point is that humans may not be any more creative than an LLM; we may just have better computation and inputs. Maybe creativity is akin to LLM hallucinations.
I would also say that long-term, goal-oriented behavior isn't well represented in the training data. We have stories about it, sometimes, but a model would need to map its own state onto these stories to learn anything from them about what to do next.
I feel like LLMs are much smarter than we are in thinking "per symbol", but we have facilities for iteration and metacognition and saving state that let us have an advantage. I think that we need to find clever, minimal ways to build these "looping" contexts.