This year honestly feels quite stagnant. LLMs are literally technology that can only reproduce the past. They're cool, but they were way cooler 4 years ago. We've taken big ideas like "agents" and "reinforcement learning" and basically stripped them of all meaning in order to claim progress.
I mean, do you remember Geoffrey Hinton's RBM talk at Google in 2010? [0] That was absolutely insane for anyone keeping up with that field. By the mid-twenty teens RBMs were already outdated. I remember when everyone was implementing flavors of RNNs and LSTMs. Karpathy's character 2015 RNN project was insane [1].
This comment makes me wonder if part of the hype around LLMs is just that a lot of software people simply weren't paying attention to the absolutely mind-blowing progress we've seen in this field for the last 20 years. But even ignoring ML, the world's of web development and mobile application development have gone through incredible progress over the last decade and a half. I remember a time when JavaScript books would have a section warning that you should never use JS for anything critical to the application. Then there's the work in theorem provers over the last decade... If you remember when syntactic sugar was progress, either you remember way further back than I do, or you weren't paying attention to what was happening in the larger computing world.
Funny, I've used them to create my own personalized text editor, perfectly tailored to what I actually want. I'm pretty sure that didn't exist before.
It's wild to me how many people who talk about LLM apparently haven't learned how to use them for even very basic tasks like this! No wonder you think they're not that powerful, if you don't even know basic stuff like this. You really owe it to yourself to try them out.
I've worked at multiple AI startups in lead AI Engineering roles, both working on deploying user facing LLM products and working on the research end of LLMs. I've done collaborative projects and demos with a pretty wide range of big names in this space (but don't want to doxx myself too aggressively), have had my LLM work cited on HN multiple times, have LLM based github projects with hundreds of stars, appeared on a few podcasts talking about AI etc.
This gets to the point I was making. I'm starting to realize that part of the disconnect between my opinions on the state of the field and others is that many people haven't really been paying much attention.
I can see if recent LLMs are your first intro to the state of the field, it must feel incredible.
So it is absurdly incorrect to say "they can only reproduce the past." Only someone who hasn't been paying attention (as you put it) would say such a thing.
That is a derived output. That isn't new as in: novel. It may be unique but it is derived from training data. LLMs legitimately cannot think and thus they cannot create in that way.
You’re using ‘derived’ to imply ‘therefore equivalent.’ That’s a category error. A cookbook is derived from food culture. Does an LLM taste food? Can it think about how good that cookie tastes?
A flight simulator is derived from aerodynamics - yet it doesn’t fly.
Likewise, text that resembles reasoning isn’t the same thing as a system that has beliefs, intentions, or understanding. Humans do. LLMs don't.
Also... Ask an LLM what's the difference between a human brain and an LLM. If an LLM could "think" it wouldn't give you the answer it just did.
I imagine that sounded more profound when you wrote it than it did just now, when I read it. Can you be a little more specific, with regard to what features you would expect to differ between LLM and human responses to such a question?
Right now, LLM system prompts are strongly geared towards not claiming that they are humans or simulations of humans. If your point is that a hypothetical "thinking" LLM would claim to be a human, that could certainly be arranged with an appropriate system prompt. You wouldn't know whether you were talking to an LLM or a human -- just as you don't now -- but nothing would be proved either way. That's ultimately why the Turing test is a poor metric.