zlacker

[return to "2025: The Year in LLMs"]
1. waldre+T7[view] [source] 2026-01-01 01:03:16
>>simonw+(OP)
Remember, back in the day, when a year of progress was like, oh, they voted to add some syntactic sugar to Java...
◧◩
2. crysta+lt[view] [source] 2026-01-01 05:04:11
>>waldre+T7
That must have been a long time back. Having lived through the time when web pages were served through CGI and mobile phones only existed in movies, when SVMs were the new hotness in ML and people would write about how weird NNs were, I feel like I've seen a lot more concrete progress in the last few decades than this year.

This year honestly feels quite stagnant. LLMs are literally technology that can only reproduce the past. They're cool, but they were way cooler 4 years ago. We've taken big ideas like "agents" and "reinforcement learning" and basically stripped them of all meaning in order to claim progress.

I mean, do you remember Geoffrey Hinton's RBM talk at Google in 2010? [0] That was absolutely insane for anyone keeping up with that field. By the mid-twenty-teens RBMs were already outdated. I remember when everyone was implementing flavors of RNNs and LSTMs. Karpathy's 2015 character-level RNN post was insane [1].

This comment makes me wonder if part of the hype around LLMs is just that a lot of software people simply weren't paying attention to the absolutely mind-blowing progress we've seen in this field for the last 20 years. But even ignoring ML, the worlds of web development and mobile application development have gone through incredible progress over the last decade and a half. I remember a time when JavaScript books would have a section warning that you should never use JS for anything critical to the application. Then there's the work in theorem provers over the last decade... If you remember when syntactic sugar was progress, either you remember way further back than I do, or you weren't paying attention to what was happening in the larger computing world.

0. https://www.youtube.com/watch?v=VdIURAu1-aU

1. https://karpathy.github.io/2015/05/21/rnn-effectiveness/

◧◩◪
3. handof+It[view] [source] 2026-01-01 05:10:13
>>crysta+lt
> LLMs are literally technology that can only reproduce the past.

Funny, I've used them to create my own personalized text editor, perfectly tailored to what I actually want. I'm pretty sure that didn't exist before.

It's wild to me how many people who talk about LLMs apparently haven't learned how to use them for even very basic tasks like this! No wonder you think they're not that powerful. You really owe it to yourself to try them out.

◧◩◪◨
4. crysta+bv[view] [source] 2026-01-01 05:29:42
>>handof+It
> You really owe it to yourself to try them out.

I've worked at multiple AI startups in lead AI engineering roles, both deploying user-facing LLM products and working on the research end of LLMs. I've done collaborative projects and demos with a pretty wide range of big names in this space (but don't want to doxx myself too aggressively), have had my LLM work cited on HN multiple times, have LLM-based GitHub projects with hundreds of stars, and have appeared on a few podcasts talking about AI, etc.

This gets to the point I was making. I'm starting to realize that part of the disconnect between my opinions on the state of the field and others is that many people haven't really been paying much attention.

I can see that if recent LLMs are your first intro to the state of the field, it must feel incredible.

◧◩◪◨⬒
5. Camper+Dv[view] [source] 2026-01-01 05:36:55
>>crysta+bv
That's all very impressive, to be sure. But are you sure you're getting the point? As of 2025, LLMs are now very good at writing new code, creating new imagery, and writing original text. They continue to improve at a remarkable rate. They are helping their users create things that didn't exist before. Additionally, they are now very good at searching and utilizing web resources that didn't exist at training time.

So it is absurdly incorrect to say "they can only reproduce the past." Only someone who hasn't been paying attention (as you put it) would say such a thing.

◧◩◪◨⬒⬓
6. crysta+Kx[view] [source] 2026-01-01 06:07:27
>>Camper+Dv
I think the confusion comes from people's misunderstanding of what 'new code' and 'new imagery' mean. Yes, LLMs can generate a specific CRUD webapp that hasn't existed before, but only by interpolating between the history of existing CRUD webapps. I mean, traditional Markov chains can also produce 'new' text in the sense that "this exact text" hasn't been seen before, but nobody would argue that traditional Markov chains aren't constrained to "only producing the past".
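
To make the Markov chain point concrete, here's a minimal word-level sketch in Python (the corpus and every value in it are made up for illustration): every 'new' sentence it emits is stitched entirely out of transitions that already appear in its training text.

    import random
    from collections import defaultdict

    # Toy corpus standing in for "the past"; any transition the model can
    # make must already appear here.
    corpus = "the cat sat on the mat the dog sat on the rug".split()

    # Count observed word -> next-word transitions.
    transitions = defaultdict(list)
    for cur, nxt in zip(corpus, corpus[1:]):
        transitions[cur].append(nxt)

    def generate(start, length=8):
        word, out = start, [start]
        for _ in range(length):
            choices = transitions.get(word)
            if not choices:          # no observed continuation: the model is stuck
                break
            word = random.choice(choices)   # sample only from transitions seen before
            out.append(word)
        return " ".join(out)

    print(generate("the"))  # e.g. "the dog sat on the mat ..." -- "new", yet purely recombined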

This is even clearer in the case of diffusion models (which I personally love using, and have spent a lot of time researching). All of the "new" images created by even the most advanced diffusion models are fundamentally remixes of past information. This is really obvious to anyone who has played around with them extensively, because they really can't produce truly novel concepts. New concepts can be added by things like fine-tuning or LoRAs, but fundamentally you're still just remixing the past.
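
For what it's worth, the LoRA point can be stated in a couple of lines: the "new concept" enters as a small low-rank correction applied to weights that were already fit to past data. A rough numpy sketch, with all dimensions and values invented purely for illustration:

    import numpy as np

    d, r = 512, 8                       # hidden size and LoRA rank (illustrative values)
    W = np.random.randn(d, d)           # frozen pretrained weight: the distilled "past"
    A = np.random.randn(r, d) * 0.01    # trainable low-rank factor
    B = np.zeros((d, r))                # B starts at zero so the adapted weight equals W initially
    alpha = 16

    W_adapted = W + (alpha / r) * (B @ A)   # fine-tuned behavior = past weights + small learned delta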

LLMs are always doing some form of interpolation between different points in the past. Yes, they can create a "new" SQL query, but it's just remixed from the SQL queries that existed before. This still makes them very useful, because a lot of engineering work, including writing a custom text editor, involves remixing existing engineering work. If you could have stack-overflowed your way to an answer in the past, an LLM will be much superior. In fact, the phrase "CRUD" largely exists to point out that most webapps are fundamentally the same.

A great example of this limitation in practice is the work Terry Tao is doing with LLMs. One of the biggest challenges in automated theorem proving is translating human proofs into the language of a theorem prover (often Lean these days). The problem is that there is not very much Lean code currently available to LLMs (especially paired with the accompanying natural-language proofs), so they struggle to translate correctly. Most of the research in this area is about improving LLMs' representation of the mapping from human proofs to Lean proofs. (Btw, I personally think LLMs have a reasonably good chance of driving major improvements in formal theorem proving, in conjunction with languages like Lean, precisely because the translation process is the biggest blocker to progress.)
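
To give a feel for the translation step in question: a one-sentence natural-language statement has to become something like the (deliberately trivial) Lean 4 snippet below, and it's exactly this NL-to-Lean mapping that models have comparatively little data for.

    -- NL statement: "Addition of natural numbers is commutative."
    -- A formal counterpart in Lean 4, proved by appealing to the standard lemma:
    theorem add_comm_example (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b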

When you say:

> So it is absurdly incorrect to say "they can only reproduce the past."

It's pretty clear you don't have a solid background in generative models, because this is fundamentally what they do: model an existing probability distribution and draw samples from that. LLMs are doing this for a massive amount of human text, which is why they do produce some impressive and useful results, but this is also a fundamental limitation.
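
"Model a distribution and draw samples from that" is, mechanically, just something like the following toy next-token sampler (the vocabulary and scores are made up; temperature is the usual knob):

    import numpy as np

    # Toy next-token distribution a model might assign given some context.
    vocab  = ["cat", "dog", "the", "sat"]
    logits = np.array([2.0, 1.5, 0.3, -1.0])   # made-up scores

    def sample(logits, temperature=1.0):
        z = logits / temperature
        probs = np.exp(z - z.max())
        probs /= probs.sum()                    # softmax over the learned scores
        return np.random.choice(vocab, p=probs) # draw from the modeled distribution

    print(sample(logits))   # every output is a point the model already assigns mass to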

But a world where we used LLMs for the majority of work would be a world with no fundamental breakthroughs. If you've read The Three-Body Problem, it's very much like living in the world where scientific progress is impeded by sophons. In that world there is still some progress (especially with abundant energy), but it remains fundamentally and deeply limited.

◧◩◪◨⬒⬓⬔
7. threet+0W[view] [source] 2026-01-01 11:24:05
>>crysta+Kx
> It's pretty clear you don't have a solid background in generative models, because this is fundamentally what they do

You don’t have a solid background. No one does. We fundamentally don’t understand LLMs; that is the prevailing view in both industry and academia. Sure, there are high-level perspectives and analogies we can apply to LLMs and machine learning in general, like probability distributions, curve fitting or interpolation… but those explanations are so high level that they can essentially be applied to humans as well. At a lower level we cannot describe what’s going on. We have no idea how to reconstruct the logic of how an LLM arrived at a specific output from a specific input.

It is impossible for any deterministic function or process to produce new information from old information. This limitation is fundamental to logic and math, and thus it limits human output as well.

You can combine information, you can transform it, you can lose it. But producing new information from old information with a deterministic intelligence is fundamentally impossible in reality, and therefore fundamentally impossible for LLMs and humans alike. But note the keyword: “deterministic”.

New information can literally only arise through stochastic processes. That’s all you have in reality. We know this because determinism and stochasticity are literally your only two options: you have a bunch of inputs, and the outputs derived from them are either purely deterministic transformations, or, if you want something genuinely new out of the input, you must apply randomness. That’s it.

That’s essentially what creativity is. There is literally no other logical way to generate “new information”. Pure randomness is never really useful, so “useful information” arrives only after filtering: we use past information to filter the stochastic output and “select” something that isn’t wildly random. We also only use randomness to perturb the output a little bit, so it’s not too crazy.

In the end it’s this selection process and stochastic process combined that forms creativity. We know this is a general aspect of how creativity works because there’s literally no other way to do it.
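
Spelled out as code, that “perturb randomly, then select using what you already know” loop is nothing more exotic than the sketch below (the scoring function and all numbers are placeholders):

    import random

    # f is a stand-in for "past information used as the filter"
    # (here: prefer candidates near 3).
    def f(x):
        return -(x - 3.0) ** 2

    x = 0.0                                   # arbitrary starting point
    for _ in range(2000):
        candidate = x + random.gauss(0, 0.3)  # small stochastic perturbation, nothing too crazy
        if f(candidate) > f(x):               # selection: keep only what the filter scores higher
            x = candidate

    print(round(x, 3))   # ends up near 3.0: "useful" output found by filtered randomness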

LLMs do have stochastic aspects to them, so we know for a fact they are generating new things and not just drawing on the past. We know they can fit our definition of “creative”, and we can literally see them be creative right in front of your eyes.

You’re ignoring what you see with your own eyes and drawing your conclusions from a model of LLMs that isn’t fully accurate. Or you’re not fully connecting the mechanisms of how LLMs work to what creativity, i.e. generating new data from past data, actually is.

The fundamental limitation of LLMs is not that they can’t create new things. It’s that the context window is too small to create new things beyond it. Whatever they can create is limited to the possibilities within that window, and that sets a limit on creativity.

What you see happening with Lean can also be an issue of the context window being too small. If we had an LLM with a giant context window, bigger than anything before, and passed it all the necessary data to “learn” and be “trained” on Lean, it could likely start to produce new theorems without literally being “trained”.

Actually, I wouldn’t call this a “fundamental” problem. More fundamental is hallucination: LLMs producing new information from past information in the WRONG way, literally making up bullshit out of thin air. It’s the opposite of the problem you’re describing: these things are too creative and make up too much stuff.

We have hints that LLMs know the difference between hallucinations and reality, but our ability to coax them into communicating that distinction to us is limited.

◧◩◪◨⬒⬓⬔⧯
8. jheez3+7V2[view] [source] 2026-01-02 02:52:32
>>threet+0W
"You don’t have a solid background.

If you want to go around puffing out your chest about a subject area, you kinda do, fella. Credibility.

[go to top]