2025: The Year in LLMs

>>simonw+(OP)
Remember, back in the day, when a year of progress was like, oh, they voted to add some syntactic sugar to Java...

>>waldre+T7
That must have been a long time back. Having lived through the time when web pages were served through CGI and mobile phones only existed in movies, when SVMs where the new hotness in ML and people would write about how weird NNs were, I feel like I've seen a lot more concrete progress in the last few decades than this year.

This year honestly feels quite stagnant. LLMs are literally technology that can only reproduce the past. They're cool, but they were way cooler 4 years ago. We've taken big ideas like "agents" and "reinforcement learning" and basically stripped them of all meaning in order to claim progress.

I mean, do you remember Geoffrey Hinton's RBM talk at Google in 2010? [0] That was absolutely insane for anyone keeping up with that field. By the mid-twenty teens RBMs were already outdated. I remember when everyone was implementing flavors of RNNs and LSTMs. Karpathy's character 2015 RNN project was insane [1].

This comment makes me wonder if part of the hype around LLMs is just that a lot of software people simply weren't paying attention to the absolutely mind-blowing progress we've seen in this field for the last 20 years. But even ignoring ML, the world's of web development and mobile application development have gone through incredible progress over the last decade and a half. I remember a time when JavaScript books would have a section warning that you should never use JS for anything critical to the application. Then there's the work in theorem provers over the last decade... If you remember when syntactic sugar was progress, either you remember way further back than I do, or you weren't paying attention to what was happening in the larger computing world.

0. https://www.youtube.com/watch?v=VdIURAu1-aU

1. https://karpathy.github.io/2015/05/21/rnn-effectiveness/

>>crysta+lt
> LLMs are literally technology that can only reproduce the past.

Funny, I've used them to create my own personalized text editor, perfectly tailored to what I actually want. I'm pretty sure that didn't exist before.

It's wild to me how many people who talk about LLM apparently haven't learned how to use them for even very basic tasks like this! No wonder you think they're not that powerful, if you don't even know basic stuff like this. You really owe it to yourself to try them out.

>>handof+It
> You really owe it to yourself to try them out.

I've worked at multiple AI startups in lead AI Engineering roles, both working on deploying user facing LLM products and working on the research end of LLMs. I've done collaborative projects and demos with a pretty wide range of big names in this space (but don't want to doxx myself too aggressively), have had my LLM work cited on HN multiple times, have LLM based github projects with hundreds of stars, appeared on a few podcasts talking about AI etc.

This gets to the point I was making. I'm starting to realize that part of the disconnect between my opinions on the state of the field and others is that many people haven't really been paying much attention.

I can see if recent LLMs are your first intro to the state of the field, it must feel incredible.

>>crysta+bv
Over half of HN still thinks it’s a stochastic parrot and that it’s just a glorified google search.

The change hit us so fast a huge number of people don’t understand how capable it is yet.

Also it certainly doesn’t help that it still hallucinates. One mistake and it’s enough to set someone against LLMs. You really need to push through that hallucinations are just the weak part of the process to see the value.

>>threet+vT
The problem I see, over and over, is that people pose poorly-formed questions to the free ChatGPT and Google models, laugh at the resulting half-baked answers that are often full of errors and hallucinations, and draw conclusions about the technology as a whole.

Either that, or they tried it "last year" or "a while back" and have no concept of how far things have gone in the meantime.

It's like they wandered into a machine shop, cut off a finger or two, and concluded that their grandpa's hammer and hacksaw were all anyone ever needed.

>>Camper+ED1
No, frankly it's the difference between actual engineers and hobbyists/amateurs/non-SWEs.

SWEs are trained to discard surface-level observations and be adversarial. You can't just look at the happy path, how does the system behave for edge cases? Where does it break down and how? What are the failure modes?

The actual analogy to a machine shop would be to look at whether the machines were adequate for their use case, the building had enough reliable power to run and if there were any safety issues.

It's easy to Clever Hans yourself and get snowed by what looks like sophisticated effort or flat out bullshit. I had to gently tell a junior engineer that just because the marketing claims something will work a certain way, that doesn't mean it will.

>>habine+oa2
You sound pretty certain. There's often good money to be made in taking the contrarian view, where you have insights that the so-called "smart money" lacks. What are some good investments to make in the extreme-bear case, in which we're all just Clever Hans-ing ourselves as you put it? Do you have skin in the game?

>>Camper+xb2
My dude, I assure you "humans are really good at convincing themselves of things that are not true" is a very, very well known fact. I don't know what kind of arbitrage you think exists in this incredibly anodyne statement lol.

If you want a financial tip, don't short stock and chase market butterflies. Instead, make real professional friends, develop real skills and learn to be friendly and useful.

I made my money in tech already, partially by being lucky and in the right place at the right time, and partially because I made my own luck by having friends who passed the opportunity along.

Hope that helps!

>>habine+eI3
That answer is basically an admission that you don’t actually hold a strong contrarian belief about the technology at all.

The question wasn’t “are humans sometimes self-delusional?” Everyone agrees with that. The question was whether, in this specific case, the prevailing view about LLM capability is meaningfully wrong in a way that has implications. If you really believed this was mostly Clever Hans, there would be concrete consequences. Entire categories of investment, hiring, and product strategy would be mispriced.

Instead you retreated to “don’t short stocks” and generic career advice. That’s not skepticism, it’s risk-free agnosticism. You get to sound wise without committing to any falsifiable position.

Also, “I made my money already” doesn’t strengthen the argument. It sidesteps it. Being right once, or being lucky in a good cycle, doesn’t confer epistemic authority about a new technology. If anything, the whole point of contrarian insight is that it forces uncomfortable bets or at least uncomfortable predictions.

Engineers don’t evaluate systems by vibes or by motivational aphorisms. They ask: if this hypothesis is true, what would we expect to see? What would fail? What would be overhyped? What would not scale? You haven’t named any of that. You’ve just asserted that people fool themselves and stopped there.

zlacker