zlacker

One of the things I think is going on here is a sort of stone soup effect. [1]

Core to Ptacek's point is that everything has changed in the last 6 months. As you and I presume he agree, the use of off-the-shelf LLMs in code was kinda garbage. And I expect the skepticism he's knocking here ("stochastic parrots") was in fact accurate then.

But it did get a lot of people (and money) to rush in and start trying to make something useful. Like the stone soup story, a lot of other technology has been added to the pot, and now we're moving in the direction of something solid, a proper meal. But given the excitement and investment, it'll be at least a few years before things stabilize. Only at that point can we be sure about how much the stone really added to the soup.

Another counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say.

So I'm still skeptical of the hype. After all, the hype is basically the same as 6 months ago, even though now the boosters can admit the products of 6 months ago sucked. But I can believe we're in the middle of a revolution of developer tooling. Even so, I'm content to wait. We don't know the long term effects on a code base. We don't know what these tools will look like in 6 months. I'm happy to check in again then, where I fully expect to be again told: "If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing." At least until then, I'm renewing my membership in the Boring Technology Club: https://boringtechnology.club/

[1] https://en.wikipedia.org/wiki/Stone_Soup

replies(4): >>xpe+C6 >>keeda+En >>spacem+ox >>DannyB+Va1

>>wpietr+(OP)
Almost by definition, one should be skeptical about hype. So we’re all trying to sort out what is being sold to us.

Different people have different weird tendencies in different directions. Some people irrationally assume that things aren’t going to change much. Others see a trend and irrationally assume that it will continue on a trend line.

Synthesis is hard.

Understanding causality is even harder.

Savvy people know that we’re just operating with a bag of models and trying to choose the right combination for the right situation.

This misunderstanding is one reason why doomers, accelerations, and “normies” talk past each other or (worse) look down on each other. (I’m not trying to claim epistemic equivalence here; some perspectives are based on better information, some are better calibrated than others! I’m just not laying out my personal claims at this point. Instead, I’m focusing on how we talk to each other.)

Another big source of misunderstanding is about differing loci of control. People in positions of influence are naturally inclined to think about what they can do, who they know, and where they want to be. People farther removed feel relatively powerless and tend to hold onto their notions of stability, such as the status quo or their deepest values.

Historically, programmers have been quite willing to learn new technologies, but now we’re seeing widespread examples where people’s plasticity has limits. Many developers cannot (or are unwilling to) wrap their minds around the changing world. So instead of confronting the reality they find ways to deny it, consciously or subconsciously. Our perception itself is shaped by our beliefs, and some people won’t even perceive the threat because it is too strange or disconcerting. Such is human nature: we all do it. Sometimes we’re lucky enough to admit it.

replies(1): >>wpietr+4y1

>>wpietr+(OP)
> Core to Ptacek's point is that everything has changed in the last 6 months.

This was actually the only point in the essay with which I disagree, and it weakens the overall argument. Even 2 years ago, before agents or reasoning models, these LLMs were extremely powerful. The catch was, you needed to figure out what worked for you.

I wrote this comment elsewhere: >>44164846 -- Upshot: It took me months to figure out what worked for me, but AI enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. Yes, the hype should trigger your suspicions, but if respectable people with no stake in selling AI like @tptacek or @kentonv in the other AI thread are saying similar things, you should probably take a closer look.

replies(4): >>potato+pE >>gopher+Je1 >>wpietr+Gx1 >>mwarke+0s2

>>wpietr+(OP)
I’m an amateur coder and I used to rely on Cursor a lot to code when I was actively working on hobby apps about 6 months ago

I picked coding again a couple of days back and I’m blown away by how much things have changed

It was all manual work until a few months back. Suddenly, its all agents

replies(1): >>wpietr+gy1

>>keeda+En
> "Even 2 years ago, before agents or reasoning models, these LLMs were extremely powerful. The catch was, you needed to figure out what worked for you."

Sure, but I would argue that the UX is the product, and that has radically improved in the past 6-12 months.

Yes, you could have produced similar results before, manually prompting the model each time, copy and pasting code, re-prompting the model as needed. I would strenuously argue that the structuring and automation of these tasks is what has made these models broadly usable and powerful.

In the same way that Apple didn't event mobile phones nor touchscreens nor OSes, but the specific combination of these things resulted in a product that was different in kind than what came before, and took over the world.

Likewise, the "putting the LLM into a structured box of validation and automated re-prompting" is huge! It changed the product radically, even if its constituent pieces existed already.

[edit] More generally I would argue that 95% of the useful applications of LLMs aren't about advancing the SOTA model capabilities and more about what kind of structured interaction environment we shove them into.

replies(1): >>keeda+qS

>>potato+pE
For sure! I mainly meant to say that people should not attribute the "6 more months until it's really good" point as just another symptom of unfounded hype. It may have taken effort to effectively use AI earlier, which somewhat justified the caution, but now it's significantly easier and caution is counter-productive.

But I think my other point still stands: people will need to figure out for themselves how to fully exploit this technology. What worked for me, for instance, was structuring my code to be essentially functional in nature. This allows for tightly focused contexts which drastically reduces error rates. This is probably orthogonal to the better UX of current AI tooling. Unfortunately, the vast majority of existing code is not functional, and people will have to figure out how to make AI work with that.

A lot of that likely plays into your point about the work required to make useful LLM-based applications. To expand a bit more:

* AI is technology that behaves like people. This makes it confusing to reason about and work with. Products will need to solve for this cognitive dissonance to be successful, which will entail a combination of UX and guardrails.

* Context still seems to be king. My (possibly outdated) experience has been the "right" context trumps larger context windows. With code, for instance, this probably entails standard techniques like static analysis to find relevant bits of code, which some tools have been attempting. For data, this might require eliminating overfetching.

* Data engineering will be critical. Not only does it need to be very clean for good results, giving models unfettered access to the data needs the right access controls which, despite regulations like GDPR, are largely non-existent.

* Security in general will need to be upleveled everywhere. Not only can models be tricked, they can trick you into getting compromised, and so there need to even more guardrails.

A lot of these are regular engineering work that is being done even today. Only it often isn't prioritized because there are always higher priorities... like increasing shareholder value ;-) But if folks want to leverage the capabilities of AI in their businesses, they'll have to solve all these problems for themselves. This is a ton of work. Good thing we have AI to help out!

>>wpietr+(OP)
"nother counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say."

You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

I wish i could impress this upon more people.

A friend similarly used to lament/complain that Kotlin sucked in part because we could have probably accomplished it's major features in Java, and maybe without tons of work, or migration cost.

This is maybe even true!

as an intellectual exercise, both are interesting to think about. But outside of that, people get caught up in this as if it matters, but it doesn't.

Basically nothing is driven by pure technical merit alone, not just in CS, but in any field. So my point to him was the lesson to take away from this is not "we could have been more effective or done it cheaper or whatever" but "my definition of effectiveness doesn't match how reality decides effectiveness, so i should adjust my definition".

As much as people want the definition to be a meritocracy, it just isn't and honestly, seems unlikely to ever be.

So while it's 100% true that billions of dollars dumped into other tools or approaches or whatever may have have generated good, better, maybe even amazing results, they weren't, and more importantly, never would have been. Unknown but maybe infinite ROI is often much more likely to see investment than more known but maybe only 2x ROI.

and like i said, this is not just true in CS, but in lots of fields.

That is arguably quite bad, but also seems unlikely to change.

replies(1): >>wpietr+dA4

>>keeda+En
I don't think it's possible to understand what people mean by force multiplier re AI until you use it to teach yourself a new domain and then build something with that knowledge.

Building a mental model of a new domain by creating a logical model that interfaces with a domain I'm familiar with lets me test my assumptions and understanding in real time. I can apply previous experience by analogy and verify usefulness/accuracy instantly.

> Upshot: It took me months to figure out what worked for me, but AI enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. Yes, the hype should trigger your suspicions[...]

Part of the hype problem is that describing my experience sounds like bullshit to anyone who hasn't gone through the same process. The rate that I pick up concepts well enough to do verifiable work with them is literally unbelievable.

>>keeda+En
>if respectable people with no stake in selling AI like @tptacek or @kentonv in the other AI thread are saying similar things, you should probably take a closer look.

Maybe? Social proof doesn't mean much to me during a hype cycle. You could say the same thing about tulip bulbs or any other famous bubble. Lots of smart people with no stake get sucked in. People are extremely good at fooling themselves. There are a lot of extremely smart people following all of the world's major religions, for example, and they can't all be right. And whatever else is going on here, there are a lot of very talented people whose fortunes and futures depend on convincing everybody that something extraordinary is happening here.

I'm glad you have found something that works for you. But I talk with a lot of people who are totally convinced they've found something that makes a huge difference, from essential oils to functional programming. Maybe it does for them. But personally, what works for me is waiting out the hype cycle until we get to the plateau of productivity. Those months that you spent figuring out what worked are months I'd rather spend on using what I've already found to work.

replies(3): >>tptace+dz1 >>kenton+FA1 >>antifa+0O2

>>xpe+C6
I think "the reality", at least as something involving a new paradigm, has yet to be established. I'll note that I heard plenty of similar talk about how developers just couldn't adapt six months or more ago. Promoters now can admit those tools were in fact pretty bad, because they now have something else to promote, but at the time those not rawdogging LLMs were dinosaurs under a big meteor.

I do of course agree that some people are just refusing to "wrap their minds around the changing world". But anybody with enough experience in tech can count a lot more instances of "the world is about to change" than "the world really changed". The most recent obvious example being cryptocurrencies, but there are plenty of others. [1] So I think there's plenty of room here for legitimate skepticism. And for just waiting until things settle down to see where we ended up.

[1] E.g. https://www.youtube.com/watch?v=b2F-DItXtZs

replies(2): >>xpe+vG2 >>deanCo+iL4

>>spacem+ox
> You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

I think it's very useful if one wants to properly weigh the value of LLMs in a way that gets beyond the hype. Which I do.

replies(2): >>spacem+B42 >>wpietr+cA4

>>wpietr+Gx1
The problem with this argument is that if I'm right, the hype cycle will continue for a long time before it settles (because this is a particularly big problem to have made a dent in), and for that entire span of time skepticism will have been the wrong position.

replies(2): >>mplanc+VJ2 >>wpietr+jz4

>>wpietr+Gx1
Dude. Claude Code has zero learning curve. You just open the terminal app in your code directory and you tell it what you want, in English. In the time you have spent writing these comments about how you don't care to try it now because it's probably just hype, you could have actually tried it and found out if it's just hype.

replies(2): >>lolind+gG2 >>wpietr+Fz4

>>wpietr+gy1
My 80-year old dad tells me that when he bought his first car, he could pop open the hood and fiddle with things and maybe get it to work after a breakdown

Now he can't - it's too closed and complicated

Yet, modern cars are way better and almost never breakdown

Don't see how LLMs are any different than any other tech advancement that obfuscates and abstracts the "fundamentals".

replies(1): >>wpietr+4A4

>>keeda+En
AI posts (including this one) are all over his employers blog lately, so there’s some stake (fly MCP, https://fly.io/blog/fuckin-robots/, etc).

>>kenton+FA1
I've tried Claude Code repeatedly and haven't figured out how to make it work for me on my work code base. It regularly gets lost, spins out of control, and spends a bunch of tokens without solving anything. I totally sympathize with people who find Claude Code to have a learning curve, and I'm writing this while waiting for Cursor to finish a task I gave it, so it's not like I'm unfamiliar with the tooling in general.

One big problem with Claude Code vs Cursor is that you have to pay for the cost of getting over the learning curve. With Cursor I could eat the subscription fee and then goof off for a long time trying to figure out how to prompt it well. With Claude Code a bad prompt can easily cost me $5 a pop, which (irrationally, but measurably) hurts more than the one-time monthly fee for Cursor.

replies(1): >>kenton+nR2

>>wpietr+4y1
Fair points.

Generally speaking, I find it suspect when someone points to failed predictions of disruptive changes without acknowledging successful predictions. That is selection bias. Many predicted disruptive changes do occur.

Most importantly, if one wants to be intellectually honest, one has to engage against a set of plausible arguments and scenarios. Debunking one particular company’s hyperbolic vision for the future might be easy, but it probably doesn’t generalize.

It is telling to see how many predictions can seem obvious in retrospect from the right frame of reference. In a sense (or more than that under certain views of physics), the future already exists, the patterns already exist. We just have to find the patterns — find the lens or model that will help the messy world make sense to us.

I do my best to put the hype to the side. I try to pay attention to the fundamentals such as scaling laws, performance over time, etc while noting how people keep moving the goalposts.

Also wrt the cognitive bias aspect: Cryptocurrencies didn’t threaten to apply significant (if any) downward pressure on the software development labor market.

Also, even cryptocurrency proponents knew deep down that it was a chicken and the egg problem: boosters might have said adoption was happening and maybe even inevitable, but the assumption was right out there in the open. It also had the warning signs of obvious financial fraud, money laundering, currency speculation, and ponzi scheming.

Adoption of artificial intelligence is different in many notable ways. Most saliently, it is not a chicken and egg problem: it does not require collective action. Anyone who does it well has a competitive advantage. It is a race.

(Like Max Tegmark and others, I view racing towards superintelligence as a suicide race, not an arms race. This is a predictive claim that can be debated by assessing scenarios, understanding human nature, and assigning probabilities.)

replies(1): >>wpietr+Zz4

>>tptace+dz1
So? The better these tools get, the easier they will be to get value out of. It seems not unwise to let them stabilize before investing the effort and getting the value out, especially if you’re working in one of the areas/languages where they’re still not as useful.

Learning how to use a tool once is easy, relearning how to use a tool every six months because of the rapid pace of change is a pain.

replies(1): >>tptace+s33

>>wpietr+Gx1
> You could say the same thing about tulip bulbs or any other famous bubble. Lots of smart people with no stake get sucked in.

While I agree with the skepticism, what specifically is the stake here? Most code assists have usable plans in the $10-$20 range. The investors are apparently taking a much bigger risk than the consumer would be in a case like this.

Aside from the horror stories about people spending $100 in one day of API tokens for at best meh results, of course.

replies(2): >>wpietr+uz4 >>b3mora+HRc

>>lolind+gG2
Claude Code actually has a flat-rate subscription option now, if you prefer that. Personally I've found the API cost to be pretty negligible, but maybe I'm out of touch. (I mean, it's one AI-generated commit, Michael. What could it cost, $5?)

Anyway, if you've tried it and it doesn't work for you, fair enough. I'm not going to tell you you're wrong. I'm just bothered by all the people who are out here posting about AI being bad while refusing to actually try it. (To be fair, I was one of them, six months ago...)

>>mplanc+VJ2
This isn't responsive to what I wrote. Letting the tools stabilize is one thing, makes perfect sense. "Waiting until the hype cycle dies" is another.

replies(1): >>mplanc+Mk3

>>tptace+s33
I suspect the hype cycle and the stabilization curves are relatively in-sync. While the tools are constantly changing, there's always a fresh source of hype, and a fresh variant of "oh you're just not using the right/newest/best model/agent/etc." from those on the hype train.

replies(1): >>tptace+iq3

>>mplanc+Mk3
This is the thing. I do not agree with that, at all. We can just disagree, and that's fine, but let's be clear about what we're disagreeing about, because the whole goddam point of this piece is that nobody in this "debate" is saying the same thing. I think the hype is going to scale out practically indefinitely, because this stuff actually works spookily well. The hype will remain irrational longer than you can remain solvent.

replies(1): >>mplanc+9D3

>>tptace+iq3
Well, generally, that’s just not how hype works.

A thing being great doesn’t mean it’s going to generate outsized levels of hype forever. Nobody gets hyped about “The Internet” anymore, because novel use cases aren’t being discovered at a rapid clip, and it has well and throughly integrated into the general milieu of society. Same with GPS, vaccines, docker containers, Rust, etc., but I mentioned the Internet first since it’s probably on a similar level of societal shift as is AI in the maximalist version of AI hype.

Once a thing becomes widespread and standardized, it becomes just another part of the world we live in, regardless of how incredible it is. It’s only exciting to be a hype man when you’ve got the weight of broad non-adoption to rail against.

Which brings me to the point I was originally trying to make, with a more well-defined set of terms: who cares if someone waits until the tooling is more widely adopted, easy to use, and somewhat standardized prior to jumping on the bandwagon? Not everyone needs to undergo the pain of being an early adopter, and if the tools become as good as everyone says they will, they will succeed on their merits, and not due to strident hype pieces.

I think some of the frustration the AI camp is dealing with right now is because y’all are the new Rust Evangelism Strike Force, just instead of “you’re a bad software engineer if you use a memory unsafe languages,” it’s “you’re a bad software engineer if you don’t use AI.”

replies(2): >>tptace+nF3 >>scott_+oV3

>>mplanc+9D3
Who you calling y'all? I'm a developer who was skeptical about AI until about 6 months ago, and then used it, and am now here to say "this shit works". That's all. I write Go, not Rust.

People have all these feelings about AI hype, and they just have nothing at all to do with what I'm saying. How well the tools work have not much at all to do with the hype level. Usually when someone says that, they mean "the tools don't really work". Not this time.

>>mplanc+9D3
The tools are at the point now that ignoring them is akin to ignoring Stack Overflow posts. Basically any time you'd google for the answer to something, you might as well ask an AI assistant. It has a good chance of giving you a good answer. And given how programming works, it's usually easy to verify the information. Just like, say, you would do with a Stack Overflow post.

>>tptace+dz1
I think it depends a lot on what you think "wrong position" means. I think skepticism only really goes wrong when it refuses to see the truth in what it's questioning long past the point where it's reasonable. I don't think we're there yet. For example, questions like "What is the long term effect on a code base" take us seeing the long term. Or there are legitimate questions about the ROI of learning and re-learn rapidly changing tools. What's worth it to you may not be in other situations.

I also think hype cycles and actual progress can have a variety of relationships. After Bubble 1.0 burst, there were years of exciting progress without a lot of hype. Maybe we'll get something similar here, as reasonable observers are already seeing the hype cycle falter. E.g.: https://www.economist.com/business/2025/05/21/welcome-to-the...

And of course, it all hinges on you being right. Which I get you are convinced of, but if you want to be thorough, you have to look at the other side of it.

replies(1): >>tptace+2C4

>>antifa+0O2
The stake they and I were referring to is a financial interest in the success of AI. Related is the reputational impact, of course. A lot of people who may not make money do like being seen as smart and cutting edge.

But even if we look at your notion of stake, you're missing huge chunks of it. Code bases are extremely expensive assets, and programmers are extremely expensive resources. $10 a month is nothing compared to the costs of a major cleanup or rewrite.

>>kenton+FA1
I could not have, because my standards involve more than a five minute impression from a tool designed to wow people in the first five minutes. Dude.

replies(1): >>kenton+lD5

>>xpe+vG2
> Generally speaking, I find it suspect when someone points to failed predictions of disruptive changes without acknowledging successful predictions.

I specifically said: "But anybody with enough experience in tech can count a lot more instances of 'the world is about to change' than 'the world really changed'. I pretty clearly understand that sometimes the world does change.

Funnily, I find it suspect when people accuse me of failing to do things I did in the very post they're responding to. So I think this is a fine time for us both to find better ways to spend our time.

replies(1): >>xpe+gB4

>>spacem+B42
I definitely believe that you don't see it. We just disagree on what that implies.

>>wpietr+gy1
Oops, this was a reply to somebody else put in the wrong place. Sorry for the confusion.

>>DannyB+Va1
> You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

I think it's very useful if one wants to properly weigh the value of LLMs in a way that gets beyond the hype. Which I do.

replies(1): >>DannyB+tz5

>>wpietr+Zz4
Sorry, I can see why you might take that the wrong way. In my defense, I consciously wrote "generally speaking" in the hopes you wouldn't think I was referring to you in particular. I wasn't trying to accuse you of anything.

I strive to not criticize people indirectly: my style is usually closer to say New York than San Francisco. If I disagree with something in particular, I try to make that clear without beating around the bush.

>>wpietr+jz4
Well, two things. First, I spent a long time being wrong about this; I definitely looked at the other side. Second, the thing I'm convinced of is kind of objective? Like: these things build working code that clears quality thresholds.

But none of that really matters; I'm not so much engaging on the question of whether you are sold on LLM coding (come over next weekend though for the grilling thing we're doing and make your case then!). The only thing I'm engaging on here is the distinction between the hype cycle, which is bad and will get worse over the coming years, and the utility of the tools.

replies(1): >>wpietr+6u5

>>wpietr+4y1
> Promoters now can admit those tools were in fact pretty bad

Relative to what came after, which noone could predict would be guaranteed?

The Model T was in fact pretty bad relative to what came after...

> because they now have something else to promote

something else which is better?

i don't understand the inherent cynicism here.

>>tptace+2C4
Thanks! If I can make it I will. (The pinball museum project is sucking up a lot of my time as we get toward launch. You should come by!)

I think that is one interesting question that I'll want to answer before adoption on my projects, but it definitely isn't the only one.

And maybe the hype cycle will get worse and maybe it won't. Like The Economist, I'm starting to see a turn. The amount of money going into LLMs generally is unsustainable, and I think OpenAI's recent raise is a good example: round 11, $40 billion dollar goal, which they're taking in tranches. Already the largest funding round in history, and it's not the last one they'll need before they're in the black. I could easily see a trough of disillusionment coming in the next 18 months. I agree programming tools could well have a lot of innovation over the next few years, but if that happens against a backdrop of "AI" disillusionment, it'll be a lot easier to see what they're actually delivering.

>>wpietr+dA4
Sure, and that works in the abstract (ie "what investment would theoretically have made the most sense") but if you are trying to compare in the real world you have to be careful because it assumes the alternative would have ever happened. I doubt it would have.

>>wpietr+Fz4
I think you're rationalizing your resistance to change. I've been there!

I have no reason to care whether you use AI or not. I'm giving you this advice just for your sake: Consider whether you are taking a big career risk by avoiding learning about the latest tools of your profession.

>>antifa+0O2
The stakes of changing the way so many people work can't be seen in a short term. Could be good or bad. Probably it will be both, in different ways. Margarine instead of butter seemed like a good idea until we noticed that hydrogenation was worse (in some ways) than the cholesterol problem we were trying to fight.

AI company execs also pretty clearly have a politico-economic idea that they are advancing. The tools may stand on their own but what is the broader effect of supporting them?