I think what's happening is two groups using "productivity" to mean completely different things: "I can implement 5x more code changes" vs "I generate 5x more business value." Both experiences are real, but they're not the same thing.
https://peoplesgrocers.com/en/writing/ai-productivity-parado...
This is true: LLMs can speed up development (some asterisks are required here, but it generally holds).
That said, I've seen, mainly here on HN, so many people hyping it up way beyond this. I've gotten into arguments here with people claiming it codes at a "junior level". Which is an absurd level of bullshit.
Sure, they might help you onboard into a complex codebase, but that's about it.
They help in breadth, not depth, really. And to be clear, to me that's extremely helpful, because working on "depth" is fun and invigorating, while working on "breadth" is more often than not a slog, which I'm happy to have Claude Code write up a draft for in 15 minutes, review, do a bunch of tweaks, and be done with.
On the flip side, it has allowed me to accomplish many lower-complexity backlog projects that I just wouldn’t have even attempted before. It expands productivity on the low end.
I’ve also used it many times to take on quality-of-life tasks that just would have been skipped before (like wrapping utility scripts in a helpful, documented command-line tool).
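For a flavor of what I mean, here's a minimal sketch of that kind of quality-of-life task: wrapping an existing throwaway script in a small, documented CLI using Node's built-in `util.parseArgs`. The tool name, flags, and `cleanupOldFiles` are all hypothetical, not the actual script.

```typescript
// cleanup-cli.ts - hypothetical example; the flags and cleanupOldFiles() are made up.
import { parseArgs } from "node:util";

const HELP = `Usage: cleanup-cli [options]

Options:
  --dry-run             Print what would be removed without removing it
  --older-than <days>   Only target files older than N days (default: 30)
  -h, --help            Show this help text
`;

const { values } = parseArgs({
  options: {
    "dry-run": { type: "boolean", default: false },
    "older-than": { type: "string", default: "30" },
    help: { type: "boolean", short: "h", default: false },
  },
});

// Stand-in for whatever the original throwaway script actually did.
async function cleanupOldFiles(days: number, dryRun: boolean): Promise<void> {
  console.log(`${dryRun ? "[dry-run] " : ""}removing files older than ${days} days...`);
}

async function main(): Promise<void> {
  if (values.help) {
    console.log(HELP);
    return;
  }
  const olderThanDays = Number(values["older-than"]);
  if (!Number.isFinite(olderThanDays) || olderThanDays < 0) {
    console.error("--older-than must be a non-negative number of days");
    process.exitCode = 1;
    return;
  }
  await cleanupOldFiles(olderThanDays, Boolean(values["dry-run"]));
}

main();
```

None of this is hard to write; the point is that before, it simply wouldn't have been worth anyone's time.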
This has been my experience as well - AI coding tools are like a very persistent junior that loves reading specs and documentation. The problem for AI companies is that "automated burndown of your low-complexity backlog items" isn't a moneymaker, even though that's what we actually have. So they have to sell a dream that may or may not be realized.
The benchmark project in the article is the perfect candidate for AI: well-defined requirements with precise technical terms (RFCs), little room for undefined behavior, and tons of reference implementations. This is an atypical project. I am confident an AI agent can write an HTTP/2 server, but it will also repeatedly fail to write sensible tests for the human/business processes that a junior would excel at.
There's certainly a lot of code that companies need written that is simple and straightforward, where LLMs are absolutely capable of generating code as good as what your average junior/intermediate developer would have written.
And of course there are higher complexity tasks where the LLM will completely face plant.
So the smart company chooses carefully where to apply the LLM, and possibly does get 5x more code that is "better" in the sense that 5x more straightforward tickets get closed/shipped, which beats closing/shipping fewer of them.
However, the expansion in scope that senior developers can tackle now will take away work that would ordinarily be given to juniors.
A big part of my skepticism is this offloading of responsibility: you can use an AI tool to write large quantities of shitty code and make yourself look superficially productive at the cost of the reviewer. I don't want to review 13 PRs, all of which are secretly AI but pretend to be junior dev output, none of which solve any of the most pressing business problems because they're just pointless noise from the bowels of our backlog, and have that be my day's work.
Such gatekeeping is a distraction from my actual job, which is to turn vague problem descriptions into an actionable spec by wrangling with the business and doing research, and then to fix the actual problems. The wrangling sees a 0% boost from AI, the research is only sped up slightly, and yeah, maybe the "fixing problems" part of the job will be faster! That's only a fraction of the average day for me, though. If an LLM makes the code I need to review worse, or if it makes people spend time on the kind of busywork that ended up 500 items down in our backlog instead of looking for more impactful tasks, then it's a net negative.
I think what you're missing is the risk, real or imagined, of AI generating 5x more code changes that have overall negative business value. Code's a liability. Changes to it are a risk.
That may be true, and would be an interesting topic to discuss if people actually spoke in such a way.
"Developers are now more productive in a way that many projects may need less developers to keep up productivity levels" is not that catchy to generate hype however.
My intuition is that the tail of low-value changes/edits will skew fairly code-size neutral.
A concrete example from this week: "adding robust error handling" in TypeScript.
I ask the LLM to look at these files: see how there's one big try/catch, and now that the code is working, there are two pretty different failure domains inside it. Can you split up the try/catch (which means hoisting some variable declarations outside the block scope)?
This is a Cursor rule for me (`@split-failure-domain.mdc`) because of how often this pattern comes up: make some RPCs, then validate the desired state transition.
Then I update the placeholder comment with my prediction of the failure rate.
I "changed" the code, but the diff is +9/-6.
When I'm working on the higher-complexity problems I tend to be closer to the edge of my understanding. Once I get a solution, very often I can simplify the code. There are many, many ways to write the same exact program; fewer of them make the essential complexity obvious. And when you shift things around in exactly the kind of mechanical-transformation way that LLMs can speed up... then your diff is not that big. It might even be negative.