zlacker

[parent] [thread] 7 comments
1. timr+(OP)[view] [source] 2025-06-02 21:50:01
As someone who has spent the better part of today fixing the utter garbage produced by repeated iteration with these supposedly magical coding agents, I'm neither in the camp of the "AI skeptic" (at least as defined by the author), nor in the camp of people who think these things can "write a large fraction of all the tedious code you’ll ever need to write."

Maybe I'm doing it wrong, but I seem to have settled on the following general algorithm:

* ask the agent to green-field a new major feature.

* watch the agent spin until it is satisfied with its work.

* run the feature. Find that it does not work, or at least has major deficiencies [1]

* cycle through multiple independent iterations with the agent, doing something resembling "code review", fixing deficiencies one at a time [2]

* eventually get to a point where I have to re-write major pieces of the code to extract the agent from some major ditch it has driven into, leading to a failure to make forward progress.

Repeat.

It's not that the things are useless or "a fad" -- they're clearly very useful. But the people who are claiming that programmers are going to be put out of business by bots are either a) talking their book, or b) extrapolating wildly into the unknown future. And while I am open to the argument that (b) might be true, what I am observing in practice is that the rate of improvement is slowing rapidly, and/or the remaining problems are getting much harder to solve.

[1] I will freely grant that at least some of these major deficiencies typically result from my inability / unwillingness to write a detailed enough spec for the robot to follow, or to anticipate every possible problem with the spec I did bother to write. 'Twas ever thus...

[2] This problem is fractal. However, it's at least fun, in that I get to yell at the robot in a way that I never could with a real junior engineer. One Weird Fact about working with today's agents is that if you threaten them, they seem to do better work.

replies(5): >>therea+B1 >>arturs+C1 >>zaptre+W1 >>yodsan+32 >>andrep+C6
2. therea+B1[view] [source] 2025-06-02 22:00:09
>>timr+(OP)
Results can vary significantly, and in my experience, both the choice of tools and models makes a big difference.

It’s a good idea to periodically revisit and re-evaluate AI and tooling. I’ve noticed that many programmers tried AI when, for example, GPT-3.5 was first released, became frustrated, and never gave it another chance—even though newer models like o4-mini are now capable of much more, especially in programming tasks.

AI is advancing rapidly. With the latest models and the right tools, what’s possible today far exceeds what was possible even 3-12 months ago.

Take a look at Cursor, Windsurf, Roo Code, or aider to "feed" AI with code, and at models like Google Gemini 2.5 Pro, Claude Sonnet 4, and OpenAI o4-mini. Also educate yourself about agents and MCP; soon these will be standard tools for most programmers.

replies(1): >>timr+26
3. arturs+C1[view] [source] 2025-06-02 22:00:36
>>timr+(OP)
Which model? Are you having it write unit tests first? How large of a change at a time are you asking for? How specific are your prompts?
4. zaptre+W1[view] [source] 2025-06-02 22:03:12
>>timr+(OP)
Even on stuff it has no chance of doing on its own, I find it useful to basically git reset repeatedly and start with more and more specific instructions. At the very least it helps me think through my plan better.
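The reset-and-retry loop can be sketched roughly like this (a throwaway-repo demo; the filenames are made up and the agent's edit is simulated by an echo):

```shell
set -e
# Set up a disposable repo with a known-good checkpoint.
repo=$(mktemp -d) && cd "$repo"
git init -q
echo "known-good code" > app.txt
git add app.txt
git -c user.email=me@example.com -c user.name=me \
    commit -q -m "checkpoint before agent attempt"

# ...let the agent loose; here we simulate an attempt that went off the rails...
echo "broken agent output" > app.txt

# Attempt rejected: discard it entirely, then retry with more specific instructions.
git reset --hard -q HEAD
cat app.txt   # prints "known-good code"
```

The point is that each attempt starts from the same clean checkpoint, so a bad run costs nothing but the prompt-refinement time.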
replies(1): >>timr+B4
5. yodsan+32[view] [source] 2025-06-02 22:03:52
>>timr+(OP)
My workflow is similar. While the agent is running, I browse the web or daydream. If I'm lucky, the agent produced correct code (possibly after several cycles). If I'm not, I need to rewrite everything myself. I'm also not in any camp, and I genuinely don't know if I'm more or less productive overall. But I think that disciplined use of a well-integrated agent will make people more productive.
6. timr+B4[view] [source] [discussion] 2025-06-02 22:20:10
>>zaptre+W1
Yeah...I've toyed with that, but there's still a point where throwing it all away and starting from scratch is, probabilistically, a worse bet than just fixing whatever thing is clearly wrong.

Just to make it concrete: today I spent a few hours going through a bunch of HTML + embedded styles, removing gobs and gobs of random styles the LLMs glommed on that "worked" but were brittle and failed completely as soon as I wanted to do something slightly different from the original spec. The cycle I described above led to a lot of completely unnecessary markup, paired with unnecessary styles to compensate for the crappiness of the original DOM. I was able to refactor to a much saner overall structure, but it took some time and thinking. Was I net ahead? I don't really know.

Given that LLMs almost always write this kind of "assembled from StackOverflow" code, I have precisely 0% confidence that I'd end up in a better place if I just reset the working branch and started from scratch.

It kind of reminds me of human biology -- given billions of years of random evolution you can end up with incredible sophistication, but the end result will be incomprehensible and nearly impossible to alter.

7. timr+26[view] [source] [discussion] 2025-06-02 22:28:12
>>therea+B1
I am using all of the models you're talking about, and I'm using agents, as I mentioned.

There is no magic bullet.

8. andrep+C6[view] [source] 2025-06-02 22:32:08
>>timr+(OP)
> eventually get to a point where I have to re-write major pieces of the code to extract the agent from some major ditch it has driven into, leading to a failure to make forward progress.

As it stands AI can't even get out of Lt Surge's gym in Pokemon Red. When an AI manages to beat Lance I'll start to think about using it for writing my code :-)
