My AI skeptic friends are all nuts

>>tablet+(OP)
I tried the agent thing on:

- Large C codebase (new feature and bugfix)

- Small Rust codebase (new feature)

- Brand new greenfield frontend for an in-spec and documented OpenAPI API

- Small fixes to an existing frontend

It failed _dramatically_ in all cases. Maybe I'm using this thing wrong, but it is Devin-level fail. Gets diffs wrong. Passes phantom arguments to tools. Screws up basic features. Pulls in hundreds of lines of changes to unrelated files in order to refactor. Refactors again and again, over itself, partially, so that the uncompleted boneyard of an old refactor sits in the codebase like a skeleton (those tokens are also sent up to the model).

It genuinely makes an insane, horrible, spaghetti MESS of the codebase. Any codebase. I expected it to be good at Svelte and SolidJS since those are popular JavaScript frameworks with lots of training data. Nope, it's bad. This was a few days ago, Claude 4. Seriously, seriously, people, what am I missing here with this agents thing? They are such gluttonous eaters of tokens that I'm beginning to think these agent posts are paid advertising.

>>mlsu+ur
Have it make small changes. Restrict it to a single file and scope it to <50 lines or so. Enough that you can easily digest without making it a chore.
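One way to enforce that kind of scope is to check the agent's diff before accepting it. The following is only a rough sketch, assuming git-style unified diffs with `diff --git` headers; the names `diff_stats`, `within_budget`, and `MAX_CHANGED_LINES` are made up for illustration and are not part of any agent tool.

```python
import re

# Assumption: the "<50 lines or so" budget suggested above.
MAX_CHANGED_LINES = 50


def diff_stats(diff_text: str) -> tuple[set[str], int]:
    """Return (files touched, added + removed line count) for a git-style diff."""
    files: set[str] = set()
    changed = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            # A new per-file section starts; header lines follow.
            in_hunk = False
        elif line.startswith("@@"):
            # Hunk header; the lines after it are context, additions, removals.
            in_hunk = True
        elif not in_hunk and line.startswith(("--- ", "+++ ")):
            path = line[4:].strip()
            if path != "/dev/null":
                # Drop the "a/" or "b/" prefix that git adds to paths.
                files.add(re.sub(r"^[ab]/", "", path))
        elif in_hunk and line.startswith(("+", "-")):
            changed += 1
    return files, changed


def within_budget(diff_text: str, max_lines: int = MAX_CHANGED_LINES) -> bool:
    """True if the diff touches at most one file and fewer than max_lines lines."""
    files, changed = diff_stats(diff_text)
    return len(files) <= 1 and changed < max_lines
```

Feed it the output of `git diff` and throw the change away (or ask for a smaller one) whenever `within_budget` returns `False`.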

>>turtle+5t
A small change scoped to <50 lines is something easy to write for a normal software engineer. When do the LLMs start doing the hard part?

>>declan+qv
A small change around 50 lines is the size of an advent of code solution (the hardest part). Most of the code you write around that is for defensive coding (error handling, malformed input, expected output,…) which is the other hard part. Then you connect these cores to form a system and that’s another tough problem. And it needs to evolve too.

We’ve built tools to help us with the first part, frameworks with the second, architecture principles with the third and software engineering techniques for the fourth. Where do LLMs help?
