zlacker

[parent] [thread] 20 comments
1. tptace+(OP)[view] [source] 2025-05-14 23:31:43
The first and most important question to ask here is: are you using a coding agent? A lot of times, people who aren't getting much out of LLM-assisted coding are just asking Claude or GPT for code snippets, and pasting and building them themselves (or, equivalently, they're using LLM-augmented autocomplete in their editor).

Almost everybody doing serious work with LLMs is using an agent, which means that the LLM is authoring files, linting them, compiling them, and iterating when it spots problems.

There's more to using LLMs well than this, but this is the high-order bit.
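
To make "agent" concrete: the core of every one of these tools is a loop shaped roughly like the sketch below (minimal Python; call_llm and apply_edits are hypothetical stand-ins for the real plumbing, and the lint/build commands are just examples).

    import subprocess

    def run_check(cmd: list[str]) -> str:
        """Run a lint or build command; return its error output ("" if clean)."""
        proc = subprocess.run(cmd, capture_output=True, text=True)
        return "" if proc.returncode == 0 else proc.stdout + proc.stderr

    def agent_loop(task: str, call_llm, apply_edits, max_iters: int = 10) -> bool:
        """Ask the model for edits, apply them, feed errors back, repeat."""
        history = [task]
        for _ in range(max_iters):
            apply_edits(call_llm(history))    # the model authors the files
            # Lint first; only run the build once the lint is clean.
            errors = run_check(["ruff", "check", "."]) or run_check(["make"])
            if not errors:
                return True                   # clean tree: done
            history.append(errors)            # iterate on the problems it spots
        return False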

replies(4): >>__mhar+i1 >>lexand+W6 >>WD-42+B8 >>khazho+jd
2. __mhar+i1[view] [source] 2025-05-14 23:44:38
>>tptace+(OP)
What agent do you recommend?
replies(6): >>tptace+43 >>haiku2+Z3 >>kasey_+l4 >>physix+yc >>kbaker+di >>theshr+Yy
3. tptace+43[view] [source] [discussion] 2025-05-15 00:01:18
>>__mhar+i1
I think they're all fine. Cursor is popular and charges a flat fee for model calls (interposed through their model call router, however that works). Aider is probably the most popular open source command line one. Claude Code is probably the most popular command line agent overall; Codex is the OpenAI equivalent (I like Codex fine).

later

Oh, I like Zed a lot too. People complain that Zed's agent (the back-and-forth with the model) is noticeably slower than the other agents, but to me, it doesn't matter: all the agents are slow enough that I can't sit there and wait for them to finish, and Zed has nice desktop notifications for when the agent finishes.

Plus you get a pretty nice editor --- I still write exclusively in Emacs, but I think of Zed as being a particularly nice code UI for an LLM agent.

4. haiku2+Z3[view] [source] [discussion] 2025-05-15 00:12:31
>>__mhar+i1
I've been having okayish results with Zed + Claude 3.7
5. kasey_+l4[view] [source] [discussion] 2025-05-15 00:18:41
>>__mhar+i1
Speaking up for Devin.ai here. What I like about it is that after the initial prompt nearly all of my interaction with it is via pull request comments.

I have this workflow where I trigger a bunch of prompts in the morning, at lunch, and at the end of the day. At those same times I give it feedback. The async nature means I can have it work on things I can't be bothered to do myself.

replies(1): >>tptace+K5
6. tptace+K5[view] [source] [discussion] 2025-05-15 00:34:35
>>kasey_+l4
I need to know more about the morning/lunch/evening prompts, and I need to know right now. What are they? This sounds amazing.
replies(1): >>kasey_+Ed
7. lexand+W6[view] [source] 2025-05-15 00:47:07
>>tptace+(OP)
Funny, I would give the absolute opposite advice. In my experience, the use of agents (mainly Cursor) is a sure-fire way to have a really painful experience with LLM-assisted coding. I much prefer to use AI as a pair programmer that I talk to and sometimes let write entire files, but I'm always the one doing the driving, and mostly the one writing the code.

If you aren't building up mental models of the problem as you go, you end up in a situation where the LLM gets stuck at the edges of its capability, and you have no idea how to even help it overcome the hurdle. Then you spend hours backtracking through what it's done, building up the mental model you need before you can move on. The process is slower and more frustrating than not using AI in the first place.

I guess the reality is, your luck with AI-assisted coding really comes down to the problem you're working on, and how much of it is prior art the LLM has seen in training.

replies(4): >>tptace+Y8 >>mnoron+Qf >>theshr+Ry >>tom_m+mz1
8. WD-42+B8[view] [source] 2025-05-15 01:05:23
>>tptace+(OP)
It's because the people doing rote programming with them don't think they are doing rote programming; they think it's exceptional.
9. tptace+Y8[view] [source] [discussion] 2025-05-15 01:10:12
>>lexand+W6
I mean, it might depend, but many of the most common complaints about LLM coding (most notably hallucination) are essentially solved problems if you're using agents. Whatever works for you! I don't even like autocomplete, so I sympathize with not liking agents.

If it helps, for context: I'll go round and round with an agent until I've got roughly what I want, and then I go through and beat everything into my own idiom. I don't push code I don't understand and most of the code gets moved or reworked a bit. I don't expect good structure from LLMs (but I also don't invest the time to improve structure until I've done a bunch of edit/compile/test cycles).

I think of LLMs mostly as a way of unsticking and overcoming inertia (and writing tests). "Writing code", once I'm in flow, has always been pleasant and fast; the LLMs just get me to that state much faster.

I'm sure training data matters, but I think static typing and language tooling matters much more. By way of example: I routinely use LLMs to extend intensely domain-specific code internal to our project.

replies(2): >>overfe+GB >>disgru+lS
10. physix+yc[view] [source] [discussion] 2025-05-15 01:46:51
>>__mhar+i1
I use Augment Code as a plugin in IntelliJ and PyCharm. It's quite good, but I only use it for narrow, targeted objectives, agent mode or not.

I haven't seen any mentions of Augment Code yet in comment threads on HN. Does anyone else use Augment Code?

11. khazho+jd[view] [source] 2025-05-15 01:54:35
>>tptace+(OP)
My sweet spot is Cursor to generate/tweak code, but I do all the execution and debugging iteration myself.
12. kasey_+Ed[view] [source] [discussion] 2025-05-15 01:57:21
>>tptace+K5
Oh, they aren't time-based instructions or anything. The first thing I do when I sit down in the morning is go through the list of tasks I thought up overnight and fire Devin at them. Then I go do whatever "real" work I needed to get done. At lunch I check in to see how things are going and give feedback or new tasks, and I do the same as the last thing at night.

It keeps _me_ from context switching into agent manager mode. I do the same thing for code reviews for human teammates as well.

replies(1): >>tptace+Rd
13. tptace+Rd[view] [source] [discussion] 2025-05-15 02:00:00
>>kasey_+Ed
Right, no, I figured that! The idea of preloading a bunch of things I don't have the bandwidth to sort through into a model, but having them on tap when I come up for air from whatever I'm currently working on, sounds like a super good trick.
replies(1): >>kasey_+Fg
14. mnoron+Qf[view] [source] [discussion] 2025-05-15 02:29:49
>>lexand+W6
Agree. My favorite workflow has been chatting with the LLM in the assistant panel of Zed, then making inline edits by prompting the AI with the context of that chat. That way, I can align with the AI on how the problem should be solved before letting it loose. What's great about this is that, depending on how easy or hard the problem is for the LLM, I can shift between handholding / manual coding and vibe coding.
15. kasey_+Fg[view] [source] [discussion] 2025-05-15 02:42:12
>>tptace+Rd
That's kind of where Devin excels. The agent itself is good enough; I don't even know what model it uses. But it's hosted and well integrated with GitHub, so you just give it a prompt and out shoots a PR sometime later. You comment on the PR and it refines it. It has a concept of "sessions" so you can start many of those tasks at once. You can log in to each of its tasks and see what it is doing or interdict, but I rarely do.

Like most of the code agents, it works best with tight, testable loops. But it has a concept of short vs. long tests and will give you plans and confidence values to help you refine your prompt if you want.

I tend to just let it go. If it gets to a 75%-done spot that isn't worth more back-and-forth, I grab the PR and finish it off.

16. kbaker+di[view] [source] [discussion] 2025-05-15 02:59:47
>>__mhar+i1
Try https://aider.chat + OpenRouter.ai: pay-as-you-go, use any model you want. I use Claude Sonnet.

It has a very good system prompt, so the code is pretty good without a lot of fluff.
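
Setup is basically one environment variable and a model flag, something like the following (the model slug here is illustrative; check aider's docs for current names):

    export OPENROUTER_API_KEY=your-key-here
    aider --model openrouter/anthropic/claude-3.7-sonnet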

17. theshr+Ry[view] [source] [discussion] 2025-05-15 06:41:33
>>lexand+W6
Agents make it easier for you to give context to the LLM, or for it to grab some by itself like Cline/Claude/Cursor/Windsurf can do.

With a web-based system, you need repomix or something similar to give the whole project (or parts of it, if you can be bothered to filter) as context, which isn't exactly nifty.

18. theshr+Yy[view] [source] [discussion] 2025-05-15 06:43:22
>>__mhar+i1
I've settled on Cline for now, with OpenRouter as the backend for LLMs: Gemini 2.5 for planning and Claude 3.7 for act mode.

Cursor is fine; Claude Code and Aider are a bit too janky for me. They tend to go overboard (making full-ass git commits without prompting) and I can't be arsed to rein them in.

19. overfe+GB[view] [source] [discussion] 2025-05-15 07:18:29
>>tptace+Y8
> ...many of the most common complaints about LLM coding (most notably hallucination) are essentially solved problems if you're using agents

Inconsistency and crap code quality aren't solved yet, and these make the agent workflow worse, because the human only gets to nudge the AI in the right direction very late in the game. Alternative, interactive, non-agentic workflows allow for more AI hand-holding early, and better code quality, IMO.

Agents are fine if no human is going to work on the (sub)system going forward, and you only care about the shiny exterior without opening the hood to witness the horrors within.

20. disgru+lS[view] [source] [discussion] 2025-05-15 10:47:05
>>tptace+Y8
> but many of the most common complaints about LLM coding (most notably hallucination) are essentially solved problems if you're using agents.

I have definitely not seen this in my experience (with Aider, Claude and Gemini). While helping me debug an issue, Gemini added a #!/bin/sh line to the middle of the file (which appeared to break things), and despite having that code in the context, didn't realise it was the issue.

OTOH, when asking for debugging advice in a chat window, I tend to get more useful answers, as opposed to a half-baked implementation that breaks other things. YMMV, as always.

21. tom_m+mz1[view] [source] [discussion] 2025-05-15 15:53:25
>>lexand+W6
Cursor is pretty bad in my experience. I don't know why, because I find Windsurf better and they both use Claude.

Regardless, Gemini 2.5 Pro is far, far better, and I use that with the free, open-source Roo Code. You can use the Gemini 2.5 Pro experimental model for free (rate-limited) to get a completely free experience and a taste for it.

Cursor was great and started it all off, but others took notice and now they're all more or less the same. It comes down to UX and preference, but personally I think Windsurf and Roo Code just did a better job here than Cursor.
