zlacker

I'd encourage you to try the -codex family with the highest reasoning.

I can't comment on Opus in CC because I've never bitten the bullet and paid for the subscription, but I have worked my way up to the $200/month Cursor subscription, and the 5.2 codex models blow Opus out of the water in my experience (obviously very subjective).

I arrived at making plans with Opus and then implementing with the OpenAI model. The speed of Opus is much better for planning.

I'm willing to believe that CC/Opus is truly the overall best; I'm only commenting because you mentioned Cursor, where I'm fairly confident it's not. I'm basing my judgement on "how frequently does it do what I want the first time".

replies(2): >>eadwu+M2 >>skapad+de

>>vercae+(OP)
I've tried nearly all the models. They all work best if and only if you will never handle the code ever again. They suck if you have a solution and want them to implement that solution.

I've tried explaining the implementation word for word, and it still prefers to create a whole new implementation, reimplementing some parts, instead of just doing what I tell it to. The only time it works is if I actually give it the code, but at that point there's no reason to use it.

There would be nothing wrong with this approach if it actually had guarantees, but current models are an extremely bad fit for it.

replies(2): >>teaear+Z3 >>vercae+K4

>>eadwu+M2
There are domains of programming (web front end) where lots of requests can be done pretty well even when you want them done a certain way. Not all, but enough to make it a great tool.

>>eadwu+M2
Yes, I only plan/implement on fully AI projects where it's easy for me to tell whether or not they're doing the thing I want regardless of whether or not they've rewritten the codebase.

For actual work that I bill for, I go in with instructions to do minimal changes, and then I carefully review/edit everything.

That being said, the "toy" fully-AI projects I work with have evolved to the point where I regularly accomplish things I never (never ever) would have without the models.

>>vercae+(OP)
Thanks, I'll try those out. I've used Codex CLI itself on a few small projects as well, and fired it up on a feature branch where I had it implement the same feature that Claude Code did (they didn't see each other's implementations). For that specific case, the implementation Codex produced was simpler, and better for the immediate requirements. However, Claude's more abstracted solution may have held up better to changing requirements. Codex feels more reserved than Claude Code, which can be good or bad depending on the task.

replies(1): >>vercae+DR2

>>skapad+de
This makes a lot of sense to me.

I've heard Codex CLI called a scalpel, and this resonates. You wouldn't use a scalpel for a major carving project.

To come back to my earlier comment, though, my main approach makes sense in this context. I let Opus do the abstract thinking, and then OpenAI's models handle the fine details.

On a side note, I've also spent a fair amount of time messing around in Codex CLI, as I have a Pro subscription. It rapidly becomes apparent that it does exactly what you tell it, even if an obvious improvement is trivial. Opus is on the other end of the spectrum here. You have to be fairly explicit with Opus, instructing it to not add spurious improvements.

replies(1): >>skapad+l74

>>vercae+DR2
"To come back to my earlier comment, though, my main approach makes sense in this context. I let Opus do the abstract thinking, and then OpenAI's models handle the fine details."

Very interesting. I'm going to try this out. Thanks!