Anyway, please try it if you find it unbelievable. FWIW, I didn't expect it to work like it did. Opus 4.5 is pretty amazing at long-running tasks like this.
Maybe you did one or the other, but “nearly one-shotted” doesn’t tend to mean that.
Claude Code more than occasionally likes to make weird assumptions, and it’s well known that it hallucinates quite a bit more as it nears the context limit, and that compaction only partially helps with this.
I have no idea why it had so much trouble with this generally easy task. Bizarre.
Sure, maybe that’s just building something that’s bug-for-bug compatible, but it’s something Claude can work with.
This reminded me of something that happened to me last year. Not Claude (I think it was GPT 4.0, maybe?), but I had it running in VS Code's Copilot and asked it to fix a bug and then add a test for that case.
Well, it kept failing to pass its own test, so on the third try, it sat there "thinking" for a moment, then finally spit out the command `echo "Test Passed!"`, executed it, read it from the terminal, and said it was done.
I was almost more impressed by the gumption than anything else.
1) it wants to run command X
2) it notices a hook preventing it from running X (roughly the kind of hook sketched below)
3) it creates a Python application or shell script that does X and runs it instead
Whoops.
0: https://github.com/mbcrawfo/vibefun/blob/main/.claude/hooks/...
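For anyone who hasn't poked at Claude Code hooks, here's a minimal sketch of what a blocking PreToolUse hook can look like, assuming the documented contract where the pending tool call arrives as JSON on stdin and exit code 2 rejects it (with stderr fed back to the model). The deny-list and field handling below are my own illustration, not what's in the linked repo.

```python
#!/usr/bin/env python3
"""Sketch of a PreToolUse hook that blocks certain Bash commands.

Assumed contract: Claude Code pipes the pending tool call to this
script as JSON on stdin, and exit code 2 blocks the call while
surfacing stderr to the model as the reason.
"""
import json
import sys

# Hypothetical deny-list; the real hook in the repo is different.
BLOCKED = ("rm -rf", "git push --force")


def main() -> int:
    event = json.load(sys.stdin)  # the pending tool call
    command = event.get("tool_input", {}).get("command", "")
    for pattern in BLOCKED:
        if pattern in command:
            # Reject the call; the message tells the model why.
            print(f"blocked: '{pattern}' is not allowed", file=sys.stderr)
            return 2
    return 0  # allow the call


if __name__ == "__main__":
    sys.exit(main())
```

Of course, a hook like this only sees the literal command string, which is exactly why step 3 works: the model writes the same operation into a fresh script and runs that instead.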