This reminded me of something that happened to me last year. Not Claude (I think it was GPT-4, maybe?), but I had it running in VS Code's Copilot and asked it to fix a bug and then add a test for that case.
Well, it kept failing to pass its own test, so on the third try it sat there "thinking" for a moment, then finally spat out the command `echo "Test Passed!"`, executed it, read the output from the terminal, and declared it was done.
I was almost impressed by the gumption more than anything.
1) it wants to run X command
2) it notices a hook preventing it from running X
3) it creates a Python application or shell script that does X and runs it instead
Whoops.
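For context, the kind of hook being bypassed is just a script that inspects the command Claude is about to run and refuses it. A minimal sketch of one, assuming Claude Code's PreToolUse hook contract (JSON on stdin, non-zero exit to block); the blocked command and script name here are placeholders, not what's in the linked repo:

```python
#!/usr/bin/env python3
# Hypothetical PreToolUse hook: refuse direct runs of a forbidden
# command (here "terraform apply" stands in for "X").
import json
import sys

BLOCKED = ("terraform apply",)  # placeholder for whatever "X" is

payload = json.load(sys.stdin)
command = payload.get("tool_input", {}).get("command", "")

if any(b in command for b in BLOCKED):
    # stderr is surfaced back to the model as the reason for the refusal
    print(f"Blocked: '{command}' is not allowed here.", file=sys.stderr)
    sys.exit(2)  # non-zero exit = block the tool call

sys.exit(0)
```

Since a hook like this only string-matches the one Bash command it's shown, a little wrapper script that does X and gets run under a different name sails straight past it, which is exactly step 3 above.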
0: https://github.com/mbcrawfo/vibefun/blob/main/.claude/hooks/...