===
"Fix Claude's bug manually. Claude had a bug in the previous commit. I prompted it multiple times to fix the bug but it kept doing the wrong thing.
So this change is manually written by a human.
I also extended the README to discuss the OAuth 2.1 spec problem."
===
This is super relatable to my experience trying to use these AI tools. They can get halfway there and then struggle immensely.
Restart the conversation from scratch. As soon as you get something incorrect, begin from the beginning.
It seems to me like any mistake in a messages chain/conversation instantly poisons the output afterwards, even if you try to "correct" it.
So if something was wrong at one point, you need to go back to the initial message, and adjust it to clarify the prompt enough so it doesn't make that same mistake again, and regenerate the conversation from there on.
I haven't used Anthropic's models/software in a long time (months, basically forever in AI ecosystem), so don't know exactly how it works now.
But last time I used Claude, you could edit the first message, and then re-generate the assistants next message based on your edit. Most of the LLM interfaces has one or another way of doing this, I can't imagine they got rid of that feature.
What I'm suggesting isn't to use the exact same input (the first message), but rather change it so you remove the chances of something incorrect happening later after that.