zlacker

I mostly ignore the code itself; I lean on specs + tests + static analysis. I spot check tests depending on how likely I think it is for the agent to have messed up or misinterpreted my instructions. I push for very high test coverage on all my projects (85%+), and part of the way I build is with "testing ladders": I have the agent create progressively bigger integration tests until I hit e2e/manual validation.
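To make the "testing ladder" idea concrete, here is a minimal sketch of one possible shape for it. Everything in it is an assumption for illustration: the functions under test (`parse_price`, `apply_discount`, `checkout_total`), the rung names, and the `climb` runner are hypothetical, not taken from any real project.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical code under test.
def parse_price(text: str) -> int:
    """Parse a price like '$12.50' into cents."""
    dollars, _, cents = text.lstrip("$").partition(".")
    return int(dollars) * 100 + int((cents or "0").ljust(2, "0")[:2])

def apply_discount(cents: int, percent: int) -> int:
    """Apply a whole-number percentage discount, rounding down."""
    return cents * (100 - percent) // 100

def checkout_total(prices: List[str], percent: int) -> int:
    """Sum the parsed prices, then discount the total."""
    return apply_discount(sum(parse_price(p) for p in prices), percent)

# Rung 1: each function on its own.
def rung_unit() -> None:
    assert parse_price("$12.50") == 1250
    assert apply_discount(1000, 10) == 900

# Rung 2: two functions wired together.
def rung_integration() -> None:
    assert apply_discount(parse_price("$20"), 25) == 1500

# Rung 3: the whole flow, closest to what a user would exercise.
def rung_end_to_end() -> None:
    assert checkout_total(["$12.50", "$7.5"], 10) == 1800

LADDER: List[Tuple[str, Callable[[], None]]] = [
    ("unit", rung_unit),
    ("integration", rung_integration),
    ("end_to_end", rung_end_to_end),
]

def climb(ladder: List[Tuple[str, Callable[[], None]]]) -> Tuple[List[str], Optional[str]]:
    """Run rungs in order; stop at the first failing rung and report it."""
    passed: List[str] = []
    for name, rung in ladder:
        try:
            rung()
        except AssertionError:
            return passed, name
        passed.append(name)
    return passed, None
```

Stopping at the first failing rung means a failure in a small test is reported before the bigger, noisier tests that depend on it.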

replies(2): >>kace91+52 >>strayd+W3

>>Curiou+(OP)
>I spot check tests depending on how likely I think it is for the agent to have messed up or misinterpreted my instructions

So a percentage of your code, based on your gut feeling, is left unseen by any human by the time you submit it.

Do you agree that this raises the chance of bugs slipping by? I don’t see how you wouldn’t.

And considering that your code output is larger, the percentage of it that is buggy is larger, and (presumably) you write faster, have you considered what this implies for the compounding likelihood of incidents?
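A rough way to see the compounding, sketched under a strong simplifying assumption: every change carries the same independent probability of introducing an incident. The function name and any numbers fed to it are illustrative only, not measurements.

```python
def p_at_least_one_incident(p_per_change: float, n_changes: int) -> float:
    """Probability that at least one of n independent changes causes an incident.

    Assumes each change fails independently with the same probability,
    which real codebases will not satisfy exactly.
    """
    return 1.0 - (1.0 - p_per_change) ** n_changes
```

Under this model, raising either the per-change probability or the number of changes pushes the result toward 1, so shipping more code with less review raises it on both axes.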

replies(1): >>Curiou+8l

>>Curiou+(OP)
"Testing ladders" is a great framing.

My approach is similar. I invest in the harness layer (tests, hooks, linting, pre-commit checks). The code review still happens; it just happens through tooling rather than my eyeballs.
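As a minimal sketch of what a "harness layer" gate could look like, the helper below runs a set of check commands and reports which ones failed. The `run_gate` function and the `CHECKS` table are hypothetical, and the commands are stand-ins that only invoke the Python interpreter so the sketch runs anywhere; a real setup would substitute its own linter, test runner, and hook commands.

```python
import subprocess
import sys
from typing import Dict, List

def run_gate(checks: Dict[str, List[str]]) -> List[str]:
    """Run each check command; return the names of the checks that failed."""
    failed: List[str] = []
    for name, cmd in checks.items():
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:  # non-zero exit code means the check failed
            failed.append(name)
    return failed

# Placeholder commands (assumptions): swap in the project's real tools.
CHECKS: Dict[str, List[str]] = {
    "lint": [sys.executable, "-c", "pass"],
    "tests": [sys.executable, "-c", "assert 1 + 1 == 2"],
}
```

A pre-commit hook or CI step could call `run_gate(CHECKS)` and block the change whenever the returned list is non-empty.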

>>kace91+52
There's definitely a class of bugs that is a lot more common, where the code deviates from the intent in some subtle way while still being functional. I deal with this through benchmarking and heavy dogfooding; both really expose errors and rough edges well.
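For the benchmarking half, one possible shape is a small regression check against a stored baseline. This is a sketch under assumptions: `measure`, `regressed`, and the 20% default tolerance are hypothetical choices, not a description of any particular setup.

```python
import time
from typing import Callable

def measure(fn: Callable[[], object], repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

def regressed(baseline_s: float, measured_s: float, tolerance: float = 0.20) -> bool:
    """True if the measurement is slower than the baseline by more than `tolerance`."""
    return measured_s > baseline_s * (1.0 + tolerance)
```

Code that is functionally correct but takes an unintended path (an accidental quadratic loop, a skipped cache) can pass its tests yet still trip a check like this.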