zlacker

[return to "The Codex app illustrates the shift left of IDEs and coding GUIs"]
1. kace91+Ek 2026-02-04 21:59:04
>>strayd+(OP)
>The people really leading AI coding right now (and I’d put myself near the front, though not all the way there) don’t read code. They manage the things that produce code.

I can’t imagine any other example of people voluntarily moving to a black-box approach.

Imagine taking a picture on autoshot mode and refusing to look at it. If the client doesn’t like it because it’s too bright, tweak the settings and shoot again, but never look at the output.

What is the logic here? If you can read code, I can’t imagine that poking at the result with black-box testing is faster.

Are these people just handing off the review process to others? Are they unable to read code and hiding it? Why would you handicap yourself this way?

2. csalle+xn 2026-02-04 22:13:33
>>kace91+Ek
> Imagine taking a picture on autoshot mode and refusing to look at it. If the client doesn’t like it because it’s too bright, tweak the settings and shoot again, but never look at the output.

The output of code isn't just the code itself, it's the product. The code is a means to an end.

So the proper analogy isn't the photographer not looking at the photos, it's the photographer not looking at what's going on under the hood to produce the photos. Which, of course, is perfectly common and normal.

3. kace91+Mp 2026-02-04 22:24:36
>>csalle+xn
>The output of code isn't just the code itself, it's the product. The code is a means to an end.

I’ll bite. Is this person manually testing everything that one would normally unit test? Or writing black-box tests that they know are correct because they wrote them by hand?

If not, you’re not reviewing the product either. If yes, it’s less time-consuming to actually read and test the damn code.

4. Curiou+xr 2026-02-04 22:33:39
>>kace91+Mp
I mostly ignore code; I lean on specs + tests + static analysis. I spot check tests depending on how likely I think it is for the agent to have messed up or misinterpreted my instructions. I push very high test coverage on all my projects (85%+), and part of the way I build is "testing ladders", where I have the agent create progressively bigger integration tests until I hit e2e/manual validation.
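
Concretely, a ladder is just tiers of tests gated on each other plus a coverage floor. A minimal sketch of the runner (assuming pytest + pytest-cov; the tier markers, the src/ path, and the 85% floor are illustrative, not lifted from a real project):

    # run_ladder.py -- run the tiers in order; stop climbing at the first failure.
    # Sketch only: assumes pytest + pytest-cov, with tests tagged
    # @pytest.mark.unit / .integration / .e2e; paths and thresholds are examples.
    import subprocess
    import sys

    TIERS = ["unit", "integration", "e2e"]  # progressively bigger scopes

    def main() -> int:
        for tier in TIERS:
            print(f"== {tier} tests ==")
            # --cov-append accumulates coverage across rungs; the floor is checked at the end.
            code = subprocess.call(
                ["pytest", "-m", tier, "--cov=src", "--cov-append", "--cov-fail-under=0"]
            )
            if code != 0:
                print(f"{tier} rung failed; not climbing further.")
                return code
        # Whole ladder passed: enforce the overall coverage floor.
        return subprocess.call(["coverage", "report", "--fail-under=85"])

    if __name__ == "__main__":
        sys.exit(main())

The point is not this exact script: each rung has to pass before the next, bigger one runs, and the coverage floor only has to hold for the ladder as a whole.
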
5. kace91+Ct 2026-02-04 22:45:36
>>Curiou+xr
>I spot check tests depending on how likely I think it is for the agent to have messed up or misinterpreted my instructions

So a percentage of your code, based on your gut feeling, is left unseen by any human by the time you submit it.

Do you agree that this raises the chance of bugs slipping by? I don’t see how you wouldn’t.

And given that your code output is larger, the fraction of it that is buggy is higher, and (presumably) you ship faster, have you thought about what that compounds to in terms of the likelihood of incidents?
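
To put rough numbers on it (illustrative figures only, assuming each unreviewed change independently carries some fixed chance of shipping a bug):

    # Back-of-the-envelope compounding: made-up numbers, not measurements.
    def chance_of_at_least_one_bug(p_per_change, changes):
        return 1 - (1 - p_per_change) ** changes

    for n in (10, 50, 200):
        print(n, round(chance_of_at_least_one_bug(0.05, n), 2))
    # 10 -> 0.4, 50 -> 0.92, 200 -> 1.0

Even a small per-change bug rate gets you to near-certain incidents once the volume of unreviewed changes grows.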
