I'm not saying that AI can't make you productive; it's just that these claims are really hard to verify. Even the recently posted Cloudflare OAuth worker codebase took ~3 months to release (8 Mar - 20 May), and it produced a single file with >2k lines. Is that going to be harder to maintain than a codebase with a proper project structure that's easily parseable by a human?
Another thing I think people are missing is that serious LLM-using coders aren't expecting 100% success on prompts, or anything close to it. One of the skills you (rapidly) develop is the intuition for when to stop a runaway agent.
If an intern spun off hopelessly on a task, it'd be somewhat problematic, because there are finite intern hours and they're expensive. But failed agent prompts are nickel-denominated.
We had a post on the front page last week about someone doing vulnerability research with an LLM. They isolated some target code and wrote a prompt. Then they ran it one hundred times (preemptively!) and sifted the output. That approach finds new kernel vulnerabilities!
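To make the "run it many times and sift" idea concrete, here is a minimal sketch, not the method from that post. It assumes a hypothetical `run_prompt` callable that sends the prompt to whatever model you use and returns a list of finding strings; the normalization (strip and lowercase) is also just an illustrative choice.

```python
from collections import Counter
from typing import Callable, Iterable


def sift(
    run_prompt: Callable[[str], Iterable[str]],  # hypothetical: one model call, returns findings
    prompt: str,
    runs: int = 100,
) -> list[tuple[str, int]]:
    """Run the same prompt `runs` times and rank distinct findings by recurrence."""
    counts: Counter[str] = Counter()
    for _ in range(runs):
        # Normalize and dedupe within a run so one run counts a finding at most once.
        findings = {f.strip().lower() for f in run_prompt(prompt)}
        counts.update(findings)
    # Findings that recur across many independent runs float to the top.
    return counts.most_common()
```

Recurrence is only a triage signal here: a finding that shows up once can still be real, and one that shows up often can still be a hallucination, so the sifting step still needs a human.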
Ordinary developers won't do anything like that, but they will get used to the idea of only 2/3 of prompts ending up with something they merge.
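For a rough sense of what a 2/3 hit rate means in practice, here is a back-of-the-envelope sketch. It assumes each prompt succeeds independently with the same probability, which is a simplification (a task that failed once is probably harder than average), and the function names are made up for illustration.

```python
def p_at_least_one(p: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent prompts is mergeable."""
    return 1 - (1 - p) ** attempts


def expected_attempts(p: float) -> float:
    """Mean number of prompts until the first mergeable one (geometric distribution)."""
    return 1 / p


if __name__ == "__main__":
    p = 2 / 3  # the merge rate mentioned above
    print(expected_attempts(p))   # about 1.5 prompts per merged result
    print(p_at_least_one(p, 3))   # about 0.96 after three tries
```

Under those assumptions, a 1-in-3 failure rate costs about half an extra prompt per merged change on average, which is cheap when failed prompts cost nickels.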
Another problem I think a lot of skeptics are running into: they sit there staring at the chain-of-thought logs. Stop doing that.