My AI skeptic friends are all nuts

>>tablet+(OP)
I suspect a large proportion of claims made for productivity increases are skewed by the fact that the speed at which code is produced by AI makes you _feel_ productive, but these gains are largely replaced by the effort to understand, refactor, review and clean up the code. The high that you get when something "works" tends to stick more in your memory than the time when you had to spend a day cleaning up dead code, refactoring 2k line modules into a more readable project structure etc.

I'm not saying that AI can't make you productive, it's just that these claims are really hard to verify. Even the recently posted Cloudflare OAuth worker codebase took ~3 months to release (8 Mar - 20 May), producing a single file with >2k lines. Is that going to be harder to maintain than a codebase with a proper project structure that's easily parseable by a human?

>>meande+my
This is cope. I know my own experience, and I know the backgrounds and problem domains of the friends I'm talking to that do this stuff better than I do. The productivity gains are real. They're also intuitive: if you can't look at a work week and spot huge fractions of work that you're doing that isn't fundamentally discerning or creative, but rather muscle-memory rote repetition of best practices you've honed over your career, you're not trying (or you haven't built that muscle memory yet). What's happening is skeptics can't believe that an LLM plus a couple hundred lines of Python agent code can capture and replicate most of the rote work, freeing all that time back up.

Another thing I think people are missing is that serious LLM-using coders aren't expecting 100% success on prompts, or anything close to it. One of the skills you (rapidly) develop is the intuition for when to stop a runaway agent.

If an intern spun off hopelessly on a task, it'd be somewhat problematic, because there are finite intern hours and they're expensive. But failed agent prompts are nickel-denominated.

We had a post on the front page last week about someone doing vulnerability research with an LLM. They isolated some target code and wrote a prompt. Then they ran it one hundred times (preemptively!) and sifted the output. That approach finds new kernel vulnerabilities!

Ordinary developers won't do anything like that, but they will get used to the idea of only 2/3 of prompts ending up with something they merge.

Another problem I think a lot of skeptics are running into: stop sitting there staring at the chain of thought logs.

zlacker