Show HN: One Human + One Agent = One Browser From Scratch in 20K LOC

>>embedd+(OP)
This is not that impressive, there are numerous examples of browsers for training data to reference.

>>deadba+wq2
I don't buy this.

It implies that the agents could only do this because they could regurgitate previous browsers from their training data.

Anyone who's watched a coding agent work will see why that's unlikely to be what's happening. If that's all they were doing, why did it take three days and thousands of changes and tool calls to get to a working result?

I also know that AI labs treat regurgitation of training data as a bug and invest a lot of effort into making it unlikely to happen.

I recommend avoiding the temptation to look at things like this and say "yeah, that's not impressive, it saw that in the training data already". It's not a useful mental model to hold.

>>simonw+9w2
It took three days because... agents suck.

But yes, with enough prodding they will eventually build you something that's been built before. Don't see why that's particularly impressive. It's in the training data.

>>deadba+XV2
Not a useful mental model.

>>simonw+923
It is useful. If you can whip up something complex fairly quickly with an AI agent, it’s likely because it’s already been done before.

But if even the AI agent seems to struggle, you may be doing something unprecedented.

>>deadba+t33
Except if you spend quality time with coding agents you realize that's not actually true.

They're equally useful for novel tasks because they don't work by copying large scale patterns from their training data - the recent models can break down virtually any programming task to a bunch of functions and components and cobble together working code.

If you can clearly define the task, they can work towards a solution with you.

The main benefit of concepts already in the training data is that it lets you slack off on clearly defining the task. At that point it's not the model "cheating", it's you.

>>simonw+Q53
Simon, do you happen to have some concrete examples of a model doing a great job at a clearly novel, clearly non-trivial coding task?

I'd find it very interesting to see some compelling examples along those line.

>>aix1+nb3
I think datasette-transactions https://github.com/datasette/datasette-transactions is pretty novel. Here's the transcript where Claude Code built it: https://gisthost.github.io/?a41ce6304367e2ced59cd237c576b817...

That transcript viewer itself is a pretty fun novel piece of software, see https://github.com/simonw/claude-code-transcripts

Denobox https://github.com/simonw/denobox is another recent agent project which I consider novel: https://orphanhost.github.io/?simonw/denobox/transcripts/ses...

zlacker