Related: https://simonwillison.net/2026/Jan/27/one-human-one-agent-on...
After three days, I have it working at around 20K LOC, of which ~14K is the browser engine itself + X11, and 6K is just Windows+macOS support.
Source code + CI built binaries are available here if you wanna try it out: https://github.com/embedding-shapes/one-agent-one-browser
Here's my own screenshot of it rendering my blog - https://bsky.app/profile/simonwillison.net/post/3mdg2oo6bms2... - it handles the layout and CSS gradients really well, renders the SVG feed icon but fails to render a PNG image.
I thought "build a browser that renders HTML+CSS" was the perfect task for demonstrating a massively parallel agent setup because it couldn't be productively achieved in a few thousand lines of code by a single coding agent. Turns out I was wrong!
Just the day(s) before, I was thinking about this too, and I think what will make the biggest difference is humans who possess "Good Taste". I wrote a bunch about it here: https://emsh.cat/good-taste/
I think the ending is most apt, and where I think we're going wrong right now:
> I feel like we're building the wrong things. The whole vibe right now is "replace the human part" instead of "make better tools for the human part". I don't want a machine that replaces my taste, I want tools that help me use my taste better; see the cut faster, compare directions, compare architectural choices, find where I've missed things, catch when we're going into generics, and help me make sharper intentional choices.
I think the focus with LLM-assisted coding for me has been just that, assisted coding, not trying to replace whole people. It's still me and my ideas driving (and my "Good Taste", explained here: https://emsh.cat/good-taste/), while the LLM does all the things I find more boring.
> prove that no reference implementation code leaked into the produced code
Hmm, yeah, I'm not 100% sure how to approach this, open to ideas. Basic text comparison feels like it'd be too dumb; using an LLM for it might work, letting it reference the other codebases perhaps. Honestly, I don't know how I'd do that.
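One direction could be classic code-clone fingerprinting, along the lines of what MOSS does. A minimal sketch, assuming a naive tokenizer and a made-up 5-gram window; a high overlap would only flag files for manual review, not prove leakage:

    // Sketch: compare token 5-gram fingerprints between generated code and a
    // reference implementation. High Jaccard overlap = "look closer", nothing more.
    use std::collections::HashSet;

    fn fingerprints(src: &str, n: usize) -> HashSet<Vec<String>> {
        // Naive tokenizer: split on non-alphanumerics. A real one would also
        // normalize identifiers so renamed variables still match.
        let tokens: Vec<String> = src
            .split(|c: char| !c.is_alphanumeric())
            .filter(|t| !t.is_empty())
            .map(str::to_lowercase)
            .collect();
        tokens.windows(n).map(|w| w.to_vec()).collect()
    }

    fn similarity(a: &str, b: &str) -> f64 {
        let (fa, fb) = (fingerprints(a, 5), fingerprints(b, 5));
        let inter = fa.intersection(&fb).count() as f64;
        let union = fa.union(&fb).count() as f64;
        if union == 0.0 { 0.0 } else { inter / union }
    }

    fn main() {
        let ours = "fn layout_block(node: &Node) -> Rect { for child in node.children { } }";
        let reference = "fn layout_block_box(n: &LayoutNode) -> Rect { for kid in n.kids { } }";
        println!("overlap: {:.2}", similarity(ours, reference));
    }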
> And finally, this being the work product of an AI process you can't claim copyright, but someone else could claim infringement so beware of that little loophole.
Good point to be aware of, and I guess by instinct I didn't actually add any license to this project. I thought of adding MIT as I usually do, but I didn't actually make any of this, so I ended up not assigning any license. Worst-case scenario, I guess most jurisdictions would deem either no copyright or that I (implicitly) hold copyright. Guess we'll take that if we get there :)
https://github.com/LadybirdBrowser/ladybird/blob/master/CONT...
It's great to see him make this. I didn't know that he had a blog, but it looks good to me. Bookmarked now.
I feel like although Cursor burned $5 million on their experiment [0], we saw that, and now we have Embedding Shapes' takeaway.
If one person with one agent can produce equal or better results than "hundreds of agents for weeks", then the answer to the question: "Can we scale autonomous coding by throwing more agents at a problem?", probably has a more pessimistic answer than some expected.
Effectively, to me this feels like it answers the question of what happens if we have thousands of AI agents building a complex project autonomously with no human. That idea seems dead now. Humans being in the loop gives much higher productivity and a better end result.
I feel like the lure behind the Cursor project was to find out if it's able to replace humans completely in an extremely large project, and the answer right now is no (and I have a feeling [bias?] that the answer's gonna stay that way).
Emsh, I have a question though: can you tell me about your background, if possible? Have you been involved in browser development or any related endeavours, or was this a first for you? From what I can tell from talking with you, I do feel like the answer is yes, that you have worked in the browser space, but I am still curious to know.
A question coming to my mind: how big would the difference be between 1 expert human + 1 agent, 1 non-expert (say, a junior dev) + 1 agent, and 1 complete non-expert (a normal/less techie person) + 1 agent?
What are your predictions on it?
How would the economics of becoming an "expert", or becoming a jack of all trades (junior dev) in a field, fare with this new technology/toy that we've got?
How much productivity gain would there be from non-expert -> junior dev, and the same question for junior -> senior dev, in this particular context?
[0] Cursor Is Lying To Developers… : https://www.youtube.com/watch?v=U7s_CaI93Mo
Also, someone made a similar comment not too long ago, so people surely are curious if this is possible. Kinda surprised this project's submission didn't get popular.
If you paste https://github.com/embedding-shapes/one-agent-one-browser into the "GitHub Repository" tab it estimates 4.58 person-years and $618,599 by year-2000 standards, or 5.61 years and $1,381,079 according to my very non-trustworthy 2025 estimate upgrade.
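For what it's worth, those numbers line up with the classic organic-mode COCOMO formula (effort in person-months = 2.4 × KLOC^1.05), which I assume is what the calculator uses; the wage and overhead constants below are my guess, not the tool's documented values:

    // Back-of-envelope COCOMO, organic mode. The wage and overhead constants
    // are assumptions for illustration, not verified against the calculator.
    fn main() {
        let kloc = 20.0_f64;                        // ~20K LOC, per the author
        let person_months = 2.4 * kloc.powf(1.05);  // ~55.8
        let person_years = person_months / 12.0;    // ~4.65, close to the 4.58 above
        let cost = person_years * 56_286.0 * 2.4;   // assumed avg wage x overhead
        println!("{person_years:.2} person-years, ~${cost:.0}");
    }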
> I get to evaluate on stuff like links being consistently blue and underlined
Yeah, this browser doesn't have a "default stylesheet" like a regular browser. Probably should have added that, but I was mostly just curious about rendering websites as they come from the web, rather than using what browsers think the web should look like.
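(For context: the consistently-blue-and-underlined links come from the user-agent stylesheet that real browsers prepend before any page CSS. A hypothetical sketch of what bolting one on could look like; this repo has no such hook, the function below is invented for illustration:)

    // Hypothetical: prepend a minimal user-agent stylesheet so page rules
    // still win the cascade at equal specificity. Colors are the classic defaults.
    const DEFAULT_UA_CSS: &str = "a { color: #0000EE; text-decoration: underline; }\n\
        a:visited { color: #551A8B; }";

    fn effective_css(page_css: &str) -> String {
        format!("{DEFAULT_UA_CSS}\n{page_css}")
    }

    fn main() {
        // Page rule overrides the UA blue, as in a real cascade.
        println!("{}", effective_css("a { color: red; }"));
    }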
> It may be that some of the rendering is not supported on windows- the back button certainly isn't.
Hmm, on Windows 11 the back button should definitely work, tried that just last night. Are you perhaps on Windows 10? I haven't tried that myself; it should work, but that might be why.
https://bsky.app/profile/emsh.cat/post/3mdgobfq4as2p
But basically I got curious, and you can see from my other comments to you how much I love golang, so I decided to port the project from rust to golang, and emsh predicts that the project's codebase could even shrink to 10k!
(one point though: I don't have CC (Claude Code), so I'm trying it out with the recently released Kimi k2.5 model and their Kimi Code tool, which I figured would also show the real-world use case of an open-source model!)
Edit: I had written this comment just 2 minutes before you wrote yours, but then I decided to start on the golang project.
I mean, I think I ate through all of my 200 queries in Kimi Code, and it now does display a (browser?) window. I used the shell script to test your website, but it only opens up blank.
I am gonna go sleep so that the 5 hour limits can get recharged again and I will continue this project.
I think it will be really interesting to see this project in golang; there must be a good reason for emsh to say the project can be ~10k in golang.
> since getting the agent to self-select the right scope is usually the main bottleneck
I haven't found this to ever be the bottleneck; what agent and model are you using?
I kind of left the agents to do what they wanted just asking for a port.
Your website does look rotated and the image is the only thing visible in my golang port.
Let me open source it & I will probably try to hammer it some more after I wake up to see how good Kimi is in real world tasks.
https://github.com/SerJaimeLannister/golang-browser
I must admit that it's not working right now. I'm even unable to replicate the earlier state where your website displayed at first, even though really glitchy and with the image zoomed in; now it's only white. Although, oops, looks like I forgot the i in your name and wrote willson instead of willison, as I wasn't wearing specs. Sorry about that.
Now let me see... yeah, now it's displaying something, which is extremely glitchy.
https://github.com/SerJaimeLannister/golang-browser/blob/mai...
I have a file to show how glitchy it is. If anything, I just want someone to tinker around with whether a golang project can reasonably be made out of this rust project.
Simon, I see that you were also interested in go vibe coding haha, and this project has independent tests too! Perhaps you can try this out as well and see how it goes! It would be interesting to see what comes of it!
Alright time for me to sleep now, good night!
The prompts themselves were basically "I'd like this website to render correctly: https://medium.com, here's how it looks for me in Firefox with JavaScript turned off: [Image], figure out what features are missing, add them one-by-one, add regression tests and follow REQUIREMENTS.md and AGENTS.md closely" and various iterations/variations of that, so I didn't expressly ask it to implement specific CSS/HTML features, as far as I can remember. Maybe the first 2-3 prompts I did. I'll upload all the session files in a viewable way so everyone can see for themselves what exactly went on :)
https://tinyapps.org/network.html
Of course, "AI-generated browser is 1MB" is neither here nor there.
Yeah, that's obviously a lot harder, but doable. I've built it for clients, since they pay me, but haven't launched/made public something of my own where I could share the code. I guess that might be a useful next project now.
> This is just, yet another, proof-of-concept.
It's not even a PoC, it's a demonstration of how far off the mark Cursor are with their "experiment" where they were amazed by what "hundreds of agents" built over week(s).
> there's no telling how closely the code mirrors existing open-source implementations if you aren't versed on the subject
This is absolutely true, I tried to get some better answers on how one could even figure that out here: >>46784990
That transcript viewer itself is a pretty fun novel piece of software, see https://github.com/simonw/claude-code-transcripts
Denobox https://github.com/simonw/denobox is another recent agent project which I consider novel: https://orphanhost.github.io/?simonw/denobox/transcripts/ses...
Unfortunately, this context is kind of implicit; I don't actually mention it in the blog post, which I probably should have done. That's my fault.
My comment on the cursor post for context: >>46625491
I placed some specifications + WPT into the repository the agent had access to! https://github.com/embedding-shapes/one-agent-one-browser/tr...
But judging by the session logs, it doesn't seem like the agent saw them; I never pointed it there, and it seems none of the searches returned anything from there.
I'm slightly curious about doing it from scratch again, but this time explicitly pointing it to the specifications, and seeing if it gets better or worse.
- Clear code structure and good architecture (modular approach reminiscent of Blitz, but not as radical; a Blitz-lite).
- Very easy to follow the code and understand how the main render loop works:
- For Mac: main loop is at https://github.com/embedding-shapes/one-agent-one-browser/blob/master/src/platform/macos/windowed.rs#L74
- You can see clearly how UI events are passed to the App to handle.
- App::tick allows the app to handle internal events (Servoshell does something similar with `spin_event_loop` at https://github.com/servo/servo/blob/611f3ef1625f4972337c247521f3a1d65040bd56/components/servo/servo.rs#L176)
- If a redraw is needed, the main render logic is at https://github.com/embedding-shapes/one-agent-one-browser/blob/master/src/platform/macos/windowed.rs#L221 and calls into `render` of App, which computes a display list (layout) and then translates it into commands to the generic painter, which internally turns those into platform-specific graphics operations.
- It's interesting how the painter for Mac uses Cocoa for graphics; very different from Servo, which uses WebRender, or Blitz, which (in some paths) uses Vello (itself using wgpu). I'd say using Cocoa like that might be closer to what React Native does (experts to confirm this, please?). Btw, this kind of platform-specific binding is a strength of AI coding (and a real pain to do by hand).
- Nice modularity between the platform and browser app parts, achieved with the App and Painter traits (see the sketch after this list).
How to improve it further? I'd say try to map how the architecture corresponds to Web standards, such as https://html.spec.whatwg.org/multipage/webappapis.html#event...
Wouldn't have to be precise and comprehensive, but for example parts of App::tick could be documented as an initial attempt to implement a part of the web event-loop and `render` as an attempt at implementing the update-the-rendering task.
You could also split the web engine part from the app embedding it in a similar way to the current split between platform and app.
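To make that seam concrete, here's a rough sketch of what the App/Painter split described above could look like; the method names and types are simplified guesses based on this review, not the repo's actual signatures:

    // Simplified sketch: the platform layer owns the window and event loop,
    // App owns browser state, and Painter hides platform graphics
    // (Cocoa on macOS, X11 on Linux, ...). All names here are illustrative.
    enum UiEvent { Click { x: f64, y: f64 }, KeyDown(char), BackButton }

    enum PaintCmd { Rect { x: f64, y: f64, w: f64, h: f64 }, Text { x: f64, y: f64, s: String } }

    trait Painter {
        fn submit(&mut self, cmds: &[PaintCmd]); // platform-specific drawing
    }

    trait App {
        fn handle(&mut self, ev: UiEvent);     // UI events handed over by the platform
        fn tick(&mut self) -> bool;            // internal events; returns "needs redraw"
        fn render(&mut self) -> Vec<PaintCmd>; // layout -> display list -> paint commands
    }

    // Each platform main loop then reduces to roughly:
    fn frame(app: &mut dyn App, painter: &mut dyn Painter, events: Vec<UiEvent>) {
        for ev in events {
            app.handle(ev);
        }
        if app.tick() {
            painter.submit(&app.render());
        }
    }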
Far superior, and more cost-effective, than the attempt at scaling autonomous agent coding pursued by Fastrender. Shows how the important part isn't how many agents you can run in parallel, but rather how good an idea the human overseeing the project has (or rather: develops).