zlacker

What kind of time frame do you ballpark this would have taken you on your own?

I know it's a little apples-and-oranges (you and the agent wouldn't produce the exact same thing), but I'm not asking because I'm interested in the man-hour savings. Rather, I want to get a perspective on what kind of expertise went into the guidance (without having to read all the guidance and be familiar with browser implementation myself). "How long this would have taken the author" seems like one possible proxy for "how much pre-existing experience went into this agent's guidance".

replies(2): >>simonw+22 >>embedd+34

>>happyt+(OP)
I have a fun little tool which runs the year-2000-era sloccount algorithm (which is Perl and C so I run it in WebAssembly) to estimate the time and cost of a project here: https://tools.simonwillison.net/sloccount

If you paste https://github.com/embedding-shapes/one-agent-one-browser into the "GitHub Repository" tab it estimates 4.58 person-years and $618,599 by year-2000 standards, or 5.61 years and $1,381,079 according to my very non-trustworthy 2025 estimate upgrade.

replies(1): >>pizlon+k21

>>happyt+(OP)
> What kind of time frame do you ballpark this would have taken you on your own?

I don't think I'd be able to do this on my own. Not because I don't know Rust, but because I don't know X11 (or macOS or Windows) well enough to even know where to begin.

I've been a Linux user for almost two decades, so I know my way around my system, but I've never developed X11 applications or anything like that. I'm mostly a web developer who has jumped around various roles through the years. I've spent a lot of time caring deeply about testing, infrastructure, architecture/design and communication between humans, which might have given me a slight edge in programming together with agents.

replies(1): >>happyt+Ew

>>embedd+34
Hmm, well I'm more interested in the browser part rather than the windowing part - I feel like it makes more sense that LLMs can be somewhat competent with windowing frameworks even if the prompter is not super experienced. Regardless, there's probably not a concise way to get what I'm looking for - instead, I'm looking forward to seeing your config/input! I'm super curious.

replies(1): >>embedd+BC

>>happyt+Ew
Ah :) On the browser part, I've spent huge chunks of time inside the browser viewport as a frontend engineer, also as a backend engineer and finally managing infrastructure, but never much inside browser internals: painting, layout and that sort of stuff. I wouldn't even say that frontend performance (layout thrashing, re-calculating layouts, etc.) is my forte; I've mostly been focusing on being able to mold codebases into something that doesn't turn into spaghetti after a year of various developers working on it.

The prompts themselves were basically "I'd like this website to render correctly: https://medium.com, here's how it looks for me in Firefox with JavaScript turned off: [Image], figure out what features are missing, add them one-by-one, add regression tests and follow REQUIREMENTS.md and AGENTS.md closely" and various iterations/variations of that, so I didn't explicitly ask it to implement specific CSS/HTML features, as far as I can remember. Maybe I did in the first 2-3 prompts. I'll upload all the session files in a viewable way so everyone can see for themselves what exactly went on :)

>>simonw+22
I pasted a subset of the Fil-C source code into your tool and it says 6 person years. I just pasted the compiler pass and the obvious parts of the runtime.

Note that I started the project in Nov 2023 and can only work on it maybe 1-2 hours a day because it's just a side project.

So I think your tool either estimates based on very bad programmers, or it's just wrong. Or maybe 10x programmers are real and I am him

replies(2): >>simonw+9b1 >>lifthr+cE1

>>pizlon+k21
Here's more about the COCOMO model it uses: https://dwheeler.com/sloccount/sloccount.html#cocomo
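
For anyone who doesn't want to click through, here is a minimal sketch of the Basic COCOMO ("organic" mode) arithmetic that the SLOCCount documentation describes as its default. The function name and dictionary keys are made up for illustration, and the salary and overhead constants are SLOCCount's documented year-2000 defaults as I understand them, so treat them as assumptions and check them against the linked page.

```python
def sloccount_estimate(sloc, salary=56286.0, overhead=2.4):
    """Rough Basic COCOMO (organic mode) estimate from physical source lines.

    Hypothetical helper for illustration only. Constants are assumed to
    match SLOCCount's defaults: effort = 2.4 * KSLOC^1.05 person-months,
    schedule = 2.5 * effort^0.38 months, cost = person-years * salary * overhead.
    """
    ksloc = sloc / 1000.0
    effort_person_months = 2.4 * ksloc ** 1.05
    schedule_months = 2.5 * effort_person_months ** 0.38
    person_years = effort_person_months / 12.0
    return {
        "person_years": person_years,
        "schedule_years": schedule_months / 12.0,
        "cost_usd": person_years * salary * overhead,
    }


if __name__ == "__main__":
    # 20,000 lines is an arbitrary example input, not a measured figure.
    for key, value in sloccount_estimate(20_000).items():
        print(f"{key}: {value:,.2f}")
```

The only input is a line count, so the model knows nothing about who wrote the code or how; everything else is a fixed constant.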

replies(1): >>pizlon+4c1

>>simonw+9b1
Sounds like nonsensical pseudoscience

replies(1): >>simonw+ky2

>>pizlon+k21
These metrics necessarily have to underestimate programmer skill, because skill is not directly controllable. If there is any sort of rigor in these metrics (I don't know if COCOMO is one of them), they will probably assume, say, a mundane programmer whose performance is worse than 90/95/99% of all other programmers.

>>pizlon+4c1
I don't take those results very seriously myself, but have you seen anything better?

replies(1): >>pizlon+sc3

>>simonw+ky2
No

To me this is a case where knowing that you don't have data is better than having data and pretending it means anything