The implied future here is _unreal cool_. Swarms of coding agents that can build anything, with little oversight. Long-running projects that converge on high-quality, complex projects.
But the examples feel thin. Web browsers, Excel, and Windows 7 exist, and they specifically exist in the LLM's training sets. The closest to real code is what they've done with Cursor's codebase .... but it's not merged yet.
I don't want to say, call me when it's merged. But I'm not worried about agents ability to produce millions of lines of code. I'm worried about their ability to intersect with the humans in the real world, both as users of that code and developers who want to build on top of it.
But what would be the point of re-creating existing applications? It would be useful if you can produce a better version of those applications. But the point in this experiment was to produce something "from scratch" I think. Impressive yes, but is it useful?
A more practically useful task would be for Mozilla Foundation and others to ask AI to fix all bugs in their application(s). And perhaps they are trying to do that, let's wait and see.
because it is absolutely impossible to review that code and there is gazillion issues there.
The only way it can get merged is YOLO and then fix issues for months in prod which kinda defeats the purpose and brings gains close to zero.
This is how I think about it. I care about asymptotics. What initial conditions (model(s) x workflow/harness x input text artefacts) causes convergence to the best steady state? The number of lines of code doesn't have to grow, it could also shrink. It's about the best output.
In my experience agents don't converge on anything. They diverge into low-quality monstrosities which at some point become entirely unusable.
There's just a bit over 3 browsers, 1 serious excel-like and small part of windows user side. That's really not enough for training for replicating those specific tasks.
I would go even further, why have they not created at least one less complex project that is working and ready to be checked out? To me it sounds like having a carrot dangle in front of the face of VC investors: 'Look, we are almost there to replace legions of software developers! Imagine the market size and potential cost reductions for companies.'
LLMs are definitely an exciting new tool and they are going to change a lot. But are they worth $B for everything being stamped 'AI'? The future will tell. Looking back the dotcom boom hype felt exactly the same.
The difference with the dotcom boom is that at the time there was a lot more optimism to build a better future. The AI gold rush seems to be focused on getting giga-rich while fscking the bigger part of humanity.