The CEO stated "We built a browser with GPT-5.2 in Cursor"
instead of
"By dividing agents into planners and workers, we managed to keep them busy for weeks creating thousands of commits to the main branch, resolving merge conflicts along the way. The repo is 1M+ lines of code, but the code does not work (yet)."
[0] https://cursor.com/blog/scaling-agents
[1] https://x.com/kimmonismus/status/2011776630440558799
[2] https://x.com/mntruell/status/2011562190286045552
[3] https://www.reddit.com/r/singularity/comments/1qd541a/ceo_of...
If you view the PRs, they bundle multiple fixes together, at least according to the commit messages. The next hurdle will be adding guardrails so that each agent implements only one task and doesn't cheat by modifying the CI pipeline.
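As a rough illustration of the CI guardrail, here is a minimal sketch of a pre-merge check that rejects an agent's changeset if it touches CI configuration. The protected path patterns and the function names are my own assumptions for illustration, not anything taken from the Cursor repo or its tooling.

```python
from fnmatch import fnmatch

# Assumed protected locations; replace with whatever the repo's CI actually uses.
PROTECTED_PATTERNS = [
    ".github/workflows/*",
    ".gitlab-ci.yml",
    "ci/*",
]


def protected_files(changed_paths, patterns=PROTECTED_PATTERNS):
    """Return the changed paths that match any protected pattern."""
    return [
        path
        for path in changed_paths
        if any(fnmatch(path, pattern) for pattern in patterns)
    ]


def check_changeset(changed_paths):
    """Return (ok, offending) for a hypothetical agent-authored changeset."""
    offending = protected_files(changed_paths)
    return (not offending, offending)
```

A harness could feed this the file list from `git diff --name-only` and refuse to merge when `ok` is false. It only covers the "don't touch CI" half; enforcing one task per PR would need something else, such as checking the diff against the task description.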
So agents will actually be able to build a {browser, library, etc.} that isn't an absolute slopfest; the crucial question is when. That requires better and more efficient RL training, further scaling (Amodei thinks scaling is really the only thing you technically need here, and that we have about 3-4 orders of magnitude of headroom left before we hit insurmountable limits), bigger context windows (that models actually handle well), and possibly continual learning paradigms, but solutions to these problems are quite tangible now.