
Cursor's latest “browser experiment” implied success without evidence

submitted by embedd+(OP) on 2026-01-16 14:37:49 | 724 points 309 comments

Related: Scaling long-running autonomous coding - https://news.ycombinator.com/item?id=46624541 - Jan 2026 (174 comments)


NOTE: showing posts with links only
1. embedd+c4[view] [source] 2026-01-16 15:02:17
>>embedd+(OP)
I'm eager to find out if this was actually successfully compiled at one point (otherwise how did they get the screenshots?), so I'm running `cargo check` for each of the last 100 commits to see if anything works. Will update here with the results once it's ready.

Edit: As mentioned, I ran `cargo check` on each of the last 100 commits, and it seems every single one of them failed in some way: https://gist.github.com/embedding-shapes/f5d096dd10be44ff82b...
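
A minimal sketch of how a per-commit survey like this could be scripted. It is not the commenter's actual script (their results are in the linked gist). The helper names and the repository path are assumptions made for illustration; only `git log`, `git checkout` and `cargo check` are real commands.

```python
import subprocess


def last_commits(repo, n=100):
    """Return the hashes of the last n commits on the current branch, newest first."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "-n", str(n), "--format=%H"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.split()


def cargo_check_passes(repo, commit):
    """Check out one commit (detached HEAD) and report whether `cargo check` exits 0."""
    subprocess.run(
        ["git", "-C", repo, "checkout", "--quiet", "--detach", commit],
        check=True,
    )
    return subprocess.run(["cargo", "check", "--quiet"], cwd=repo).returncode == 0


def survey(commits, passes):
    """Map each commit hash to True/False using the supplied check function."""
    return {commit: passes(commit) for commit in commits}


def summarize(results):
    """Turn the survey result into a one-line summary."""
    ok = sum(results.values())
    return f"{ok}/{len(results)} commits passed"
```

Against a local clone (the path `fastrender` here is hypothetical) this would be called as `summarize(survey(last_commits("fastrender"), lambda c: cargo_check_passes("fastrender", c)))`. Passing the check in as a function keeps the survey logic testable without a Rust toolchain.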

5. paulus+0w[view] [source] 2026-01-16 17:04:21
>>embedd+(OP)
The blog[0] is worded rather conservatively but on Twitter [2] the claim is pretty obvious and the hype effect is achieved [2]

CEO stated "We built a browser with GPT-5.2 in Cursor"

instead of

"by dividing agents into planners and workers we managed to get them busy for weeks creating thousands of commits to the main branch, resolving merge conflicts along the way. The repo is 1M+ lines of code but the code does not work (yet)"

[0] https://cursor.com/blog/scaling-agents

[1] https://x.com/kimmonismus/status/2011776630440558799

[2] https://x.com/mntruell/status/2011562190286045552

[3] https://www.reddit.com/r/singularity/comments/1qd541a/ceo_of...

◧◩◪
14. emp173+Cx[view] [source] [discussion] 2026-01-16 17:11:44
>>embedd+4x
Take a look at this thread regarding the original claim: >>46624541

The top comment is indeed baseless hype without a hint of skepticism.

◧◩◪◨
16. embedd+Qz[view] [source] [discussion] 2026-01-16 17:22:02
>>emp173+Cx
The second top comment is my own (skeptical) comment, with 20 points at this moment. Thanks to those 20 people, I felt compelled to write the blog post in this submission and to ask a bit more clearly "what is going on?", since apparently there are at least 20 of us who are wondering about this.

There are also clearly a lot of other skeptical people in that submission. Also, simonw (from that top comment) told me themselves "it's not clear that what they built even runs": https://bsky.app/profile/simonwillison.net/post/3mckgw4mxoc2...

20. nindal+BA[view] [source] 2026-01-16 17:25:56
>>embedd+(OP)
The CEO said

> It's 3M+ lines of code across thousands of files. The rendering engine is from-scratch in Rust with HTML parsing, CSS cascade, layout, text shaping, paint, and a custom JS VM.

"From scratch" sounds very impressive. "custom JS VM" is as well. So let's take a look at the dependencies [1], where we find

- html5ever

- cssparser

- rquickjs

That's just servo [2], a Rust based browser initially built by Mozilla (and now maintained by Igalia [3]) but with extra steps. So this supposed "from scratch" browser is just calling out to code written by humans. And after all that it doesn't even compile! It's just plain slop.

[1] - https://github.com/wilsonzlin/fastrender/blob/main/Cargo.tom...

[2] - https://github.com/servo/servo

[3] - https://blogs.igalia.com/mrego/servo-2025-stats/

41. deng+6J[view] [source] 2026-01-16 18:02:06
>>embedd+(OP)
If you look at the original Cursor post, they say they are currently running similar experiments, for instance, this Excel clone:

https://github.com/wilson-anysphere/formula

The Actions overview is impressive: There have been 160,469 workflow runs, of which 247 succeeded. The reason the workflows are failing is because they have exceeded their spending limit. Of course, the agents couldn't care less.
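
To put those two numbers in proportion, a quick worked calculation using only the figures quoted above (the variable names are just for illustration):

```python
# Figures quoted in the comment above.
total_runs = 160_469
succeeded = 247

success_rate = succeeded / total_runs
failed = total_runs - succeeded

print(f"succeeded: {success_rate:.2%}")  # prints "succeeded: 0.15%"
print(f"failed:    {failed} runs")       # prints "failed:    160222 runs"
```

In other words, roughly one workflow run in 650 succeeded.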

◧◩◪◨⬒
56. simonw+F31[view] [source] [discussion] 2026-01-16 19:21:16
>>blibbl+RE
See comment here: https://news.ycombinator.com/item?id=46646777#46650837

I do not think you are reacting to what I said in good faith.

> he better hope he's on the right side of history here, as otherwise he will have burnt his reputation

That's something I've actually given quite a lot of thought to. My reputation and credibility matters a great deal to me. If it turns out this entire LLM thing was an over-hyped scam I'll take a very big hit to that reputation, and I'll deserve it.

(If AI rises up and tries to kill or enslave us all I'll be too busy fighting back to care.)

◧◩◪
58. nicobu+e51[view] [source] [discussion] 2026-01-16 19:28:17
>>embedd+pw
Somebody managed to get it to compile https://x.com/CanadaHonk/status/2011612084719796272

But apparently "some pages take a literal minute to load"

65. pavlov+j91[view] [source] 2026-01-16 19:45:33
>>embedd+(OP)
The comment that points out that this week-long experiment produced nothing more than a non-functional wrapper for Servo (an existing Rust browser) should be at the top:

>>46649046

◧◩
66. embedd+l91[view] [source] [discussion] 2026-01-16 19:45:36
>>Snuggl+X51
Yeah, seems latest commit does let `cargo check` successfully run. I'm gonna write an update blog post once they've made their statement, because I'm guessing they're about to say something.

Something fishy is happening in their `git log`; it doesn't seem like it was the agents who "autonomously" made things compile in the end. Notice the git username and email addresses switching around; even some commits made inside an EC2 instance managed to get in there: https://gist.github.com/embedding-shapes/d09225180ea3236f180...
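
One way to spot this kind of identity switching is to tally the author name and email recorded on each commit. This is a rough sketch, not how the linked gist was produced; the function names are made up, and the sample identities in the usage note are hypothetical. `git log --format=%an <%ae>` is the real git formatting syntax for author name and email.

```python
import subprocess
from collections import Counter


def author_lines(repo, n=100):
    """Return one 'Name <email>' line per commit for the last n commits."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "-n", str(n), "--format=%an <%ae>"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()


def tally_identities(lines):
    """Count how many commits each distinct author identity made."""
    return Counter(line.strip() for line in lines if line.strip())
```

Feeding it hypothetical lines such as `["Agent <agent@example.com>", "Agent <agent@example.com>", "ec2-user <ec2-user@host.internal>"]` yields two distinct identities, which is the kind of switch the comment describes.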

◧◩
68. leerob+yb1[view] [source] [discussion] 2026-01-16 19:55:43
>>embedd+c4
Should compile now: >>46650998

◧◩
69. leerob+Hb1[view] [source] [discussion] 2026-01-16 19:56:20
>>nindal+BA
> The JS engine used a custom JS VM being developed in vendor/ecma-rs as part of the browser, which is a copy of my personal JS parser project vendored to make it easier to commit to.

>>46650998

◧◩
74. svieir+we1[view] [source] [discussion] 2026-01-16 20:10:13
>>ryanis+Gd1
> What they all seem to be just glossing over is how the project unfolded: without human intervention, using computers, in an exceptionally accelerated time frame, working 24hr/day.

Correct, but Gas Town [1] already happened and what's more _actually worked_, so this experiment is both useless (because it doesn't demonstrate working software) _and_ derivative (because we've already seen that you can set up a project where with spend similar to the spend of a single developer you can churn out more code than any human could read in a week).

[1]: https://github.com/steveyegge/gastown

◧◩◪
75. embedd+ze1[view] [source] [discussion] 2026-01-16 20:10:32
>>leerob+yb1
> Yeah, seems latest commit does let `cargo check` successfully run. I'm gonna write an update blog post once they've made their statement, because I'm guessing they're about to say something.

> Something fishy is happening in their `git log`, it doesn't seem like it was the agents who "autonomously" actually made things compile in the end. Notice the git username and email addresses switching around, even a commit made inside an EC2 instance managed to get in there: https://gist.github.com/embedding-shapes/d09225180ea3236f180...

Gonna need to look closer into it when I have time, but seems they manually patched it up in the end, so the original claim still doesn't stand :/

◧◩◪
111. benhoy+Ru1[view] [source] [discussion] 2026-01-16 21:35:44
>>pera+gl1
Not me personally, but a GitHub user wrote a replacement for Go's regexp library that was "up to 3-3000x+ faster than stdlib": https://github.com/coregx/coregex ... at first I was impressed, so started testing it and reporting bugs, but as soon as I ran my own benchmarks, it all fell apart (https://github.com/coregx/coregex/issues/29). After some mostly-bot updates, that issue was closed. But someone else opened a very similar one recently (https://github.com/coregx/coregex/issues/79) -- same deal, "actually, it's slower than the stdlib in my tests". Basically AI slop with poor tests, poor benchmarks, and way oversold. How he's positioning these projects is the problematic bit, I reckon, not the use of AI.

Same user did a similar thing by creating an AWK interpreter written in Go using LLMs: https://github.com/kolkov/uawk -- as the creator of (I think?) the only AWK interpreter written in Go (https://github.com/benhoyt/goawk), I was curious. It turns out that if there's only one item in the training data (GoAWK), AI likes to copy and paste freely from the original. But again, it's poorly tested and poorly benchmarked.

I just don't see how one can get quality like this, without being realistic about code review, testing, and benchmarking.

◧◩◪◨⬒⬓
112. logica+5v1[view] [source] [discussion] 2026-01-16 21:36:34
>>nyeah+MI
It's implied by the fact that early in the post they say:

>"To test this system, we pointed it at an ambitious goal: building a web browser from scratch."

and then near the end, they say:

>"Hundreds of agents can work together on a single codebase for weeks, making real progress on ambitious projects."

This means they only make progress toward it, but do not "build a web browser from scratch".

If you're curious, the State of Utopia (will be available at https://stateofutopia.com ) did build a web browser from scratch, though it used several packages for the networking portion of it.

See my other comments and posts for links.

◧◩◪
113. gorkae+Xv1[view] [source] [discussion] 2026-01-16 21:40:52
>>pera+gl1
I think it's fair enough to consider porting a subset of rewriting, in which case there are several successful experiments out there:

- JustHTML [1], which in practice [2] is a port of html5ever [3] to Python.

- justjshtml, which is a port of JustHTML to JavaScript :D [4].

- MiniJinja [5] was recently ported to Go [6].

All three projects have one thing in common: comprehensive test suites which were used to guardrail and guide AI.

References:

1. https://github.com/EmilStenstrom/justhtml

2. https://friendlybit.com/python/writing-justhtml-with-coding-...

3. https://github.com/servo/html5ever

4. https://simonwillison.net/2025/Dec/15/porting-justhtml/

5. https://github.com/mitsuhiko/minijinja

6. https://lucumr.pocoo.org/2026/1/14/minijinja-go-port/

◧◩
130. AstroB+MC1[view] [source] [discussion] 2026-01-16 22:24:31
>>pavlov+j91
Apparently this person actually got it to compile: https://xcancel.com/CanadaHonk/status/2011612084719796272#m

◧◩◪
138. observ+oG1[view] [source] [discussion] 2026-01-16 22:46:58
>>AstroB+MC1
https://x.com/CanadaHonk/status/2011612084719796272 as well.

I went through the motions. There are various points in the repo history where compilation is possible, but it's obscure. They got it to compile and operate prior to the article, but several of the PRs since that point broke everything, and this guy went through the effort of fixing it. I'm pretty sure you can just identify the last working commit and pull the version from there, but working out when looks like a big pain in the butt for a proof of concept.

◧◩◪◨
140. embedd+CJ1[view] [source] [discussion] 2026-01-16 23:11:46
>>observ+oG1
> but several of the PRs since that point broke everything, and this guy went through the effort of fixing it. I'm pretty sure you can just identify the last working commit and pull the version from there, but working out when looks like a big pain in the butt for a proof of concept.

I went through the last 100 commits (>>46647037 ) and nothing there was working (yet/since). Seems now after a developer corrected something it managed to pass `cargo check` without errors, since commit 526e0846151b47cc9f4fcedcc1aeee3cca5792c1 (Jan 16 02:15:02 2026 -0800)

◧◩◪
158. afishh+BT1[view] [source] [discussion] 2026-01-17 00:30:56
>>wmf+Oo1
It seemingly did but after I saw it define a VerticalAlign twice in different files[1][2][3] I concluded that it's probably not coherent enough to waste time on checking the correctness.

Would be interesting if someone who has managed to run it tries it on some actually complicated text layout edge cases (like RTL breaking that splits a ligature necessitating re-shaping, also add some right-padding in there to spice things up).

[1] https://github.com/wilsonzlin/fastrender/blob/main/src/layou...

[2] https://github.com/wilsonzlin/fastrender/blob/main/src/layou...

[3] Neither being the right place for defining a struct that should go into computed style imo.
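
A crude way to look for this sort of duplication across a source tree is to collect every `struct` or `enum` name per file and report names that appear in more than one file. The sketch below is an illustration only: the regex is a simplification (it would also match definitions inside comments or strings), and the file names in the usage note are hypothetical, not paths from the repo.

```python
import re
from collections import defaultdict

# Matches `struct Name` / `enum Name`, optionally preceded by `pub` or `pub(...)`.
TYPE_DEF = re.compile(
    r"^\s*(?:pub(?:\([^)]*\))?\s+)?(?:struct|enum)\s+(\w+)", re.MULTILINE
)


def duplicate_types(sources):
    """Given {file path: Rust source text}, return names defined in 2+ files."""
    seen = defaultdict(set)
    for path, text in sources.items():
        for name in TYPE_DEF.findall(text):
            seen[name].add(path)
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}
```

For example, `duplicate_types({"a.rs": "pub enum VerticalAlign { Top }", "b.rs": "pub enum VerticalAlign { Baseline }"})` reports `VerticalAlign` as defined in both files.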

◧◩◪◨
172. MrJohz+A32[view] [source] [discussion] 2026-01-17 02:22:14
>>gorkae+Xv1
Note that it's not clear that any of the JustHTML ports were actually ports per se, as in the end they all ended up with very different implementations. Instead, it might just be that an LLM generated roughly the same library several different times.

See https://felix.dognebula.com/art/html-parsers-in-portland.htm...

194. wilson+ri2[view] [source] 2026-01-17 05:37:53
>>embedd+(OP)
Hey, Wilson here, author of the blog post and the engineer working on this project. I've been reading the responses here and appreciate the feedback. I've posted some follow up context on Twitter/X[0], which I'll also write here:

The repo is a live incubator for the harness. We are actively researching the behavior of collaborative long running agents, and may in the future make the browser and other products this research produces more consumable by end users and developers, but it's not the goal for now. We made it public as we were excited by the early results and wanted to share; while far off from feature parity with the most popular production browsers today, we think it has made impressive progress in the last <1 week of wall time.

Given the interest in trying out the current state of the project, I've merged a more up-to-date snapshot of the system's progress that resolves issues with builds and CI. The experimental harness can occasionally leave the repo in an incomplete state but does converge, which was the case at the time of the post.

I'm here to answer any further questions you have.

[0] https://x.com/wilsonzlin/status/2012398625394221537?s=20

◧◩
199. potami+Yj2[view] [source] [discussion] 2026-01-17 05:58:40
>>Matthy+BJ
Check out the list of all CSS specifications [1], and then open any one of them and see how lengthy and elaborate each is. Then do the same for each version of the spec published over the last thirty years. Before you can start, you must read and understand all of this at a great level of depth. Still, specifications never tell the complete story. You must be aware of all the nuances that are implied by each requirement in the spec and know how to handle the zillion corner cases that will crop up inevitably.

And this is just one part. Not even considering the fully sandboxed, mini operating system for running webapps.

[1] https://www.w3.org/Style/CSS/specs.en.html

◧◩◪◨⬒
210. supriy+Oo2[view] [source] [discussion] 2026-01-17 07:18:19
>>dragon+Zh2
Reminds me of https://xkcd.com/870/

◧◩◪
216. oefrha+Zt2[view] [source] [discussion] 2026-01-17 08:41:04
>>M4v3R+Do2
Yeah there's more to a browser than a couple of out-of-tree servo components, otherwise https://github.com/servo/servo wouldn't have 300k+ lines of Rust code, 400k+ if you count comments and blanks (I cloned the repo, nuked the tests directory, then did a count).
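
The gap between "300k+" and "400k+" comes down to what counts as a line. The commenter doesn't say which tool they used; the sketch below is a deliberately simplified counter that only recognizes `//` line comments (dedicated tools such as cloc or tokei also handle block comments and strings). The function names and the `skip` parameter are assumptions for illustration.

```python
from pathlib import Path


def classify(line):
    """Label a single source line as 'blank', 'comment', or 'code'."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    if stripped.startswith("//"):
        return "comment"
    return "code"


def count_lines(text):
    """Count code, comment, and blank lines in one file's text."""
    counts = {"code": 0, "comment": 0, "blank": 0}
    for line in text.splitlines():
        counts[classify(line)] += 1
    return counts


def count_tree(root, skip=("tests",)):
    """Sum the counts over every .rs file under root, skipping named directories."""
    totals = {"code": 0, "comment": 0, "blank": 0}
    for path in Path(root).rglob("*.rs"):
        if any(part in skip for part in path.relative_to(root).parts):
            continue
        for kind, n in count_lines(path.read_text(errors="replace")).items():
            totals[kind] += n
    return totals
```

Summing all three buckets gives the "comments and blanks included" figure; the `code` bucket alone gives the smaller one.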

Plus that linked comment doesn't even say it's "nothing more than a non-functional wrapper for Servo". It disputes the "from scratch" claim.

Most people aren't interested in a nuanced take though. Someone said something plausible sounding and was voted to top by other people? Good enough for me, have another vote. Then twist and exaggerate a little and post it to another comment section. Get more votes. Rinse and repeat.

◧◩
217. wilson+Bu2[view] [source] [discussion] 2026-01-17 08:46:43
>>nindal+BA
Thanks for the feedback. I've addressed similar feedback at [0] and provided some more context at [1].

I do want to briefly note that the JS VM is custom and not QuickJS. It also implemented subsystems like the DOM, CSS cascade, inline/block/table layouts, paint systems, text pipeline, and chrome, and I'd push back against the assertion that it merely calls out to external code. I addressed these points in more detail at [0].

[0] >>46650998 [1] >>46655608

◧◩
220. wilson+2w2[view] [source] [discussion] 2026-01-17 09:00:56
>>pavlov+j91
I've responded to this claim in more detail at [0], with additional context at [1].

Briefly, the project implemented substantial components, including a JS VM, DOM, CSS cascade, inline/block/table layout, paint systems, text pipeline, and chrome, and is not merely a Servo wrapper.

[0] >>46650998

[1] >>46655608

◧◩◪
228. pera+Lz2[view] [source] [discussion] 2026-01-17 09:39:50
>>wilson+2w2
Just for context, this was the original claim by Cursor's CEO on Twitter:

> We built a browser with GPT-5.2 in Cursor. It ran uninterrupted for one week.

> It's 3M+ lines of code across thousands of files. The rendering engine is from-scratch in Rust with HTML parsing, CSS cascade, layout, text shaping, paint, and a custom JS VM.

> It kind of works! It still has issues and is of course very far from Webkit/Chromium parity, but we were astonished that simple websites render quickly and largely correctly.

https://xcancel.com/mntruell/status/2011562190286045552#m

◧◩◪
238. nindal+pD2[view] [source] [discussion] 2026-01-17 10:29:02
>>wilson+Bu2
> I do want to briefly note that the JS VM is custom and not QuickJS

It's hard to verify because your project didn't actually compile. But now that you've fixed the compilation manually, can you demonstrate the javascript actually executing? Some of the people who got the slop compiling claimed credibly that it isn't executing any JavaScript.

You merely have to compile your code, run the binary and open this page - http://acid3.acidtests.org. Feel free to post a video of yourself doing this. Try to avoid the embellishment that has characterised this effort so far.

◧◩◪◨
242. Snuggl+3G2[view] [source] [discussion] 2026-01-17 11:07:29
>>nindal+pD2
This is from the "official" build - https://imgur.com/fqGLjSA

The "in progress" build has a slightly different rendering but the same result

◧◩
247. RandyO+WI2[view] [source] [discussion] 2026-01-17 11:42:58
>>wilson+ri2
Hi, there. Two questions about this repo [0].

Can you show us what you did after people failed to compile that project [1]?

There are also questions about the attribution of these commits [2]. Can you share some information?

[0] https://github.com/wilsonzlin/fastrender [1] https://github.com/wilsonzlin/fastrender/issues/98 [2] https://gist.github.com/embedding-shapes/d09225180ea3236f180...

◧◩◪◨⬒
251. embedd+3L2[view] [source] [discussion] 2026-01-17 12:05:01
>>felipe+hC2
> As discussed elsewhere, it is apparently possible to compile and run this particular project.

After a human stepped in to fix it, yes. You can see it yourself here: https://github.com/wilsonzlin/fastrender/issues/98

> Nevertheless, IMHO what’s interesting about this is not the browser itself but rather that AI companies (not just Cursor) are building systems where humans can be out of the loop for days or weeks.

But that's not what they demonstrated here. What they demonstrated, so far, is that you can let agents write millions of lines of code, and eventually, if you actually need to run it, some human needs to "merge the latest snapshot" or do some other management to actually put the system together into a workable state.

Very different from what their original claims were.

◧◩◪◨⬒
253. DonHop+xL2[view] [source] [discussion] 2026-01-17 12:10:08
>>dragon+Zh2
3000x Faster Optimized Random Number Generator: https://xkcd.com/221/

◧◩◪◨⬒⬓⬔⧯
255. DonHop+1N2[view] [source] [discussion] 2026-01-17 12:25:39
>>teifer+Ey2
And how is that not good for humanity in an evolutionary sense (as long as it doesn't kill or maim anyone else)?

Tesla owner keeps using Autopilot from backseat—even after being arrested:

https://mashable.com/article/tesla-autopilot-arrest-driving-...

◧◩◪◨⬒⬓⬔
257. DonHop+eO2[view] [source] [discussion] 2026-01-17 12:36:31
>>moregr+vW1
Nice blog post, gp serial entrepreneur founder bro -- what did your investors think of that?

http://www.mickdarling.com/2019/07/26/busy-summer/

  An embedded page at landr-atlas.com says:

  Attention!

  MacOS Security Center has identified that your system is under threat. 
  Please scan your MacOS as soon as possible to avoid more damage.
  Don't leave this page until you have undertaken all the suggested steps 
  by authorised Antivirus.

  [OK]

◧◩◪◨⬒⬓⬔
260. horsaw+qS2[view] [source] [discussion] 2026-01-17 13:19:20
>>drawfl+wN2
Mmm, as someone forced to write a lot of last minute demos for a startup right out of school that ended up raising ~100MM, there's a fair bit of wiggle room in "Functional".

Not that I would excuse Cursor if they're fudging this either - My opinion is that a large part of the growing skepticism and general disillusionment that permeates among engineers in the industry (ex - the jokes about exiting tech to be a farmer or carpenter, or things like https://imgur.com/6wbgy2L) comes from seeing first hand that being misleading, abusive, or outright lying are often rewarded quite well, and it's not a particularly new phenomenon.

266. utopia+533[view] [source] 2026-01-17 14:50:53
>>embedd+(OP)
That's kind of hilarious (...ly sad) to read knowing that I have on my desk https://browser.engineering so I literally went the opposite direction some months ago.

Not only did I actually build a Web browser myself, from scratch (OK, of course with a working OS, Python, and its libraries ;) but mine did work! And it took me, what, a few hours, maybe a few days adding it all together. Not only did it work (namely, I browsed my own website with it), but I had fun with it (!), I learned quite a bit (including the provable fact that I can indeed build a Web browser, woohoo!), and finally I did it on... I want to say a few kilowatts at most, including my computer (obviously) but also myself and the food I ate along the way.

So... to each their own ¯\_(ツ)_/¯

◧◩◪
280. nindal+fj3[view] [source] [discussion] 2026-01-17 16:48:58
>>wilson+2w2
You're claiming that the JS VM was implemented. Is it actually running? Because this screenshot shows that the ACID3 benchmark is requesting that you enable JavaScript (https://imgur.com/fqGLjSA). Why don't you upload a video of you loading this page?

Your slop is worthless except to convince gullible investors to give you more money.

◧◩◪
282. Snuggl+oo3[view] [source] [discussion] 2026-01-17 17:21:49
>>embedd+iL2
I've watched them today work in the new repo - https://github.com/wilson-anysphere/fastrender/tree/main , adding another 50k lines trying to optimize scroll/rendering performance (spoiler: not really)

At this point, it's 1.5M LOC without the vendored crates (so basically excluding the JS engine etc.). If you compare that to Servo/Ladybird, which are 300k LOC each and actually happen to work, agents do love slinging slop.

◧◩◪
290. pera+dL3[view] [source] [discussion] 2026-01-17 19:31:36
>>M4v3R+Do2
"Borrow" is an interesting choice of word, see for example this:

    /// The quirks mode of the document.
    #[inline]
    pub fn quirks_mode(&self) -> QuirksMode {
        self.quirks_mode
    }

https://github.com/wilsonzlin/fastrender/blob/3e5bc78b075645...

And then this:

    /// The quirks mode of the document.
    pub fn quirks_mode(&self) -> QuirksMode {
        self.stylist.quirks_mode()
    }

https://github.com/servo/stylo/blob/71737ad5c8b29c143a6c992a...

It seems ChatGPT is still copying segments of code almost verbatim, although sometimes it does weird things, compare these for example:

https://github.com/wilsonzlin/fastrender/blob/3e5bc78b075645...

https://github.com/servo/stylo/blob/71737ad5c8b29c143a6c992a...

◧◩◪◨
292. Snuggl+cO3[view] [source] [discussion] 2026-01-17 19:52:47
>>pera+dL3
Well, could it be because it was instructed to kinda "study" Servo?

https://github.com/wilsonzlin/fastrender/blob/3e5bc78b075645...

◧◩◪
294. nindal+FO3[view] [source] [discussion] 2026-01-17 19:56:42
>>M4v3R+Do2
In your hurry to defend this slop you didn't do your due diligence. You know that 1 million LoC JS VM? Yeah, it isn't actually running - https://imgur.com/fqGLjSA. And you can tell this is actually the case because it's been brought up a few times on this thread and that guy has ducked around it.

303. simonw+pW6[view] [source] 2026-01-18 23:56:03
>>embedd+(OP)
They fixed FastRender so that CI passes and added build instructions to the README. I've tried it and it works surprisingly well - screenshots here: https://gist.github.com/simonw/53a725811db8e34f4f99226e8f456...