It'd make sense to rename WebDev Arena to React/Tailwind Arena. Its system prompt requires [1] those technologies and the entire tool breaks when requesting vanilla JS or other frameworks. The second-order implications of models competing on this narrow definition of webdev are rather troublesome.
[1] https://blog.lmarena.ai/blog/2025/webdev-arena/#:~:text=PROM...
Don't get me started on how ugly the HTML becomes when most tags carry 20 f*cking classes that could have been two.
In typical production environments, Tailwind is only around 10kB [1].
[1]: https://v3.tailwindcss.com/docs/optimizing-for-production
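That small size comes from Tailwind v3 scanning your template files and emitting only the utility classes it actually finds. A minimal config sketch (the paths here are placeholders, not a recommendation):

```javascript
// tailwind.config.js (Tailwind v3) — minimal sketch; glob paths are placeholders.
module.exports = {
  // Only class names found in these files end up in the final CSS bundle.
  content: ["./src/**/*.{html,js,jsx,tsx}"],
  theme: { extend: {} },
  plugins: [],
};
```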
To me it seems strange that a few good language designers and ML folks haven't grouped together to work on this.
It's clear there is space for some LLM meta-language designed to compile to bytecode, binary, JS, etc.
It also doesn't need to be textual like the code we write; it could be some form of AST that a model like Llama can manipulate with ease.
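A toy illustration of the idea, with node shapes invented purely for this sketch: a model could emit a JSON tree instead of raw source text, and a tiny compiler could lower it to JS (or any other target).

```javascript
// Toy sketch: a minimal JSON "AST" an LLM could emit instead of text,
// plus a tiny compiler lowering it to JavaScript source.
// The node shapes ("num", "var", "add", "call") are invented for illustration.
function compile(node) {
  switch (node.type) {
    case "num":  return String(node.value);
    case "var":  return node.name;
    case "add":  return `(${compile(node.left)} + ${compile(node.right)})`;
    case "call": return `${node.fn}(${node.args.map(compile).join(", ")})`;
    default:     throw new Error(`unknown node type: ${node.type}`);
  }
}

// A model manipulating this tree only has to produce valid JSON,
// not syntactically valid program text:
const ast = {
  type: "call",
  fn: "console.log",
  args: [{
    type: "add",
    left:  { type: "num", value: 2 },
    right: { type: "var", name: "x" },
  }],
};

console.log(compile(ast)); // → console.log((2 + x))
```

The same tree could just as well be lowered to bytecode or another language by swapping the `compile` backend.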
Instead of learnable, stable APIs for common components, with well-established versioning and well-defined tokens, we've got people literally copying and pasting components and applying diffs so they can claim they "own" them.
Except the vast majority of them never change a line and just end up with a strictly worse version of a normal package (typically out of date, or a hodgepodge of "versions" because they don't want to figure out diffs). And the few that do make changes rarely have the design sense to be using shadcn, since there aren't enough tokens to keep the look and feel consistent across components.
The would-be 1% who would change it and have their own well-thought-out design systems don't get a lift from shadcn either, versus just starting with Radix directly.
-
Amazing spin job though with the "registry" idea too: "it's actually very good for AI that we invented a parallel distribution system for ad-hoc components with no standard except a loose convention around sticking stuff in a folder called ui"
Plenty of training data to go on, I'd imagine.
Funnily, the training of these models seems to have been cut off mid-way through the Tailwind v3/v4 transition, and Gemini always tries to correct my "mistakes" (… use v3 instead of v4).
Who will write the useful training data without LLMs? I feel we are getting fewer and fewer new things. Changes will be smaller and more incremental.
It seems no different in kind to me than image or audio generation.