zlacker

[return to "Zebra-Llama: Towards Efficient Hybrid Models"]
1. aditya+79 2025-12-06 21:45:45
>>mirrir+(OP)
Due to perverse incentives and the long history of models over-claiming accuracy, it's very hard to believe anything until it's open source and can be tested out.

That being said, I do very much believe that the computational efficiency of models is going to go up drastically over the coming months, which does pose interesting questions about Nvidia's throne.

2. ACCoun+ff 2025-12-06 22:40:19
>>aditya+79
I don't doubt the increase in efficiency. I doubt the "drastically".

We already see models become more and more capable per weight and per unit of compute. I don't expect a state-change breakthrough. I expect: more of the same. A SOTA 30B model from 2026 is going to be ~30% better than one from 2025.

Now, expecting that to hurt Nvidia? Delusional.

No one is going to stop and say "oh wow, we got more inference efficiency - now we're going to use less compute". A lot of people are going to say "now we can use larger and more powerful models for the same price" or "with cheaper inference for the same quality, we can afford to use more inference".

3. colech+xg 2025-12-06 22:51:28
>>ACCoun+ff
Eh.

Right now, Claude is good enough. If LLM development hit a magical wall and never got any better, Claude would still be good enough to be terrifically useful, and there are diminishing returns on how much more good we get out of pushing it further up $benchmark.

Say we're satisfied with that... how many years until efficiency gains from one side and consumer hardware from the other meet in the middle, so that "good enough for everybody" open models are available to anyone willing to pay for a $4000 MacBook (and after another couple of years a $1000 MacBook, and several more years a fancy wristwatch)?

Point being, unless we get to a point where we start developing "models" that deserve civil rights and citizenship, the days are numbered for NEEDING cloud infrastructure and datacenters full of racks and racks of $x0,000 hardware.

I strongly believe the top end of the S curve is nigh, and with it we're going to see these trillion-dollar ambitions crumble. Everybody is going to want a big-ass GPU and a ton of RAM, but that's going to quickly become boring, because open models are going to exist that eat everybody's lunch, and the trillion-dollar companies trying to beat them with a premium product aren't going to stack up outside of niche cases and more ordinary cloud-compute needs.

4. buu700+iv 2025-12-07 00:56:41
>>colech+xg
Coding capability in and of itself may be "good enough" or close to it, but there's a long way to go before AI can build and operate a product end-to-end. In fairness, a lot of the gap may be tooling.

But the end state in my mind is telling an AI "build me XYZ", having it ask all the important questions over the course of a 30-minute chat while making reasonable decisions on all lower-level issues, then waking up the next morning to a live cloud-hosted test environment at a subdomain of the domain it said it would buy along with test builds of native apps for Android, iOS, Linux, macOS, and Windows, all with near-100% automated test coverage and passing tests. Coding agents feel like magic, but we're clearly not there yet.

And that's just coding. If someone wanted to generate a high-quality custom feature-length movie within the usage limits of a $20/mo AI plan, they'd be sorely disappointed.
