zlacker

[return to "Qwen3-Coder-Next"]
1. simonw+l3[view] [source] 2026-02-03 16:15:21
>>daniel+(OP)
This GGUF is 48.4GB - https://huggingface.co/Qwen/Qwen3-Coder-Next-GGUF/tree/main/... - which should be usable on higher-end laptops.

I still haven't experienced a local model that fits on my 64GB MacBook Pro and can run a coding agent like Codex CLI or Claude Code well enough to be useful.

Maybe this will be the one? This Unsloth guide from a sibling comment suggests it might be: https://unsloth.ai/docs/models/qwen3-coder-next
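
If you want to poke at it yourself, this is roughly how I'd load a GGUF like this with llama-cpp-python - a minimal, untested sketch, and the filename, quant and context size below are placeholders rather than the actual file from that repo:

    # Minimal sketch: load a local quantized GGUF and ask for a small coding task.
    # model_path is a hypothetical local filename, not the real checkpoint name.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./qwen3-coder-next-q4_k_m.gguf",  # placeholder path
        n_ctx=32768,       # coding agents want long context; trade this off against RAM
        n_gpu_layers=-1,   # offload all layers to Metal on Apple Silicon
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Write a Python function that parses a .env file into a dict."}],
        max_tokens=512,
    )
    print(out["choices"][0]["message"]["content"])

The open question for me is whether that leaves enough headroom on a 64GB machine once an agent starts filling the context window.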

◧◩
2. segmon+mt[view] [source] 2026-02-03 17:58:34
>>simonw+l3
You do realize Claude Opus/GPT-5 are probably something like 1000B-2000B models? So expecting a model that's < 60B to offer the same level of performance would be a miracle...
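
Just for scale, a back-of-the-envelope sketch (the parameter counts and bits-per-weight are my own rough guesses, not published specs, and real GGUF files carry extra overhead):

    # Rough quantized footprint: params (in billions) * bits per weight / 8 ~= GB.
    def rough_size_gb(params_b: float, bits_per_weight: float = 4.5) -> float:
        return params_b * bits_per_weight / 8  # billions of params * bytes per param = GB

    print(rough_size_gb(2000))  # ~1125 GB: a hypothetical ~2T-param frontier model is data-center territory
    print(rough_size_gb(60))    # ~34 GB: a ~60B model at ~4.5 bpw lands in 64GB-laptop range

The 4.5 bits-per-weight default is just a typical 4-bit-ish quant level; swap in other values to see how the quant choice moves the footprint.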
◧◩◪
3. jrop+Gw[view] [source] 2026-02-03 18:10:34
>>segmon+mt
I don't buy this. I've long wondered whether the larger models, while exhibiting more useful knowledge, aren't wasteful as we greedily push the frontier of "bigger is getting us better results, so make it bigger". Qwen3-Coder-Next seems like a point in favor of that thought: we need to spend some time exploring what smaller models are capable of.

Perhaps I'm grossly wrong -- I guess time will tell.

◧◩◪◨
4. bityar+zK[view] [source] 2026-02-03 19:02:44
>>jrop+Gw
You are not wrong: small models can be trained for niche use cases, and there are lots of people and companies doing that. The problem is that you need one of those for each use case, whereas the bigger models can cover a larger problem space.

There is also the counter-intuitive phenomenon where training a model on a wider variety of content than the task apparently requires somehow makes it better. For example, models trained only on English content exhibit measurably worse performance at writing sensible English than models trained on a handful of languages, even when controlling for the size of the training set. It doesn't make sense to me, but it probably does to credentialed AI researchers who know what's going on under the hood.

◧◩◪◨⬒
5. sally_+pP3[view] [source] 2026-02-04 15:59:20
>>bityar+zK
Cool, I didn't know about this phenomenon. Reading up a little, it seems like multilingual training forces the model to optimize its internal "conceptual layer" weights better instead of relying solely on English linguistics. Papers also mention issues arising from overdoing it, so my guess is that even credentialed AI researchers are currently limited to empirical methods here.