zlacker

[return to "Qwen3-Coder-Next"]
1. simonw+l3 | 2026-02-03 16:15:21
>>daniel+(OP)
This GGUF is 48.4GB - https://huggingface.co/Qwen/Qwen3-Coder-Next-GGUF/tree/main/... - which should be usable on higher end laptops.

I still haven't experienced a local model that fits on my 64GB MacBook Pro and can run a coding agent like Codex CLI or Claude Code well enough to be useful.

Maybe this will be the one? This Unsloth guide from a sibling comment suggests it might be: https://unsloth.ai/docs/models/qwen3-coder-next

2. dehrma+Rg | 2026-02-03 17:09:35
>>simonw+l3
I wonder if the future in ~5 years is almost all local models? High-end computers and GPUs can already run decent models, though not SOTA ones. Five years is enough time to ramp up memory production, for consumers to level up their hardware, and for models to be optimized down to lower-end hardware while still being really good.
3. johnsm+CO | 2026-02-03 19:19:49
>>dehrma+Rg
Open-source and local models will always lag heavily behind the frontier.

Who pays for a free model? GPU training isn't free!

I remember early on people saying 100B+ models would be running on your phone by around now. They were completely wrong, and I don't think that's ever really going to change.

People will always want the fastest, best, easiest-to-set-up option.

"Good enough" massively changes when your marketing team is managing k8s clusters with frontier systems in the near future.

4. torgin+Im1 | 2026-02-03 22:02:34
>>johnsm+CO
I don't know about frontier, but nowadays I code a lot using Opus 4.5, in a way where I instruct it to do something (like a complex refactor etc.) - I like that it's really good at actually doing what it's told, and only occasionally do I have to fight it when it goes off the rails. It also doesn't hallucinate all that much in my experience (I'm writing JS, YMMV with other languages), and it's good at spotting dumb mistakes.

That said, I'm not sure whether this capability is only achievable in huge frontier models; I would be perfectly content using a model that can do this (acting as a force multiplier) and not much else.
