Qwen3-Coder-Next
1. daniel+11 2026-02-03 16:06:12
>>daniel+(OP)
For those interested, made some Dynamic Unsloth GGUFs for local deployment at https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF and made a guide on using Claude Code / Codex locally: https://unsloth.ai/docs/models/qwen3-coder-next
◧◩
2. ranger+n6 2026-02-03 16:27:45
>>daniel+11
What is the difference between the UD and non-UD files?
◧◩◪
3. daniel+17 2026-02-03 16:29:55
>>ranger+n6
UD stands for "Unsloth-Dynamic", which upcasts important layers to higher bit widths. Non-UD files are just standard llama.cpp quants. Both still use our calibration dataset.
◧◩◪◨
4. Camper+1k 2026-02-03 17:24:32
>>daniel+17
Please consider authoring a single, straightforward introductory-level page somewhere that explains what all the filename components mean, and who should use which variants.

The green/yellow/red indicators for different levels of hardware support are really helpful, but far from enough IMO.

◧◩◪◨⬒
5. daniel+Bo 2026-02-03 17:40:51
>>Camper+1k
Oh good idea! UD-Q4_K_XL (Unsloth Dynamic 4-bit, Extra Large) is what I generally recommend for most hardware - MXFP4_MOE is also OK.
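
A minimal sketch of how a tag like `UD-Q4_K_XL` could be split into its parts, assuming the naming pattern `[UD-]Q<bits>[_K][_<size>]` seen in this thread. The names `QuantTag` and `parse_quant_tag` are hypothetical, not an Unsloth or llama.cpp API, and tags outside that pattern (for example `MXFP4_MOE`) are simply reported as unrecognized.

```python
import re
from typing import NamedTuple, Optional


class QuantTag(NamedTuple):
    dynamic: bool        # True when the tag starts with "UD-" (Unsloth Dynamic)
    bits: int            # nominal bit width taken from the "Q<bits>" part
    size: Optional[str]  # size suffix such as "XL" (Extra Large), if present


# Assumed pattern: optional "UD-" prefix, "Q<bits>", optional "_K", optional size suffix.
_TAG = re.compile(r"^(?P<ud>UD-)?Q(?P<bits>\d+)(?:_K)?(?:_(?P<size>XL|L|M|S))?$")


def parse_quant_tag(tag: str) -> Optional[QuantTag]:
    """Return the parsed parts of a quant tag, or None if the tag does not fit the assumed pattern."""
    m = _TAG.match(tag)
    if m is None:
        return None
    return QuantTag(bool(m.group("ud")), int(m.group("bits")), m.group("size"))
```

For example, `parse_quant_tag("UD-Q4_K_XL")` gives `dynamic=True`, `bits=4`, `size="XL"`, while `parse_quant_tag("Q6")` gives a non-dynamic 6-bit tag with no size suffix.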
◧◩◪◨⬒⬓
6. Keats+KO 2026-02-03 19:20:08
>>daniel+Bo
Is there some indication of how the different quantization bit widths affect performance? I.e. I have a 5090 + 96GB, so I want to get the best possible model, but I don't care about getting 2% better perf if I only get 5 tok/s.
◧◩◪◨⬒⬓⬔
7. mirekr+D31 2026-02-03 20:24:55
>>Keats+KO
It takes download time + 1 minute to test speed yourself, and you can try different quants. It's hard to write down a table because the result depends on your system (RAM clock etc.) once the model no longer fits entirely on the GPU.

I guess it would make sense to have something like a table of the max context size/quants that fit fully on common configs: single GPU, dual GPUs, unified RAM on a Mac, etc.
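
A rough sketch of the fit check such a table would encode, assuming the weights take about parameters × bits per weight / 8 bytes. It ignores the KV cache, which grows with context size, so it is only a first filter. The function names and the default headroom are illustrative, and any parameter count or bits-per-weight you pass in is your own estimate, not a measured value for a particular file.

```python
def estimated_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GiB: params * bits / 8 bytes, converted to GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def fits(n_params: float, bits_per_weight: float, memory_gib: float,
         headroom_gib: float = 2.0) -> bool:
    """True if the estimated weights plus a fixed headroom fit in the given memory.

    headroom_gib is an assumed allowance for runtime buffers; it does not model
    the KV cache, so long contexts need more than this check suggests.
    """
    return estimated_weight_gib(n_params, bits_per_weight) + headroom_gib <= memory_gib
```

Calling `fits(n_params, bits_per_weight, memory_gib)` once with GPU memory and once with GPU plus system memory gives a quick idea of whether a quant would run fully on the GPU or have to spill into RAM.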

◧◩◪◨⬒⬓⬔⧯
8. Keats+2c1 2026-02-03 21:06:00
>>mirekr+D31
Testing speed is easy, yes. I'm mostly wondering about the quality difference between, for example, Q6 and Q8_K_XL.