A 3B active-parameter MoE allows absolutely huge savings on inference costs. I use a cloud provider for models too large to run locally, and I can’t wait for them to support qwen3-coder-next, hopefully in a few days.
So much expensive inference is provided free or at large discounts - that craziness should end.