I still haven't found a local model that fits on my 64GB MacBook Pro and can run a coding agent like Codex CLI or Claude Code well enough to be useful.
Maybe this will be the one? This Unsloth guide from a sibling comment suggests it might be: https://unsloth.ai/docs/models/qwen3-coder-next
In a day or two I'll release my answer to this problem. But, I'm curious, have you had a different experience where tool use works in one of these CLIs with a small local model?
Anyway, maybe I should try some other models. The ones that haven't worked for tool calling, for me, are:
Llama3.1
Llama3.2
Qwen2.5-coder
Qwen3-coder
All of these at 7B, 8B, or sometimes (painfully) 30B.
I should also note that I'm typically running these through Ollama. Maybe LM Studio or llama.cpp somehow improve on this?
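For anyone who wants to reproduce my tool-calling failures outside a full CLI agent, here's a minimal sketch of the kind of smoke test I mean, against Ollama's chat API. The model name is just an example (use whatever you've pulled); the checker assumes the dict-shaped response from Ollama's REST `/api/chat` endpoint, and the live call is left commented since it needs a running server.

```python
# Smoke test: does a local model emit a tool call at all?
# Tool schema is the OpenAI-style format that Ollama's /api/chat accepts.

def make_weather_tool():
    """Build a simple function-tool schema for the model to call."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }

def called_tool(response, name):
    """True if a dict-shaped chat response contains a call to `name`."""
    calls = response.get("message", {}).get("tool_calls") or []
    return any(c.get("function", {}).get("name") == name for c in calls)

# Live check -- requires a running Ollama server; model name is an example:
# import ollama
# resp = ollama.chat(
#     model="qwen2.5-coder:7b",
#     messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
#     tools=[make_weather_tool()],
# )
# print(called_tool(dict(resp), "get_weather"))
```

In my experience the small models either ignore the tools entirely or hallucinate malformed calls, which is exactly what breaks the agent CLIs.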