Claude Code: connect to a local model when your quota runs out

>>fugu2+(OP)
Useful tip.

From a strategic standpoint of privacy, cost and control, I immediately went for local models, because that allowed to baseline tradeoffs and it also made it easier to understand where vendor lock-in could happen, or not get too narrow in perspective (e.g. llama.cpp/open router depending on local/cloud [1] ).

With the explosion of popularity of CLI tools (claude/continue/codex/kiro/etc) it still makes sense to be able to do the same, even if you can use several strategies to subsidize your cloud costs (being aware of the lack of privacy tradeoffs).

I would absolutely pitch that and evals as one small practice that will have compounding value for any "automation" you want to design in the future, because at some point you'll care about cost, risks, accuracy and regressions.

[1] - https://alexhans.github.io/posts/aider-with-open-router.html

[2] - https://www.reddit.com/r/LocalLLaMA

>>alexha+HPb
can you recommend a setup with ollama and a cli tool? Do you know if I need a licence for Claude if I only use my own local LLM?

>>mogoma+GQb
What are your needs/constraints (hardware constraints definitely a big one)?

The one I mentioned called continue.dev [1] is easy to try out and see if it meets your needs.

Hitting local models with it should be very easy (it calls APIs at a specific port)

[1] - https://github.com/continuedev/continue

zlacker