zlacker

[parent] [thread] 1 comments
1. pstuar+(OP)[view] [source] 2026-02-04 01:27:25
Might there be a way to leverage local models just to help minimize the retries -- doing the tool calling handling and giving the agent "perfect execution"?

I'm a noob and am asking as wishful thinking.

replies(1): >>jermau+0v1
2. jermau+0v1[view] [source] 2026-02-04 14:03:34
>>pstuar+(OP)
> I'm a noob and am asking as wishful thinking.

Don't minimize your thoughts! Outside voices and naive questions sometimes provide novel insights that might be dismissed, but someone might listen.

I've not done this exactly, but I have setup "chains" that create a fresh context for tool calls so their call chains don't fill the main context. There is no reason why the Tool Calls couldn't be redirected to another LLM endpoint (local for instance). Especially with something like gpt-oss-20b, where I've found executing tools happens at a higher success than claude sonnet via openrouter.

[go to top]