zlacker

I don't suppose you could point me to any resources on getting started? I have an M2 with 64 GB of unified memory, and it'd be nice to put it to work rather than burning GitHub credits.

replies(1): >>EagnaI+P2

>>BuildT+(OP)
https://ollama.com

Although I'm starting to like LM Studio more, as it has features that Ollama is missing.

https://lmstudio.ai

You can then get Claude to create an MCP server that talks to either one. Then add a CLAUDE.md that tells it to list the models you have downloaded, work out what each one is good for, and decide when to offload to them. Claude will write all of that for you as well.
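
For a rough idea of what the "read the models you have downloaded" step could look like, here is a minimal Python sketch, not a definitive implementation. It assumes both apps are running with their default local servers: Ollama on port 11434 (model list at /api/tags) and LM Studio's OpenAI-compatible server on port 1234 (model list at /v1/models). If you changed either port, adjust the URLs. The function names are made up for illustration, and an MCP server would wrap something like this rather than look exactly like it.

```python
import json
import urllib.request

# Assumed defaults; change these if you configured different ports.
OLLAMA_URL = "http://localhost:11434/api/tags"
LMSTUDIO_URL = "http://localhost:1234/v1/models"


def ollama_model_names(payload):
    """Pull model names out of an Ollama /api/tags response body."""
    return [m["name"] for m in payload.get("models", [])]


def lmstudio_model_names(payload):
    """Pull model ids out of an OpenAI-style /v1/models response body."""
    return [m["id"] for m in payload.get("data", [])]


def fetch_json(url, timeout=5):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def list_local_models():
    """Return {backend: [model, ...]}; a backend that is not running yields []."""
    found = {}
    backends = (
        ("ollama", OLLAMA_URL, ollama_model_names),
        ("lmstudio", LMSTUDIO_URL, lmstudio_model_names),
    )
    for label, url, parse in backends:
        try:
            found[label] = parse(fetch_json(url))
        except (OSError, ValueError):
            # Server not reachable or returned something that is not JSON.
            found[label] = []
    return found


if __name__ == "__main__":
    for backend, models in list_local_models().items():
        print(backend, models)
```

Run it with whichever app is open and you get the list of models that backend can serve. That list is what the CLAUDE.md instructions would have Claude consult before deciding what to offload.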