zlacker

1. renjim+(OP) 2025-06-02 21:26:10
You can self-host an open-weights LLM. Some of the AI-powered IDEs are open source. It does take a little more work than just using VSCode + Copilot, but that's always been the case for FOSS.
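
For a rough idea of how little glue is involved, here is a minimal sketch, assuming Ollama is serving a model locally on its default port (the model tag and prompt are just examples):

    # Minimal sketch: talk to a locally hosted model through Ollama's
    # OpenAI-compatible endpoint. Assumes `ollama serve` is running and
    # a model has been pulled, e.g. `ollama pull qwen2.5-coder:7b`.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:11434/v1",  # Ollama's default address
        api_key="unused",  # local server ignores it, but the client requires a value
    )

    resp = client.chat.completions.create(
        model="qwen2.5-coder:7b",
        messages=[{"role": "user", "content": "Write a Python function that reverses a string."}],
    )
    print(resp.choices[0].message.content)

The same client code works against any OpenAI-compatible local server (llama.cpp's server, vLLM, etc.), so switching backends doesn't mean rewriting the editor integration.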
replies(1): >>Philpa+d8
2. Philpa+d8 2025-06-02 22:14:28
>>renjim+(OP)
An important note is that the models you can host at home (i.e. without buying a rig costing tens of thousands of dollars) won't be as effective as the proprietary models. A realistic size limit is around 32 billion parameters with quantisation, which will fit on a 24GB GPU or a sufficiently large MBP. These models are roughly on par with the original GPT-4 - that is, they will generate snippets, but they won't pull off the magic that Claude in an agentic IDE can do. (There's the recent Devstral model, but it requires a specific harness, so I haven't tested it.)
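
To make the sizing concrete: a quantised 32B model can be loaded with llama-cpp-python and pushed entirely onto the GPU. The file name below is a hypothetical local GGUF; a Q4_K_M quant of a 32B model is roughly 19-20GB, which just fits in 24GB of VRAM.

    # Sketch: run a quantised ~32B model fully on a 24GB GPU.
    # The model path is a hypothetical local GGUF file.
    from llama_cpp import Llama

    llm = Llama(
        model_path="qwen2.5-coder-32b-instruct-q4_k_m.gguf",
        n_gpu_layers=-1,  # -1 = offload every layer to the GPU
        n_ctx=8192,       # context window; larger values eat into the 24GB budget
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Explain the difference between a list and a tuple in Python."}]
    )
    print(out["choices"][0]["message"]["content"])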

DeepSeek-R1 is on par with frontier proprietary models, but requires an 8xH100 node to run efficiently. You can use extreme quantisation and CPU offloading to run it on an enthusiast build, but it will be closer to seconds-per-token territory.
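
To illustrate what that compromise looks like in practice, a sketch (the file name, quant level, and layer count are all hypothetical):

    # Sketch: extreme quantisation plus CPU offload. Only a few layers
    # fit in VRAM; the rest run from system RAM, which dominates latency.
    import time

    from llama_cpp import Llama

    llm = Llama(
        model_path="deepseek-r1-iq1_s.gguf",  # hypothetical ~1.5-bit quant
        n_gpu_layers=8,  # whatever fits on the GPU; everything else stays on CPU
        n_ctx=2048,
    )

    start = time.time()
    out = llm("Explain CPU offloading in one paragraph.", max_tokens=64)
    tokens = out["usage"]["completion_tokens"]
    print(f"{(time.time() - start) / tokens:.1f} s/token")  # expect whole seconds, not tokens/s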
