zlacker

1. embedd+(OP) 2026-01-28 10:49:58
I don't; it fits on my card with the full context. I think the native MXFP4 weights take ~70GB of VRAM (out of the 96GB available on the RTX Pro 6000), so I still have room to spare to run GPT-OSS-20B alongside it for smaller tasks, and Wayland+GNOME :)
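Rough back-of-envelope (numbers are my assumptions, not measured: ~117B total params for GPT-OSS-120B, and MXFP4 storing 4-bit values plus a shared 8-bit scale per 32-weight block, i.e. ~4.25 effective bits/weight):

    # Sketch: estimate the MXFP4 weight footprint under the assumptions above
    def mxfp4_weight_gb(n_params: float, block_size: int = 32) -> float:
        bits_per_weight = 4 + 8 / block_size  # 4-bit values + shared scale
        return n_params * bits_per_weight / 8 / 1e9

    print(f"~{mxfp4_weight_gb(117e9):.0f} GB")  # ~62 GB of quantized weights

KV cache for the full context plus any layers kept unquantized would push the total toward the ~70GB figure.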
replies(1): >>storys+Sb
2. storys+Sb 2026-01-28 12:24:54
>>embedd+(OP)
I thought the RTX 6000 Ada was 48GB? If you have 96GB available, that implies a dual setup, so you must be relying on tensor parallelism to shard the model weights across the pair.
replies(1): >>embedd+5e
3. embedd+5e 2026-01-28 12:40:08
>>storys+Sb
RTX Pro 6000 (Blackwell), not the RTX 6000 Ada - a single card with 96GB of VRAM.