1. mark_l (OP)
2023-11-19 04:53:55
Another data point: I can (barely) run a 30B 4-bit quantized model on a Mac Mini with 32 GB of on-chip memory, but it runs slowly (a little under 10 tokens/second).
13B and 7B models run easily and much faster.
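For anyone wanting to reproduce this kind of measurement, here is a minimal sketch assuming a 4-bit GGUF quant run through the llama-cpp-python bindings with Metal offload; the comment does not say which runtime was used, and the model path and parameters below are hypothetical.

    # Sketch: measure generation throughput (tokens/second) for a 4-bit quant.
    # Assumes llama-cpp-python is installed with Metal support on Apple Silicon.
    import time
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/30b-q4_k_m.gguf",  # hypothetical path to a 4-bit quantized model
        n_ctx=2048,
        n_gpu_layers=-1,  # offload all layers to the GPU (Metal)
        verbose=False,
    )

    prompt = "Explain quantization in one paragraph."
    start = time.time()
    out = llm(prompt, max_tokens=256)
    elapsed = time.time() - start

    n_tokens = out["usage"]["completion_tokens"]
    print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tok/s")

The reported tokens/second will depend on context length, quantization format, and how many layers fit in memory, so numbers like the ~10 tok/s above are best treated as ballpark figures.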