zlacker
1. chepts+(OP)
2023-09-21 12:47:26
How do you fit Llama2-70b onto a V100? A V100 has 16GB of memory. Llama2-70b at 4-bit would require up to 40GB. Also, what do you use for inference to get 300+ tokens/s?
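For reference, a rough back-of-the-envelope check of the numbers in the question. This is only a sketch: it assumes a nominal 70e9 parameters and counts the weights alone, ignoring KV cache, activations, and quantization metadata, which is why real usage lands above the figure it prints. The helper name `weight_memory_gb` is made up for illustration.

```python
import math


def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal GB."""
    return n_params * bits_per_weight / 8 / 1e9


LLAMA2_70B_PARAMS = 70e9  # assumption: nominal, rounded parameter count
V100_MEMORY_GB = 16       # the per-card figure quoted in the question

weights_gb = weight_memory_gb(LLAMA2_70B_PARAMS, 4)
# Lower bound on card count, since only the weights are counted here.
min_cards = math.ceil(weights_gb / V100_MEMORY_GB)

print(f"4-bit weights: {weights_gb:.1f} GB")
print(f"Fits on one {V100_MEMORY_GB} GB card: {weights_gb <= V100_MEMORY_GB}")
print(f"Minimum {V100_MEMORY_GB} GB cards for weights alone: {min_cards}")
```

So even before any runtime overhead, the 4-bit weights are more than twice the size of a single 16GB card, and would have to be split across several cards or partly offloaded.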