zlacker

[parent] [thread] 2 comments
1. Comput+(OP)[view] [source] 2023-12-20 23:53:10
But this doesn’t apply to self-hosted, no?
replies(1): >>lyjack+Le
2. lyjack+Le[view] [source] 2023-12-21 01:54:39
>>Comput+(OP)
It does. LLMs are most efficient when running large batches, so the per-token GPU cost gets very high if you’re underutilizing the hardware. It will cost more than a cloud provider like OpenAI, which has the request volume to keep its GPUs saturated (rough cost sketch below).
replies(1): >>jay-ba+jt
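
To make the batching argument concrete, here’s a minimal back-of-the-envelope sketch. The hourly GPU rate, per-request throughput, and batch sizes are made-up illustrative assumptions (not measured figures), and it assumes throughput scales roughly linearly with batch size until the GPU saturates, which is a simplification:

```python
# Back-of-the-envelope: cost per 1M generated tokens vs. batch size.
# All numbers are hypothetical assumptions for illustration only.

GPU_COST_PER_HOUR = 4.00         # assumed hourly rate for one H100-class GPU
TOKENS_PER_SEC_PER_REQUEST = 50  # assumed per-request decode throughput

def cost_per_million_tokens(batch_size: int) -> float:
    """Cost to generate 1M tokens with `batch_size` concurrent requests.

    Decode is memory-bandwidth bound, so throughput grows roughly with
    batch size until the GPU saturates. A lightly loaded (self-hosted)
    GPU still pays the full hourly rate for a fraction of that throughput.
    """
    tokens_per_hour = TOKENS_PER_SEC_PER_REQUEST * batch_size * 3600
    return GPU_COST_PER_HOUR / tokens_per_hour * 1_000_000

for batch in (1, 8, 64):
    print(f"batch={batch:>3}: ${cost_per_million_tokens(batch):,.2f} per 1M tokens")

# batch=  1: ~$22.22 per 1M tokens  (self-hosted, single user)
# batch=  8: ~$2.78  per 1M tokens
# batch= 64: ~$0.35  per 1M tokens  (saturated, provider-scale traffic)
```

Under these assumptions, a single-user self-hosted GPU pays ~60x more per token than one kept saturated, which is the utilization gap the comment is pointing at.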
3. jay-ba+jt[view] [source] [discussion] 2023-12-21 04:35:54
>>lyjack+Le
Yup. It’s also worth mentioning that OpenAI enjoys the luxury of large H100 clusters (last time I checked).