zlacker

[parent] [thread] 5 comments
1. coffee+(OP)[view] [source] 2023-11-17 21:34:43
Is inference really that expensive? Anyway, if the price is too low they could easily charge per query
replies(1): >>knicho+g7
2. knicho+g7[view] [source] 2023-11-17 22:13:03
>>coffee+(OP)
When I was mining with a bunch of RTX 3080s and RTX 3090s, the electricity cost was (admittedly) about $20/month per card. Running a 70B model takes 3-4 cards, so assuming you're pushing those cards to their absolute max, it's going to be around $80/month. Then again, ChatGPT is pretty awesome and is likely running more than a 70B model (or, I think I heard, an ensemble of models), so that's at least a ballpark.
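For a sanity check, here's that math written out. The 350 W per card and $0.08/kWh are my assumptions, not measurements; they just happen to reproduce the ~$20/month figure:

    # Back-of-envelope electricity cost for running a 70B model on consumer GPUs.
    # Assumed numbers: ~350 W per RTX 3090 under sustained load, ~$0.08/kWh.
    WATTS_PER_CARD = 350         # sustained draw, assumed
    PRICE_PER_KWH = 0.08         # USD, assumed
    HOURS_PER_MONTH = 24 * 30.4  # ~730 h, running 24/7

    kwh_per_card = WATTS_PER_CARD / 1000 * HOURS_PER_MONTH
    cost_per_card = kwh_per_card * PRICE_PER_KWH  # ~$20/month

    cards_for_70b = 4  # "3-4 cards"; take the upper bound
    total = cost_per_card * cards_for_70b
    print(f"per card: ${cost_per_card:.2f}/mo, {cards_for_70b} cards: ${total:.2f}/mo")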
replies(3): >>sodali+ha >>Sebb76+og >>698969+491
3. sodali+ha[view] [source] [discussion] 2023-11-17 22:27:59
>>knicho+g7
Batched inference makes these calculations hard: as I understand it, a forward pass takes roughly the same amount of power and time whether it's serving one query or 30.
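A toy model of what that implies for per-query cost. Under that assumption, energy per query scales as 1/batch_size; the power draw, latency, and the batch size of 30 below are all just illustrative numbers:

    # Toy model of batched-inference amortization: if one forward pass serves a
    # whole batch at roughly the same power and latency, energy per query
    # scales as 1/batch_size. All numbers are illustrative placeholders.
    WATTS = 4 * 350        # four cards at ~350 W each, assumed
    PRICE_PER_KWH = 0.08   # USD, assumed
    SECONDS_PER_PASS = 10  # time to generate one full response, made up

    energy_kwh = WATTS / 1000 * SECONDS_PER_PASS / 3600
    cost_per_pass = energy_kwh * PRICE_PER_KWH

    for batch_size in (1, 30):
        print(f"batch={batch_size:2d}: ${cost_per_pass / batch_size:.6f} per query")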
4. Sebb76+og[view] [source] [discussion] 2023-11-17 22:54:13
>>knicho+g7
Datacenters probably don't pay retail rates for electricity, so they might actually run quite a bit cheaper (or quite a bit more expensive if they pay for highly available power, though that seems like overkill for pure compute).
replies(1): >>015a+VF
5. 015a+VF[view] [source] [discussion] 2023-11-18 00:57:52
>>Sebb76+og
Sure, but everything else about a data center is more expensive (real estate, operations people, networking, equipment). There's a reason AWS is so expensive.
6. 698969+491[view] [source] [discussion] 2023-11-18 04:24:22
>>knicho+g7
Presumably your miner was running 24/7 throughout the month. That's not the case for ChatGPT, which might serve maybe 10 sessions a day from a single person, tops, with long pauses between queries.
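Putting rough numbers on that utilization gap (the 10 sessions/day is the figure above; queries per session and GPU-seconds per query are guesses on my part):

    # Utilization gap between a 24/7 miner and a single chat user's share of a
    # shared inference server. Per-user numbers below are rough assumptions.
    SECONDS_PER_DAY = 24 * 3600

    sessions_per_day = 10       # upper bound from the comment
    queries_per_session = 5     # assumed
    gpu_seconds_per_query = 10  # assumed generation time per reply

    user_gpu_seconds = sessions_per_day * queries_per_session * gpu_seconds_per_query
    utilization = user_gpu_seconds / SECONDS_PER_DAY
    print(f"one user keeps the GPUs busy ~{utilization:.1%} of the day vs 100% for a miner")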