zlacker

1. sacred+(OP) 2023-09-12 20:05:26
Based on my research, GPT-3.5 is likely significantly smaller than 70B parameters, so it would make sense that it's cheaper to run. My guess is that OpenAI significantly overtrained GPT-3.5 to get as small a model as possible, optimizing for inference cost. Also, Nvidia chips are way more efficient at inference than an M1 Max. OpenAI also has the advantage of batching API calls, which leads to much better hardware utilization. I don't have definitive proof that they're not dumping, but economies of scale and optimization seem like better explanations to me.
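
As a toy illustration of the overtraining tradeoff (all numbers below are made-up assumptions for the sketch, not anything we actually know about OpenAI's models):

    # Back-of-envelope sketch: overtraining a smaller model trades
    # training compute for cheaper inference, since per-token inference
    # cost scales with parameter count. Uses the standard approximations
    # of ~6 FLOPs/param/token for training and ~2 FLOPs/param/token
    # for inference.

    def train_flops(params, tokens):
        return 6 * params * tokens

    def inference_flops(params, tokens):
        return 2 * params * tokens

    # Hypothetical models: a 70B trained near the compute-optimal
    # ~20 tokens/param ratio vs. a 20B overtrained on 7x that ratio.
    models = {
        "70B, compute-optimal": (70e9, 20 * 70e9),
        "20B, overtrained":     (20e9, 140 * 20e9),
    }

    for name, (params, tokens) in models.items():
        print(f"{name}: {train_flops(params, tokens):.2e} training FLOPs, "
              f"{inference_flops(params, 1e12):.2e} FLOPs per 1T served tokens")

If the overtrained 20B reaches comparable quality, it serves each token ~3.5x cheaper, which is the economics I'm gesturing at.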
replies(2): >>hutzli+h2 >>csjh+Wg
2. hutzli+h2 2023-09-12 20:12:34
>>sacred+(OP)
I also do not have proof of anything here, but can't it be both?

They have lots of money now and the market lead. They want to keep that lead, and some extra electricity and hardware cost is surely worth it to them if it keeps the competition from getting traction.

3. csjh+Wg 2023-09-12 21:08:51
>>sacred+(OP)
What makes you think 3.5 is significantly smaller than 70B?