zlacker

[return to "Fine-tune your own Llama 2 to replace GPT-3.5/4"]
1. ronyfa+wk[view] [source] 2023-09-12 18:29:55
>>kcorbi+(OP)
For translation jobs, I've experimented with Llama 2 70B (running on Replicate) vs. GPT-3.5.

For about 1,000 input tokens (and roughly 1,000 resulting output tokens), to my surprise, GPT-3.5 Turbo was about 100x cheaper than Llama 2.
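For reference, a back-of-envelope version of that comparison. All the prices and throughput numbers below are my own assumptions (roughly the September 2023 list prices, from memory), not figures taken from this thread; plug in current ones before trusting the ratio:

```python
# Rough cost comparison for a 1,000-in / 1,000-out token translation job.
# ALL constants are assumptions, not from the thread.

GPT35_INPUT_PER_1K = 0.0015   # USD per 1K input tokens (assumed)
GPT35_OUTPUT_PER_1K = 0.002   # USD per 1K output tokens (assumed)

# Replicate billed Llama 2 70B by GPU-second at the time; assume
# ~$0.0023/s for the GPU and ~10 output tokens/s for a 70B model.
REPLICATE_USD_PER_SEC = 0.0023  # assumed
TOKENS_PER_SEC = 10.0           # assumed

def gpt35_cost(tokens_in: int, tokens_out: int) -> float:
    """API cost: per-token pricing on input and output."""
    return (tokens_in / 1000 * GPT35_INPUT_PER_1K
            + tokens_out / 1000 * GPT35_OUTPUT_PER_1K)

def llama70b_cost(tokens_out: int) -> float:
    """Hosted-GPU cost: you pay for wall-clock generation time."""
    return tokens_out / TOKENS_PER_SEC * REPLICATE_USD_PER_SEC

gpt = gpt35_cost(1000, 1000)   # ~ $0.0035
llama = llama70b_cost(1000)    # ~ $0.23
print(f"GPT-3.5: ${gpt:.4f}  Llama 2 70B: ${llama:.2f}  ratio ~{llama / gpt:.0f}x")
```

With these assumed numbers the gap comes out closer to ~65x than 100x, but the order of magnitude is the same: per-second GPU billing on a 70B model is dominated by generation time, not token count.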

Llama 2 7B wasn't up to the task, FYI; it produced very poor translations.

I believe OpenAI priced GPT-3.5 aggressively low to make it a no-brainer to rely on them rather than on other vendors (or even open-source models).

I'm curious whether others have gotten different results.

2. mrybcz+hJ[view] [source] 2023-09-12 19:45:57
>>ronyfa+wk
Yes, OpenAI is dumping the market with GPT-3.5. Vulture-capital behaviour at its finest, and I'm sure government regulators will definitely catch on to this in 20 or 30 years...

It's cheaper than the ELECTRICITY cost of running Llama 2 70B on your own M1 Max (a very energy-efficient chip), even assuming the hardware is free.
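Whether that holds depends entirely on the power draw, generation speed, and electricity price you assume; a quick sketch to plug your own numbers into (every constant here is my assumption, not a measurement from the thread):

```python
# Electricity cost of generating 1,000 tokens locally on an M1 Max,
# vs an assumed GPT-3.5 output price. ALL constants are assumptions.

WATTS = 40.0           # assumed package power under inference load
TOKENS_PER_SEC = 5.0   # assumed speed for a quantized 70B model
USD_PER_KWH = 0.30     # assumed electricity price

def electricity_cost_per_1k_tokens() -> float:
    seconds = 1000 / TOKENS_PER_SEC            # wall-clock generation time
    kwh = WATTS * seconds / 3600 / 1000        # watt-seconds -> kWh
    return kwh * USD_PER_KWH

GPT35_OUTPUT_PER_1K = 0.002  # assumed Sept-2023 list price, USD

local = electricity_cost_per_1k_tokens()
print(f"local electricity: ${local:.4f}/1K tok vs API ${GPT35_OUTPUT_PER_1K}/1K tok")
```

With these particular assumptions the local electricity bill actually comes out a bit below the API price, so the "cheaper than electricity" claim is sensitive to the exact tokens/sec and power figures you use.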

I'd guess they're also getting a pretty good cache hit rate: there are only so many distinct questions people ask at scale. But still, it's dumping.
