Fine-tune your own Llama 2 to replace GPT-3.5/4

>>kcorbi+(OP)
For translation jobs, I've experimented with Llama 2 70B (running on Replicate) vs. GPT-3.5.

For about 1000 input tokens (and the resulting 1000 or so output tokens), GPT-3.5 Turbo was, to my surprise, 100x cheaper than Llama 2.
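
For anyone who wants to redo this comparison with their own numbers, here is a rough sketch of the arithmetic. It assumes the hosted Llama 2 is billed per second of GPU time and the API is billed per 1,000 tokens; the function names and every price and throughput value below are made-up placeholders, not real rates from either vendor.

```python
def per_token_cost(input_tokens, output_tokens, usd_per_1k_input, usd_per_1k_output):
    """Cost of one request on a service billed per 1,000 tokens."""
    return (input_tokens / 1000) * usd_per_1k_input + (output_tokens / 1000) * usd_per_1k_output


def per_second_cost(output_tokens, tokens_per_second, usd_per_second):
    """Cost of one request on a service billed per second of compute.

    Simplification: generation time is approximated as
    output_tokens / tokens_per_second, ignoring prompt processing
    and cold starts.
    """
    return (output_tokens / tokens_per_second) * usd_per_second


# Placeholder values only; substitute the rates and throughput you observe.
api_cost = per_token_cost(1000, 1000, usd_per_1k_input=0.001, usd_per_1k_output=0.002)
hosted_cost = per_second_cost(1000, tokens_per_second=20.0, usd_per_second=0.001)
ratio = hosted_cost / api_cost

print(f"API: ${api_cost:.4f}  hosted: ${hosted_cost:.4f}  ratio: {ratio:.1f}x")
```

With per-second billing the ratio depends heavily on measured throughput, so it is worth timing a few real requests before drawing conclusions.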

FYI, Llama 7B wasn't up to the task; it produced very poor translations.

I believe that OpenAI priced GPT-3.5 aggressively low in order to make it a no-brainer to rely on them rather than on other vendors (or even open source models).

I'm curious whether others have gotten different results.

>>ronyfa+wk
Cost isn't the only incentive not to use an LLM service that resides in a foreign country. Around here, there are industries for which it's pretty much a no-brainer to avoid anything that sends data across the Atlantic.

>>Anonym+dB
Although it wouldn't surprise me if today's Azure OpenAI offerings route to certain US-centric regions, I'd be very surprised if Azure isn't working day and night to try to provision OpenAI capacity everywhere they can in the world.

(Disclaimer: I work in the cloud organization at Microsoft, and these are totally my own thoughts and opinions and don't reflect any kind of inside knowledge I have. I think I can say that provisioning LLM capacity and GPUs is something we basically all have a tremendous amount of passion about.)
