zlacker

[return to "Fine-tune your own Llama 2 to replace GPT-3.5/4"]
1. ronyfa+wk[view] [source] 2023-09-12 18:29:55
>>kcorbi+(OP)
For translation jobs, I've experimented with Llama 2 70B (running on Replicate) vs. GPT-3.5.

For about 1000 input tokens (and resulting 1000 output tokens), to my surprise, GPT-3.5 turbo was 100x cheaper than Llama 2.
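The comparison comes down to simple per-token arithmetic. A minimal sketch, where the per-1K-token rates are placeholders for illustration (not quoted vendor prices; check the current pricing pages):

```python
def cost_usd(input_tokens, output_tokens, in_rate, out_rate):
    """Per-request cost, given USD rates per 1K input/output tokens."""
    return input_tokens / 1000 * in_rate + output_tokens / 1000 * out_rate

# Placeholder rates for illustration only.
gpt35_cost = cost_usd(1000, 1000, in_rate=0.0015, out_rate=0.002)
print(f"GPT-3.5-style request: ${gpt35_cost:.4f}")  # $0.0035
```

Per-second GPU billing (as on Replicate) has to be converted to an effective per-token rate from observed throughput before it slots into the same formula.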

Llama 2 7B wasn't up to the task, FYI; it produced very poor translations.

I believe that OpenAI priced GPT-3.5 aggressively cheap to make it a no-brainer to rely on them rather than on other vendors (or even open-source models).

I'm curious whether others have gotten different results.

◧◩
2. ttt3ts+aq1[view] [source] 2023-09-12 22:41:52
>>ronyfa+wk
You can run Llama 2 70B on dual 4090s/3090s with quantization. With dual 3090s, you can build a system that runs Llama 2 70B with 12K context for under $2K.
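For reference, a 4-bit quant of a 70B model fits in roughly 40 GB of VRAM, which is why two 24 GB cards work. A minimal sketch of such a setup with llama.cpp; the GGUF filename and the even tensor split are assumptions about the local build:

```shell
# -c sets the 12K context from the comment above; -ngl offloads all
# layers to GPU; -ts splits tensors evenly across the two cards.
./main \
  -m models/llama-2-70b.Q4_K_M.gguf \
  -c 12288 \
  -ngl 99 \
  -ts 1,1 \
  -p "Translate to French: Hello, world."
```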

I built two such systems after burning that much in a week on ChatGPT.

◧◩◪
3. coryrc+t42[view] [source] 2023-09-13 04:13:02
>>ttt3ts+aq1
> I built two such a systems after burning that much in a week on ChatGPT.

What are you doing!?

◧◩◪◨
4. ttt3ts+DX6[view] [source] 2023-09-14 16:16:17
>>coryrc+t42
Have a client with many thousands of CSV, JSON, and XML files detailing insurance prices. Fundamentally they all contain the same data, but in wildly different formats, because they were produced by different companies and teams. I used ChatGPT to deduce each file's format so I could normalize them. Easily underbid their current contractor, who was using humans for the work, and now I have an easy quarterly billing. :)
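A hedged sketch of that normalization flow: ask the model for a source-field-to-target-field mapping, then apply it mechanically. The target schema, field names, and the example mapping below are invented for illustration; in practice the mapping is what comes back from the model:

```python
# Hypothetical target schema every vendor file gets normalized into.
TARGET_FIELDS = ["provider", "procedure_code", "price_usd"]

def build_prompt(sample_text):
    """Prompt asking the model to map a file's fields onto TARGET_FIELDS,
    returned as a JSON object like {"Carrier": "provider", ...}."""
    return (
        "Here is a sample from a pricing file:\n"
        f"{sample_text}\n\n"
        f"Return a JSON object mapping each source field to one of {TARGET_FIELDS}."
    )

def apply_mapping(record, mapping):
    """Normalize one source record using the model-deduced field mapping."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}

# Example mapping the model might return for one vendor's CSV header.
mapping = {"Carrier": "provider", "CPT": "procedure_code", "Rate": "price_usd"}
row = {"Carrier": "Acme Health", "CPT": "99213", "Rate": "120.00"}
print(apply_mapping(row, mapping))
# {'provider': 'Acme Health', 'procedure_code': '99213', 'price_usd': '120.00'}
```

The key cost lever is that the model only sees a small sample per format, not every file; once the mapping is cached, the bulk of the rows never touch the API.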

TBC, I probably could have optimized token usage, but the contract was profitable and time-critical.

◧◩◪◨⬒
5. coryrc+L57[view] [source] 2023-09-14 16:49:21
>>ttt3ts+DX6
Thanks for sharing!