zlacker

[return to "Mistral 7B Fine-Tune Optimized"]
1. nickth+Cb 2023-12-20 20:55:18
>>tosh+(OP)
Any time I see a claim that some 7B model is better than GPT-4, I basically stop reading. If you are going to make that claim, give me several easily digestible examples of it actually happening.
2. thorum+Xl 2023-12-20 21:55:50
>>nickth+Cb
Anecdotally, I fine-tuned Mistral 7B for a specific (and slightly unusual) natural language processing task just a few days ago. GPT-4 can do the task, but it needs a long, complex prompt and only gets it right about 80-90% of the time; the fine-tuned model performs significantly better with fewer tokens. (In fact it does so well that I suspect I could get good results with an even smaller model.)
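
For anyone curious, a minimal sketch of what that kind of finetune looks like with Hugging Face transformers + peft (the LoRA rank, alpha, and target modules here are illustrative placeholders, not the exact config used for this task):

    # Minimal LoRA fine-tuning setup for Mistral 7B; hyperparameters are
    # placeholders, not a recommendation for any particular task.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model_name = "mistralai/Mistral-7B-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # LoRA freezes the base weights and trains small adapter matrices,
    # which is what makes tuning a 7B model practical on a single GPU.
    lora = LoraConfig(
        r=16,
        lora_alpha=32,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()  # typically well under 1% of the 7B

From there you train on your task-specific examples with a standard causal-LM training loop.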
3. oceanp+Xm 2023-12-20 22:01:24
>>thorum+Xl
I have a fine-tuned version of Mistral doing a really simple task and spitting out some JSON. I'm getting equivalent performance to GPT-4 on that specialized task. It has lower latency, it outputs more tokens/sec, and it's more reliable, private, and completely free.

I don't think we will have an open-source GPT-4 for a long time, so this is sorta clickbait, but for small, specialized tasks tuned on high-quality data, we are already in the "Linux" era of OSS models. They can do real, practical work.
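
For reference, a minimal sketch of what serving a tuned model like that locally can look like (the checkpoint path, prompt, and schema are hypothetical placeholders, not my actual setup):

    # Load a locally fine-tuned checkpoint and parse its JSON output.
    import json
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_dir = "./mistral-7b-json-finetune"  # hypothetical path to tuned weights
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(
        model_dir, torch_dtype=torch.bfloat16, device_map="auto"
    )

    prompt = "Extract the fields as JSON:\nOrder #123 shipped to Oslo on 2023-12-18."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    text = tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

    record = json.loads(text)  # a well-tuned model reliably emits parseable JSON

Greedy decoding (do_sample=False) is the sensible default here, since a structured-extraction task wants determinism, not creativity.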

4. YetAno+0D 2023-12-20 23:50:50
>>oceanp+Xm
> completely free

Not according to my calculations. At a low request rate, it is likely more expensive than GPT-4.
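
Rough sketch of the math (all prices here are illustrative assumptions, not real quotes):

    # Back-of-envelope cost comparison; every number is an assumption.
    gpu_per_hour = 1.20        # assumed rental cost of a 7B-capable GPU
    tokens_per_sec = 40        # assumed throughput of the tuned model
    gpt4_per_1k_tokens = 0.03  # order-of-magnitude API pricing

    # Cost per 1k generated tokens if the GPU is fully utilized:
    gpu_cost_per_1k = gpu_per_hour / 3600 / tokens_per_sec * 1000
    print(f"${gpu_cost_per_1k:.4f} per 1k tokens")  # ~$0.0083, far below GPT-4

    # But the GPU bills for every hour whether requests arrive or not, so at
    # low request rates idle time dominates and the effective per-request
    # cost climbs above what the GPT-4 API would charge for the same traffic.

At high utilization the self-hosted model wins easily; "free" only holds if you keep the box busy.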
