1. achill+(OP)[view] [source] 2023-12-20 21:01:44
They can absolutely outperform GPT-4 for specific use cases.
replies(3): >>nickth+o >>TOMDM+N >>holodu+Zh
2. nickth+o[view] [source] 2023-12-20 21:03:55
>>achill+(OP)
I am very open to believing that. I'd love to see some examples.
replies(4): >>GaggiX+Q1 >>turnso+a2 >>shiftp+h5 >>buggle+77
3. TOMDM+N[view] [source] 2023-12-20 21:05:45
>>achill+(OP)
Yeah, a 7B foundation model is of course going to be worse when it's expected to perform on every task.

But finetuning on just a few tasks?

Depending on the task, it's totally reasonable to expect that a 7B model might eke out a win against stock GPT-4, especially if the finetune carries domain knowledge and the task doesn't lean heavily on logical reasoning.

4. GaggiX+Q1[view] [source] [discussion] 2023-12-20 21:11:03
>>nickth+o
Well, it's pretty easy to find examples online. This one uses Llama 2, not even Mistral or fancy techniques: https://www.anyscale.com/blog/fine-tuning-llama-2-a-comprehe...
5. turnso+a2[view] [source] [discussion] 2023-12-20 21:13:10
>>nickth+o
I agree; I think they need an example or two in that blog post to back up the claim. I'm ready to believe it, but I need something more than "diverse customer tasks" to understand what we're talking about.
6. shiftp+h5[view] [source] [discussion] 2023-12-20 21:31:37
>>nickth+o
They're quite close in arena format: https://chat.lmsys.org/?arena
replies(1): >>TOMDM+p7
7. buggle+77[view] [source] [discussion] 2023-12-20 21:43:55
>>nickth+o
You can fine-tune a small model yourself and see. GPT-4 is an amazing general model, but it won't perform the best at every task you throw at it out of the box. I have a fine-tuned Mistral 7B model that outperforms GPT-4 on a specific type of structured data extraction. Maybe a fine-tuned GPT-4 could beat it, but that costs a lot of money for what I can now do locally for the cost of electricity.
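
A minimal sketch of the kind of local fine-tune described here, assuming LoRA adapters via Hugging Face transformers/peft; the dataset file, prompt format, and hyperparameters are illustrative, not details from this comment:

    # Rough sketch: LoRA fine-tune of Mistral 7B for structured data
    # extraction. "extraction.jsonl" and its "text"/"json" fields are
    # hypothetical; swap in your own data and prompt format.
    import torch
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    base = "mistralai/Mistral-7B-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(base)
    tokenizer.pad_token = tokenizer.eos_token  # Mistral ships no pad token

    model = AutoModelForCausalLM.from_pretrained(
        base, torch_dtype=torch.bfloat16, device_map="auto")

    # Low-rank adapters on the attention projections: only a few million
    # trainable parameters, which is what makes "for the cost of
    # electricity" plausible on a single local GPU.
    model = get_peft_model(model, LoraConfig(
        r=16, lora_alpha=32, task_type="CAUSAL_LM",
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"]))

    # One record per example: the raw document plus the JSON to extract.
    data = load_dataset("json", data_files="extraction.jsonl", split="train")

    def tokenize(ex):
        prompt = f"Extract as JSON:\n{ex['text']}\n### {ex['json']}"
        return tokenizer(prompt, truncation=True, max_length=1024)

    data = data.map(tokenize, remove_columns=data.column_names)

    Trainer(
        model=model,
        train_dataset=data,
        args=TrainingArguments(
            output_dir="mistral-extract",
            per_device_train_batch_size=2,
            gradient_accumulation_steps=8,
            num_train_epochs=3,
            learning_rate=2e-4,
            bf16=True,
            logging_steps=10),
        # mlm=False yields plain causal-LM labels (inputs shifted by one).
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    ).train()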
8. TOMDM+p7[view] [source] [discussion] 2023-12-20 21:45:53
>>shiftp+h5
To be clear, Mixtral is very competitive; Mistral, while certainly way better than most 7B models, performs far worse than GPT-3.5 Turbo.
replies(1): >>shiftp+Yc
9. shiftp+Yc[view] [source] [discussion] 2023-12-20 22:18:30
>>TOMDM+p7
Apologies, that's what I get for skimming through the thread.
10. holodu+Zh[view] [source] 2023-12-20 22:49:01
>>achill+(OP)
Not for translations. I did a lot of experimenting with different local models, and none come even close to the capabilities of ChatGPT; most local models just output plainly wrong information. I am still hoping it will be possible one day. It would be a huge opportunity for our business.
replies(1): >>ijk+eY
11. ijk+eY[view] [source] [discussion] 2023-12-21 05:35:41
>>holodu+Zh
For translation, you're probably better off with a model that's specifically designed for translation, like MADLAD-400 or DeepL's services.
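
A minimal sketch of calling MADLAD-400 for translation, assuming the google/madlad400-3b-mt checkpoint on Hugging Face and its <2xx> target-language prefix convention (check the model card before relying on either):

    # Rough sketch: translation with MADLAD-400 via transformers.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    name = "google/madlad400-3b-mt"
    model = T5ForConditionalGeneration.from_pretrained(name)
    tokenizer = T5Tokenizer.from_pretrained(name)

    # The "<2de>" prefix names the target language (German here);
    # no source-language tag is needed.
    text = "<2de> The local model handles this sentence surprisingly well."
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))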