zlacker

[parent] [thread] 3 comments
1. Arctic+(OP)[view] [source] 2023-09-12 18:44:59
> Llama 7B wasn't up to the task fyi, producing very poor translations.

From what I've read and from my own experiments, none of the Llama 2 models are particularly well-suited to translation (they were mainly trained on English data). Still, there are a number of tasks they're really good at if fine-tuned correctly, such as classification and data extraction.
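
To make "fine-tuned correctly" a bit more concrete, here's a rough sketch of what that can look like for a classification-style task on Llama 2 7B with LoRA. The model name is the real Hugging Face repo, but the prompt format, hyperparameters, and toy dataset are just placeholders I made up for illustration:

    # Minimal LoRA fine-tuning sketch for a classification-style task on Llama 2 7B.
    # Prompt format, hyperparameters, and the toy dataset are illustrative only.
    from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
    from peft import LoraConfig, get_peft_model
    from datasets import Dataset

    base = "meta-llama/Llama-2-7b-hf"
    tokenizer = AutoTokenizer.from_pretrained(base)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(base)

    # LoRA freezes the base weights and trains small adapter matrices,
    # which is what makes fine-tuning a 7B model practical on modest hardware.
    lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                      lora_dropout=0.05, task_type="CAUSAL_LM")
    model = get_peft_model(model, lora)

    # Toy supervised examples: input text -> label, rendered as one prompt string.
    examples = [
        {"text": "Refund has not arrived after 14 days.", "label": "billing"},
        {"text": "App crashes when I open settings.", "label": "bug"},
    ]

    def render(ex):
        prompt = f"Classify the ticket.\nTicket: {ex['text']}\nLabel: {ex['label']}"
        tokens = tokenizer(prompt, truncation=True, max_length=256, padding="max_length")
        # Standard causal-LM objective: labels are the input ids, padding masked out.
        tokens["labels"] = [t if t != tokenizer.pad_token_id else -100
                            for t in tokens["input_ids"]]
        return tokens

    train_ds = Dataset.from_list(examples).map(render, remove_columns=["text", "label"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="llama2-ticket-classifier",
                               per_device_train_batch_size=1, num_train_epochs=3),
        train_dataset=train_ds,
    )
    trainer.train()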

> I believe that OpenAI priced GPT-3.5 aggressively cheap in order to make it a no-brainer to rely on them rather than relying on other vendors (even open source models).

I think you're definitely right about that, and in most cases just using GPT-3.5 for one-off tasks makes the most sense. It's when you get into production workflows at scale that a small fine-tuned model starts making more sense. You can drop the system prompt and get data back in the format you expect, and you can train on GPT-4's output, which sometimes gets you better accuracy than 3.5 would give you out of the box. And keep in mind that while you can do the same thing with a fine-tuned 3.5 model, it costs 8x the base 3.5 price per token.
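
Roughly, that "train on GPT-4's output" workflow can look like the sketch below. The extraction task, prompts, and file names are made-up placeholders; the JSONL "messages" format is what OpenAI's fine-tuning endpoint expects for gpt-3.5-turbo, and the calls use the pre-1.0 openai Python SDK that was current as of this thread:

    # Sketch of distilling GPT-4 outputs into a gpt-3.5-turbo fine-tune.
    # Task, prompts, and file names are placeholders; uses the pre-1.0 openai SDK.
    import json
    import openai

    SYSTEM = "Extract the company name and invoice total from the email as JSON."
    emails = [
        "Hi, attached is invoice #4412 from Acme Corp for $1,280.00, due Oct 1.",
        "Reminder: Globex owes $310.50 on invoice 9981.",
    ]

    records = []
    for email in emails:
        # Have GPT-4 produce the "gold" answers the smaller model will be trained on.
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{"role": "system", "content": SYSTEM},
                      {"role": "user", "content": email}],
        )
        answer = resp["choices"][0]["message"]["content"]
        # One fine-tuning example per JSONL line: system + user + GPT-4's reply.
        # After fine-tuning, the long system prompt can be shortened or dropped
        # at inference time since the behavior is baked into the weights.
        records.append({"messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": email},
            {"role": "assistant", "content": answer},
        ]})

    with open("train.jsonl", "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

    # Upload the data and kick off the fine-tune (pre-1.0 SDK method names).
    file = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
    openai.FineTuningJob.create(training_file=file["id"], model="gpt-3.5-turbo")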

replies(1): >>kelsey+iO
2. kelsey+iO[view] [source] 2023-09-12 21:44:08
>>Arctic+(OP)
Is that because translation is typically an encoder-decoder task and Llama is decoder-only, or is there something else about the task that makes it difficult for Llama?
replies(2): >>Feepin+P01 >>mikewa+h85
3. Feepin+P01[view] [source] [discussion] 2023-09-12 22:50:11
>>kelsey+iO
If the model isn't trained on text in other languages, it won't be able to speak them.
4. mikewa+h85[view] [source] [discussion] 2023-09-14 06:07:18
>>kelsey+iO
From what I've learned, about 85% of its training data is English; other languages make up the remaining 15%.