zlacker

1. ein0p+(OP) 2024-02-14 21:14:30
It isn’t capable unless you have a very specialized task and carefully fine-tune it to solve just that task. GPT-4 covers a lot of ground out of the box. The best model I’ve seen so far on the FOSS side, Mixtral MoE, is less capable than even GPT-3.5. I often submit my requests to both Mixtral and GPT-4. If I’m problem solving (learning something, working with code, summarizing, working on my messaging), Mixtral is nearly always a waste of time in comparison.
replies(1): >>sjwhev+xt
2. sjwhev+xt 2024-02-14 23:59:51
>>ein0p+(OP)
Again, that’s precisely what I’m saying. A bounded task is best executed against the smallest possible model at the greatest possible speed. This is true for business factors ($$$) as well as environmental (smaller model -> less carbon).

LLMs are not AGI; they are tools with specific uses that we are still discovering.

If you aren’t trying to optimize your accuracy to start with and are just saying “I’ll run the most expensive thing and assume it is better” with zero evaluation, you’re wasting money and time and hurting the environment.
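
To make the "evaluate first" point concrete, here is a minimal sketch of picking the cheapest model that clears an accuracy bar on a labeled evaluation set. Everything in it is an assumption for illustration: the `Candidate` type, the `cheapest_adequate` helper, the relative costs, and the two stand-in predictors (plain Python functions, not real models) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Example = Tuple[str, str]  # (input text, expected label)


@dataclass
class Candidate:
    name: str
    cost_per_call: float            # hypothetical relative cost, not a real price
    predict: Callable[[str], str]   # stand-in for a call to an actual model


def accuracy(predict: Callable[[str], str], examples: Sequence[Example]) -> float:
    """Fraction of examples where the prediction matches the label."""
    if not examples:
        raise ValueError("evaluation set must not be empty")
    correct = sum(1 for text, label in examples if predict(text) == label)
    return correct / len(examples)


def cheapest_adequate(
    candidates: Sequence[Candidate],
    examples: Sequence[Example],
    min_accuracy: float,
) -> Optional[Tuple[Candidate, float]]:
    """Return the lowest-cost candidate meeting min_accuracy, or None."""
    scored = [(c, accuracy(c.predict, examples)) for c in candidates]
    adequate = [(c, acc) for c, acc in scored if acc >= min_accuracy]
    if not adequate:
        return None
    return min(adequate, key=lambda pair: pair[0].cost_per_call)


# Toy evaluation set for a bounded task (ticket routing); invented for the sketch.
eval_set = [
    ("refund my order", "billing"),
    ("app crashes on launch", "bug"),
    ("charged twice", "billing"),
    ("button does nothing", "bug"),
]


def small_model(text: str) -> str:
    # Stand-in for a small fine-tuned classifier: a keyword rule.
    return "billing" if any(w in text for w in ("refund", "charged")) else "bug"


_labels = dict(eval_set)


def large_model(text: str) -> str:
    # Stand-in for a large general model: here it simply looks up the answer.
    return _labels[text]


candidates = [
    Candidate("large-general", cost_per_call=30.0, predict=large_model),
    Candidate("small-finetuned", cost_per_call=1.0, predict=small_model),
]

best = cheapest_adequate(candidates, eval_set, min_accuracy=0.9)
```

The point of the design is that cost only breaks ties among models that already pass the accuracy threshold, so the expensive model is chosen only when the evaluation shows it is needed.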

Also, I don’t even like running Mistral if I can avoid it - a lot of tasks can be done with a fine-tuned BERT or DistilBERT. It takes more work, but my custom BERT models way outperform GPT-4 on bounded tasks because I have highly curated training data.

Within specialized domains, you just aren’t going to see GPT-4/5/6 performing on par with models trained on expert-curated data.
