1. 44za12+(OP) 2026-01-26 07:03:53
This is the way. I actually mapped out the decision tree for this exact process and more here:

https://github.com/NehmeAILabs/llm-sanity-checks

replies(2): >>homeon+aB >>andai+45m
2. homeon+aB 2026-01-26 12:45:24
>>44za12+(OP)
That's interesting. Is there a mapping from the decision tree to specific models somewhere?
replies(1): >>44za12+1H
3. 44za12+1H 2026-01-26 13:24:03
>>homeon+aB
Yes, I included a 'Model Selection Cheat Sheet' in the README (scroll down a bit).

I map them by task type:

- Tiny (<3B): Gemma 3 1B (could try 4B as well), Phi-4-mini (good for classification)
- Small (8B-17B): Qwen 3 8B, Llama 4 Scout (good for RAG/extraction)
- Frontier: GPT-5, Llama 4 Maverick, GLM, Kimi
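
Rendered as code, the cheat sheet is just a lookup table. A minimal Python sketch for illustration; the task-type keys, model identifier strings, and the pick_model helper are hypothetical names, not from the repo:

    # Illustrative only: the cheat sheet above as a lookup table.
    # Task-type keys and model identifiers are assumed names.
    CHEAT_SHEET = {
        "classification": ["gemma-3-1b", "phi-4-mini"],    # Tiny (<3B)
        "rag_extraction": ["qwen-3-8b", "llama-4-scout"],  # Small (8B-17B)
        "frontier": ["gpt-5", "llama-4-maverick", "glm", "kimi"],
    }

    def pick_model(task_type: str) -> str:
        # Unknown task types fall back to the frontier tier.
        return CHEAT_SHEET.get(task_type, CHEAT_SHEET["frontier"])[0]

    print(pick_model("classification"))  # gemma-3-1b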

Is that what you meant?

replies(1): >>hyuuu+vsh
4. hyuuu+vsh 2026-01-30 20:58:21
>>44za12+1H
At the risk of stating the obvious: do you have a tiny LLM gating this decision, classifying each task and routing it to the appropriate model?
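
For illustration, one shape such a gate could take, sketched in Python. The tiny_classify function stands in for whatever small model you actually run, and every name here is hypothetical:

    # Hypothetical gating pipeline: a tiny model labels the task first,
    # then the request goes to the cheapest tier mapped to that label.
    ROUTES = {
        "classification": "gemma-3-1b",   # tiny tier for cheap labeling work
        "rag_extraction": "qwen-3-8b",    # small tier for RAG/extraction
    }
    FRONTIER = "gpt-5"                    # fallback when nothing cheaper fits

    def tiny_classify(prompt: str) -> str:
        # Stand-in for an inference call to a <3B classifier; replace with
        # your real tiny-model call. It should return one of the task types.
        return "classification" if "classify" in prompt.lower() else "frontier"

    def route(prompt: str) -> str:
        return ROUTES.get(tiny_classify(prompt), FRONTIER)

    print(route("Classify this support ticket"))  # gemma-3-1b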
5. andai+45m 2026-02-01 17:05:40
>>44za12+(OP)
>Before you reach for a frontier model, ask yourself: does this actually need a trillion-parameter model?

>Most tasks don't. This repo helps you figure out which ones.

About a year ago I was testing Gemini 2.5 Pro and Gemini 2.5 Flash for agentic coding. I found they could both do the same task, but Gemini Pro was way slower and more expensive.

This blew my mind, because I'd previously been obsessed with the "best/smartest model", and I suddenly realized what I actually wanted was the "fastest/dumbest/cheapest model that can handle my task"!
