zlacker

8 comments
1. fnordp+(OP)[view] [source] 2023-11-19 01:34:19
If you’ve used the models for actual business problems, GPT4 and its successive revisions are way beyond Llama. They’re not comparable. I’m a huge fan of open models, but it’s just a different world of power. I’d note OpenAI has been working on GPT5 for some time as well, which I would expect to be a remarkable improvement incorporating much of the theoretical and technical advances of the last two years. Claude is the only actual competitor to GPT4, and even that is a “just barely relevant” situation.
replies(1): >>xigenc+C3
2. xigenc+C3[view] [source] 2023-11-19 01:58:56
>>fnordp+(OP)
Hm, it’s hard for me to say because most of my prompts would get me banned from OpenAI, but I’ve gotten great results for specific tasks using finetuned, quantized 30B models on my desktop and laptop. All things considered, it’s a better value for me, especially as I highly value openness and privacy.
replies(3): >>sebast+3i >>int_19+dA >>intend+LA
3. sebast+3i[view] [source] [discussion] 2023-11-19 03:30:07
>>xigenc+C3
What specs are needed to run those models on your local machine without crashing the system?
replies(3): >>xigenc+No >>mark_l+ot >>throwa+Qy
4. xigenc+No[view] [source] [discussion] 2023-11-19 04:16:07
>>sebast+3i
I use Faraday.dev on an RTX 3090, and smaller models on a 16GB M2 Mac, and I’m able to have deep, insightful conversations with a personal AI at my direction.

I find the outputs of LLMs to be quite organic when they are given unique identities, and especially when you explore, prune or direct their responses.

ChatGPT comes across like a really boring person who memorized Wikipedia, which is just sad. Previously, the Playground’s completions mode allowed using raw GPT, which let me unlock some different facets, but they’ve closed that down now.

And again, I don’t really need to feed my unique thoughts, opinions, or absurd chat scenarios into a global company trying to create AGI, or have them censor and filter for me. As an AI researcher, I want an uncensored model to play with, with no data leaving my network.

The uses of LLMs for information retrieval are great (Bing has improved a lot), but the much more interesting cases for me are how they are able to parse nuance, tone, and subtext - imagine a computer that can understand feelings and respond in kind. Empathetic computing, and it’s already here on my PC, unplugged from the Internet.
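
If anyone wants to try the persona angle, here’s a rough sketch using llama-cpp-python (this is not how Faraday works internally; the model path and the persona text are placeholder examples):

    # Minimal sketch: give a local model a "unique identity" via the
    # system prompt. Model path and persona are placeholders; nothing
    # here leaves the machine.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/mythomax-13b.Q4_K_M.gguf",  # placeholder path
        n_ctx=4096,
        n_gpu_layers=-1,  # offload all layers to the GPU (CUDA or Metal)
    )

    messages = [
        {"role": "system",
         "content": "You are Ada, a dry-witted retired archivist who "
                    "reads subtext and answers with warmth and brevity."},
        {"role": "user",
         "content": "How do you feel about running entirely offline?"},
    ]

    out = llm.create_chat_completion(messages=messages, max_tokens=256)
    print(out["choices"][0]["message"]["content"])

Exploring, pruning, or redirecting a conversation is then just a matter of editing the messages list and regenerating.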

replies(1): >>mark_l+2u
5. mark_l+ot[view] [source] [discussion] 2023-11-19 04:53:55
>>sebast+3i
Another data point: I can (barely) run a 30B 4-bit quantized model on a Mac Mini with 32GB of on-chip memory, but it runs slowly (a little less than 10 tokens/second).

13B and 7B models run easily and much faster.
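
For anyone curious what that looks like in practice, a minimal sketch with llama-cpp-python (the GGUF filename is just an example; pick a quant that actually fits in memory, since a 30B Q4_K_M file is roughly 18-20GB):

    # Rough sketch: run a 4-bit quantized GGUF model locally.
    # The filename is an example, not a specific recommendation.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/llama-30b.Q4_K_M.gguf",  # example path
        n_ctx=2048,          # context window
        n_gpu_layers=-1,     # offload all layers to Metal on Apple Silicon
    )

    out = llm("Q: Why does unified memory help with local LLMs? A:",
              max_tokens=128)
    print(out["choices"][0]["text"])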

6. mark_l+2u[view] [source] [discussion] 2023-11-19 04:57:56
>>xigenc+No
+1 Greg. I agree with most of what you say. Also, it is so much more fun running everything locally.
7. throwa+Qy[view] [source] [discussion] 2023-11-19 05:41:56
>>sebast+3i
check out https://www.reddit.com/r/LocalLLaMA/
8. int_19+dA[view] [source] [discussion] 2023-11-19 05:58:15
>>xigenc+C3
Even the best unquantized finetunes of llama2-70b are, at best, somewhat superior to GPT-3.5-turbo (and I'm not even sure they would beat the original GPT-3.5, which was smarter). They are not even close to GPT-4 on any task requiring serious reasoning or instruction following.
9. intend+LA[view] [source] [discussion] 2023-11-19 06:04:19
>>xigenc+C3
For an individual use case, Llama is fine. If you start getting into large workflows and need reliable outputs, GPT wins out substantially. I know all the papers and headlines about comparative performance, but that’s on benchmarks.

I’ve found that benchmarks are great as a hygiene test, but pointless when you need to get work done.
