zlacker

[parent] [thread] 4 comments
1. sebast+(OP)[view] [source] 2023-11-19 03:30:07
What specs are needed to run those models on your local machine without crashing the system?
replies(3): >>xigenc+K6 >>mark_l+lb >>throwa+Ng
2. xigenc+K6[view] [source] 2023-11-19 04:16:07
>>sebast+(OP)
I use Faraday.dev on an RTX 3090 and smaller models on a 16GB M2 Mac, and I’m able to have deep, insightful conversations with personal AI at my direction.

I find the outputs of LLMs to be quite organic when they are given unique identities, and especially when you explore, prune or direct their responses.

ChatGPT comes across like a really boring person who memorized Wikipedia, which is just sad. Previously the Playground completions allowed using raw GPT which let me unlock some different facets, but they’ve closed that down now.

And again, I don’t really need to feed my unique thoughts, opinions, or absurd chat scenarios into a global company trying to create AGI, or have them censor and filter for me. As an AI researcher, I want an uncensored model to play with, with no data leaving my network.

The uses of LLMs for information retrieval are great (Bing has improved a lot), but the much more interesting cases for me are how they parse nuance, tone, and subtext - imagine a computer that can understand feelings and respond in kind. Empathetic computing, and it’s already here on my PC, unplugged from the Internet.

replies(1): >>mark_l+Zb
3. mark_l+lb[view] [source] 2023-11-19 04:53:55
>>sebast+(OP)
Another data point: I can (barely) run a 30B 4-bit quantized model on a Mac Mini with 32GB of on-chip memory, but it runs slowly (a little less than 10 tokens/second).

13B and 7B models run easily and much faster.
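
For anyone wondering why 30B is about the ceiling at 32GB, here's a rough back-of-envelope sketch (Python; assumes ~4.5 effective bits per weight for a 4-bit quant such as Q4_K_M, and remember the KV cache and the OS need a few more GB on top):

    def approx_weight_gb(params_billion, bits_per_weight=4.5):
        # 4-bit formats store a bit more than 4 bits/weight once
        # scales and zero-points are counted; 4.5 is a common ballpark.
        return params_billion * 1e9 * bits_per_weight / 8 / 1e9

    for size in (7, 13, 30):
        print(f"{size}B -> ~{approx_weight_gb(size):.1f} GB of weights")
    # 7B  -> ~3.9 GB
    # 13B -> ~7.3 GB
    # 30B -> ~16.9 GB, which fits (barely) alongside the OS and KV cache in 32GB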

4. mark_l+Zb[view] [source] [discussion] 2023-11-19 04:57:56
>>xigenc+K6
+1 Greg. I agree with most of what you say. Also, it is so much more fun running everything locally.
5. throwa+Ng[view] [source] 2023-11-19 05:41:56
>>sebast+(OP)
check out https://www.reddit.com/r/LocalLLaMA/