zlacker

[parent] [thread] 32 comments
1. xrd+(OP)[view] [source] 2023-12-20 21:36:22
We've tried to sell variants of the open source models to our existing enterprise customers.

I think the adage about "a solution needs to be 10x other solutions to make someone switch" applies here.

Saying something performs slightly better than the industry standard offerings (OpenAI) means that OpenAI is going to laugh all the way to the bank. Everyone will just use their APIs over anything else.

I'm excited about the LLM space and I can barely keep up with the model names, much less all the techniques for fine tuning. A customer is going to have an even worse time.

No one will ever get fired for buying OpenAI (now that IBM is dead, and IBM is probably sad that Watson never made a dent).

I do use Mistral for all my personal projects but I'm not sure that is going to have the same effect on the industry as open source software did in the past.

replies(13): >>turnso+S >>esafak+01 >>buggle+G1 >>wavemo+S1 >>oceanp+82 >>moneyw+f2 >>kcorbi+d3 >>jdwyah+K3 >>Joeri+24 >>ren_en+k4 >>mmcwil+i5 >>xrd+ih >>zmmmmm+3I
2. turnso+S[view] [source] 2023-12-20 21:43:12
>>xrd+(OP)
There's a lot of truth to this, but I have seen clients get really interested in local models—mostly due to cost and/or confidentiality. For example, some healthcare clients will never upload medical records to OpenAI, regardless of the enterprise agreement.
3. esafak+01[view] [source] 2023-12-20 21:43:31
>>xrd+(OP)
OpenAI is nothing like IBM in its heyday. I bet a very healthy proportion of companies will not share their data with OpenAI. I saw some numbers on this a while back, but I don't have the link handy. Trust has to be earned.
4. buggle+G1[view] [source] 2023-12-20 21:47:41
>>xrd+(OP)
The problem here is that the platform offering is overly complicated to get started with and quite limited. 2,000 dataset entries for $50 a month, when I can do 10x that on Colab for free with axolotl or unsloth? Yeah, no thanks.
5. wavemo+S1[view] [source] 2023-12-20 21:48:28
>>xrd+(OP)
> I think the adage about "a solution needs to be 10x other solutions to make someone switch" applies here.

Cheaper and faster is also better. The cheapest version of GPT-4 costs $0.01/$0.03 per 1K input/output tokens [1]. Mistral AI is charging 0.14€/0.42€ per ONE MILLION input/output tokens for their 7B model [2]. It's night and day.

If people can start fine-tuning a 7B model to do the same work they were doing with GPT-4, they will 100% switch.

[1]: https://help.openai.com/en/articles/7127956-how-much-does-gp...

[2]: https://docs.mistral.ai/platform/pricing/
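To make that gap concrete, a quick back-of-the-envelope comparison for processing 1M input and 1M output tokens (prices as quoted above; the EUR-to-USD rate of 1.10 is an assumption):

```python
# Rough cost of 1M input + 1M output tokens at the quoted prices.
# The EUR -> USD rate of 1.10 is an assumption.
EUR_USD = 1.10

# GPT-4 Turbo: $0.01 / $0.03 per 1K input/output tokens
gpt4_cost = (1_000_000 / 1_000) * 0.01 + (1_000_000 / 1_000) * 0.03

# Mistral 7B endpoint: 0.14 EUR / 0.42 EUR per 1M input/output tokens
mistral_cost = (0.14 + 0.42) * EUR_USD

print(f"GPT-4: ${gpt4_cost:.2f}, Mistral 7B: ${mistral_cost:.2f}, "
      f"ratio: {gpt4_cost / mistral_cost:.0f}x")
```

At these list prices the ratio comes out around 65x, well past any "10x" bar.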

6. oceanp+82[view] [source] 2023-12-20 21:49:57
>>xrd+(OP)
> I think the adage about "a solution needs to be 10x other solutions to make someone switch" applies here.

It's already superior to OpenAI because it doesn't require an API. You can run the model on your own hardware, in your own datacenter, and your data is guaranteed to remain confidential. Creating a one-off fine-tune is a different story than permanently joining your company at the hip to OpenAI.

I know in our bubble, in the era of Cloud, it's easy to send confidential company data to some random API on the Internet and not worry about it, but that's absolutely not the case for anyone in Healthcare, Government, or even normal companies that are security conscious. For them, OpenAI was never a valid consideration in the first place.

replies(2): >>moneyw+m2 >>wenc+PU
7. moneyw+f2[view] [source] 2023-12-20 21:50:39
>>xrd+(OP)
how are you using it for your project?
8. moneyw+m2[view] [source] [discussion] 2023-12-20 21:51:13
>>oceanp+82
what is the most prominent use case for private LLMs, doctor notes?
replies(7): >>miohta+H2 >>noitpm+L2 >>sergio+Y2 >>bbor+93 >>mrinte+e5 >>fo76yo+S6 >>potato+fw
9. miohta+H2[view] [source] [discussion] 2023-12-20 21:53:11
>>moneyw+m2
Anything business-related at medium and large enterprises, plus government.
10. noitpm+L2[view] [source] [discussion] 2023-12-20 21:53:21
>>moneyw+m2
Definitely healthcare, or for certain industries (HFT/Finance/...) where for various reasons _everything_ must be run on prem.
replies(1): >>Foobar+bV
11. sergio+Y2[view] [source] [discussion] 2023-12-20 21:54:03
>>moneyw+m2
You could use it to query against any kind of B2B customer information and provide insight, citations and context without any of the data leaving your private server.

When building something similar powered by OpenAI, anonymizing the data and then de-anonymizing the answers before showing them to the customer was a real pain in the ass.

Also, in my case I'm sure using a string like "Pineapple Cave Inc." instead of the real business name hurt the model's ability to contextualize the information, and that degraded the answers somewhat -- right?

12. bbor+93[view] [source] [discussion] 2023-12-20 21:55:07
>>moneyw+m2
Great answers above, but long term: personal assistants. I truly think that's a privacy line people won't cross, even after seeing Alexa and Google Maps enter our lives; I think people would rather have nothing than a robot that knows every detail of their health, schedule, feelings, plans, etc. on some vaguely defined server somewhere.
replies(1): >>tomdun+N4
13. kcorbi+d3[view] [source] 2023-12-20 21:55:33
>>xrd+(OP)
Hey, I'm the post author. This is a totally fair point! I do think though that depending on your specific requirements open-source models can be a 10x+ improvement. For example, we serve Mistral 7B for less than 1/10th the cost of GPT-4-Turbo, which is the model most of our users are comparing us to.
replies(2): >>xrd+ac >>MacsHe+L51
14. jdwyah+K3[view] [source] 2023-12-20 21:58:37
>>xrd+(OP)
The real thing is the switching cost. Sure, we start with OpenAI. But at some hackathon in 9 months somebody will try Mistral, and if it saves real money and still works, it feels like an easy swap.
15. Joeri+24[view] [source] 2023-12-20 21:59:47
>>xrd+(OP)
Actually, I think Microsoft is going to laugh all the way to the bank, because most enterprises will probably use the Azure OpenAI Service instead of buying OpenAI's offerings directly.
16. ren_en+k4[view] [source] 2023-12-20 22:01:51
>>xrd+(OP)
All they need is an API-compatible client library, so there is no actual switching cost between models other than configuration. There's a reason OpenAI is adding all sorts of add-on features like assistants and file upload: they know the models themselves are going to become a commodity, and they need something to lock developers into their platform.
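As a sketch of how thin that switching cost already is: several local servers (vLLM, llama.cpp's server, etc.) expose OpenAI-compatible endpoints, so in principle only configuration changes. The endpoint URLs and model names below are illustrative assumptions:

```python
# Because the wire API is (near-)identical across providers, swapping
# models reduces to swapping configuration. Names are illustrative.
PROVIDERS = {
    "openai":  {"base_url": "https://api.openai.com/v1",  "model": "gpt-4-turbo"},
    "mistral": {"base_url": "https://api.mistral.ai/v1",  "model": "mistral-tiny"},
    "local":   {"base_url": "http://localhost:8000/v1",   "model": "mistral-7b-instruct"},
}

def client_config(provider: str) -> dict:
    """Return the only things that change when switching providers."""
    cfg = PROVIDERS[provider]
    return {"base_url": cfg["base_url"], "model": cfg["model"]}
```

The official `openai` Python client (v1+) accepts a `base_url` argument, so a config dict like this is all the "migration" amounts to.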
replies(1): >>visarg+lR
17. tomdun+N4[view] [source] [discussion] 2023-12-20 22:04:44
>>bbor+93
Doesn’t Google already have that information from your searches, emails, calendar, etc.? Obviously you have to trust that they don’t misuse it, but to me it’s basically the same thing as a personal assistant having it.
replies(2): >>bbor+F5 >>samus+Z68
18. mrinte+e5[view] [source] [discussion] 2023-12-20 22:08:08
>>moneyw+m2
Proprietary and sensitive information. Personally, I use a self-hosted LLM because I don't trust how my conversations with hosted generative AI services will be used.
replies(1): >>aussie+Oh
19. mmcwil+i5[view] [source] 2023-12-20 22:08:54
>>xrd+(OP)
I think at this point the "10x other solutions" threshold should be measured in cost. If I can process, in perpetuity, hundreds of millions of tokens for what OpenAI charges to process tens of millions of tokens once, that is already past the threshold.
20. bbor+F5[view] [source] [discussion] 2023-12-20 22:10:53
>>tomdun+N4
Yeah, but I think this is less of a technical line than an emotional one.

For example: I wanted my personal assistant to track hygiene, which is a natural use case. But then you arrive at the natural conclusion that either a) the user needs to enter the data themselves (“I brushed my teeth and washed my face and took X medications at Y time”), or b) you need some sort of sensor in the bathroom, ranging from mics or radio sensors up to a tasteful camera. And a million subtle versions of (b) is where I see people going “no, that’s weird, it’s too much info all together”

21. fo76yo+S6[view] [source] [discussion] 2023-12-20 22:18:14
>>moneyw+m2
Personalized metaspaces, game worlds, content without paying a rent-seeking copyright holder.

Education and research without gatekeepers in academia and industry complaining about their book sales or prestige titles being obsoleted.

A whole lot of use cases that break us out of having to kowtow to experts who were merely born before us and are trying to monopolize the exploration of science and technology.

To that end I’m working on a GPU-accelerated client backed by local AI, with NeRFs and Gaussian splatting built in.

The upside to being an EE with an MSc in math: most of my money comes from engineering real things. I don’t have skin in the cloud CRUD app/API game and don’t see a reason to spend money propping up middlemen who, given my skills and abilities, don’t add value.

Programmers can go explore syntax art in their parents’ basement again. I’m tired of 1970s semantics and everyone with a DSL thinking it’s the best thing to happen to computing as a field of inquiry.

Like all industries, big tech is monopolized by aging rent seekers. Disrupting by divesting from it is my play now.

replies(1): >>refulg+Fg
22. xrd+ac[view] [source] [discussion] 2023-12-20 22:50:45
>>kcorbi+d3
This is the 10x I was looking for. Great post by the way!
23. refulg+Fg[view] [source] [discussion] 2023-12-20 23:24:07
>>fo76yo+S6
This translates to "right now, porn" and aspirations. (n.b. NeRFs that can be rendered client-side take O(days) to train with multiple A100s.)
replies(1): >>fo76yo+Um
24. xrd+ih[view] [source] 2023-12-20 23:28:37
>>xrd+(OP)
I've read all the comments here, many of which contradict my points. I used to agree with those ideas, but then I tried to sell LLMs to customers. My takeaway is that customers will pretend they care about privacy and the need for on-prem installations, but, at least right now, they will go with a vendor that tells them their data is protected and never investigate whether that's true.

Zoom got away with it, still does, and no one got fired for using Zoom.

I'm happy to have a debate with someone who has successfully sold those ideas to a customer, but I'm skeptical until then.

25. aussie+Oh[view] [source] [discussion] 2023-12-20 23:31:39
>>mrinte+e5
This. I also use open source self hosted LLMs for exactly this reason.

Sure, I use OpenAI APIs for certain heavy lifting tasks that don't involve sensitive information, but for anything sensitive it's self hosted LLMs all the way.

26. fo76yo+Um[view] [source] [discussion] 2023-12-21 00:05:43
>>refulg+Fg
I forgot re-creation/preservation of existing content I paid for: translating footage into physics, color, and geometry models, then mapping them to my client render pipeline. Level 1-1 of New Super Mario Bros is pretty much completely translated. No copyright problems if I don’t distribute it :)

Like I said, most of my money is WFH design of branded gadgets. I'm not really the sort to care about the reach of others; if the content industry collapses because people don’t need to spend money on it, meh. I'm more interested in advancing computing. Pour money into R&D of organic computers rather than web apps running on the same old gear with more HP under the hood. Yawn.

I want bioengineered kaiju sized dogs and drug glands that stoke hallucination I’m on another planet.

Humanity is a generational cup and string. Time to snip the 1900s loose.

27. potato+fw[view] [source] [discussion] 2023-12-21 01:19:01
>>moneyw+m2
Nope, they're using GPT for those

https://blogs.microsoft.com/blog/2023/08/22/microsoft-and-ep...

28. zmmmmm+3I[view] [source] 2023-12-21 03:30:35
>>xrd+(OP)
I have the opposite problem.

We are besieged by vendors promising the earth from their amazing AI tools, and when we peel back one surface layer they are just shoving things wholesale into GPT-4. When I ask "can we please deploy this on a local model" they run off scared. I can't get any vendor to give us anything except OpenAI.

29. visarg+lR[view] [source] [discussion] 2023-12-21 05:23:32
>>ren_en+k4
Code execution and RAG are not going to lock people in. They are 1000x easier to replicate than the model, which, as you say, is already becoming a commodity.

My pet theory is that OpenAI is cooking up high-quality user data by empowering GPT with all these toys plus a human in the loop. The purpose is to use this data as a sort of continual evaluation, sifting for weak points and enhancing their fine-tuning datasets.

Every human response can carry a positive or negative connotation, and the model can use that as a reward signal. They claim 100M users; at, say, 10K tokens per user per month, that's 1T synthetic tokens a month. In a whole year they generate about as much text as the original ~13T-token dataset. And we know that LLMs can benefit a lot from synthetic data when it is filtered/engineered for quality.

So I think OpenAI's moat is the data they generate.
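Checking that back-of-the-envelope token math (the user count is OpenAI's claim; the per-user token rate is the parent's assumption):

```python
users = 100_000_000               # claimed user count
tokens_per_user_month = 10_000    # assumed per-user monthly usage

monthly = users * tokens_per_user_month  # 1e12 = 1T tokens/month
yearly = monthly * 12                    # 1.2e13, i.e. ~12T tokens/year
# ~12T/year is indeed in the same ballpark as a ~13T-token training corpus.
```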

30. wenc+PU[view] [source] [discussion] 2023-12-21 06:03:14
>>oceanp+82
> It's already superior to OpenAI because it doesn't require an API.

The quality is not superior to OpenAI's, however. I run Mistral 7B in LM Studio, and I can't get far before it starts giving me wrong answers.

GPT-4, on the other hand, is correct most of the time (and knows when to trigger Python code evaluation or RAG to answer questions). That makes it useful.

31. Foobar+bV[view] [source] [discussion] 2023-12-21 06:07:07
>>noitpm+L2
As long as you meet your regulatory requirements, that's incorrect.
32. MacsHe+L51[view] [source] [discussion] 2023-12-21 08:18:29
>>kcorbi+d3
I serve ~300 tok/s of Mistral 7B for $0.60/hr by renting a cloud 3090. That's a lot cheaper than GPT-4-Turbo, though the quality is closer to GPT-3.5.

Mixtral 8x7B is closer to GPT-4 quality, though, and only 2x the compute requirement of Mistral 7B.
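Working out what that throughput implies per token (using the figures quoted above):

```python
tokens_per_sec = 300   # observed Mistral 7B throughput on a rented 3090
cost_per_hour = 0.60   # rental price in USD

tokens_per_hour = tokens_per_sec * 3600                    # 1,080,000 tokens
cost_per_million = cost_per_hour / (tokens_per_hour / 1_000_000)
print(f"${cost_per_million:.2f} per 1M tokens")            # roughly $0.56
```

That is orders of magnitude below GPT-4-Turbo's per-token list price, assuming the GPU stays saturated.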

33. samus+Z68[view] [source] [discussion] 2023-12-23 15:58:17
>>tomdun+N4
It's not about Google anymore; that ship has sailed for most people by now. It's about giving all this data to yet another company. Also, it's not the same data at all.

Some data might never travel through a Google account, but very well might through ChatGPT.

If you're processing another person's personal data, then you don't really have a choice in the matter: either gain their permission to transfer the data to a third party, or self-host the model.
