zlacker

1. ren_en+ (OP) 2023-12-20 22:01:51
All they need is an API-compatible client library, so there is no actual switching cost between models other than configuration. There's a reason OpenAI is adding all sorts of add-on features like Assistants and file upload: they know the models themselves are going to become a commodity, and they need something to lock developers onto their platform.
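(To make that concrete, here is a rough sketch of what near-zero switching cost looks like with an OpenAI-compatible client; the second provider's base_url and model name are made-up placeholders, not real endpoints.)

    # Switching providers behind an OpenAI-compatible API is just configuration.
    from openai import OpenAI

    PROVIDERS = {
        "openai":    {"base_url": "https://api.openai.com/v1",  "model": "gpt-4"},
        "other-lab": {"base_url": "https://api.example.com/v1", "model": "some-model"},  # hypothetical
    }

    cfg = PROVIDERS["other-lab"]   # change this key; nothing else in the calling code changes
    client = OpenAI(base_url=cfg["base_url"], api_key="...")
    resp = client.chat.completions.create(
        model=cfg["model"],
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)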
replies(1): >>visarg+1N
2. visarg+1N 2023-12-21 05:23:32
>>ren_en+(OP)
Code execution and RAG are not going to lock people in. They are 1000x easier to replicate than the model, which, as you say, is already becoming a commodity.
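(A toy sketch of the retrieval side, just to show how little plumbing there is to replicate; the bag-of-words hashing "embedding" below is a stand-in for a real embedding model.)

    # Minimal RAG loop: embed documents, retrieve the closest ones, stuff them into the prompt.
    import numpy as np

    def embed(text, dim=256):
        # throwaway bag-of-words hashing, not a real embedding model
        v = np.zeros(dim)
        for word in text.lower().split():
            v[hash(word) % dim] += 1.0
        n = np.linalg.norm(v)
        return v / n if n else v

    docs = ["The moon orbits the earth.", "GPUs accelerate matrix multiplication."]
    doc_vecs = np.stack([embed(d) for d in docs])

    def retrieve(query, k=1):
        scores = doc_vecs @ embed(query)            # cosine similarity, since vectors are normalized
        return [docs[i] for i in np.argsort(scores)[::-1][:k]]

    question = "What do GPUs speed up?"
    context = "\n".join(retrieve(question))
    prompt = f"Context:\n{context}\n\nQuestion: {question}"   # this is what would be sent to the model
    print(prompt)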

My pet theory is that OpenAI is cooking high-quality user data by empowering GPT with all these toys plus a human in the loop. The purpose is to use this data as a sort of continual evaluation: sifting for weak points and enhancing their fine-tuning datasets.

Every human response can carry a positive or negative connotation, and the model can use that as a reward signal. They claimed to have 100M users; at, say, 10K tokens per user per month, that's about 1T synthetic tokens a month. Over a whole year they generate roughly as much text as the original training dataset, ~13T tokens. And we know that LLMs can benefit a lot from synthetic data when it is filtered/engineered for quality.
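(Back-of-the-envelope version of that math; the 10K tokens per user per month is the assumed figure, everything else follows from it.)

    # Rough token-volume estimate using the numbers above
    users = 100e6                        # claimed ~100M users
    tokens_per_user_per_month = 10e3     # assumption
    monthly = users * tokens_per_user_per_month   # 1e12  -> ~1T tokens/month
    yearly = monthly * 12                          # 1.2e13 -> ~12T tokens/year, on the order of the ~13T original corpus
    print(f"{monthly:.1e} tokens/month, {yearly:.1e} tokens/year")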

So I think OpenAI's moat is the data they generate.
