zlacker

[parent] [thread] 8 comments
1. buffer+(OP)[view] [source] 2023-11-20 05:45:35
If they don't get Altman back, Altman starts a new company AkshuallyOpenAI and most talent moves there. They quickly get the funding and the contracts with MS. OpenAI is left in the dust.
replies(3): >>azurez+G >>selcuk+K >>kumarv+31
2. azurez+G[view] [source] 2023-11-20 05:50:00
>>buffer+(OP)
You know, we are dealing with people, and people think differently, so they won't all move together. Moving to a new company under Altman is not an easy choice; at a minimum, the new company:

- Does not yet have a big model (needs $$$ and months to train, even if the code is ready)

- Does not have proprietary code OpenAI has right now

- Does not have labeled data ($$$ and time) or ChatGPT logs

- Does not have the ChatGPT brand...

replies(1): >>buffer+k2
3. selcuk+K[view] [source] 2023-11-20 05:50:29
>>buffer+(OP)
> They quickly get the funding and the contracts with MS. OpenAI is left in the dust.

I haven't checked but I'm pretty sure OpenAI has many patents in the field and they won't be willing to share them with another company, especially with AkshuallyOpenAI.

4. kumarv+31[view] [source] 2023-11-20 05:51:55
>>buffer+(OP)
Yes, but training data, model development and other things require a lot of time and investment, and OpenAI has a huge head start.

It remains to be seen whether investors will pour another few billion dollars in just to catch up to OpenAI, which by that time would have evolved even further.

There is a ray of hope: as often happens in this field, old things quickly become obsolete and new things become the cutting edge, so Sam Altman may be able to convince investors to back the cutting edge with him. Investors would then have a choice on an almost level field, between people, companies and personalities, for a given outcome.

replies(1): >>alsodu+R6
5. buffer+k2[view] [source] [discussion] 2023-11-20 05:59:46
>>azurez+G
I thought GPT-4 was not trained on labeled data, but simply on a large volume of text and code. Most of it is publicly accessible: Wikipedia, archives of scientific articles, books, GitHub, plus probably purchased data from text-heavy sites like Reddit.
replies(3): >>enigmu+q3 >>lyu072+W4 >>frabcu+L5
6. enigmu+q3[view] [source] [discussion] 2023-11-20 06:06:29
>>buffer+k2
Assuming it's a reference to RLHF? Not sure
7. lyu072+W4[view] [source] [discussion] 2023-11-20 06:14:00
>>buffer+k2
No, it's reinforcement learning from human feedback (RLHF), which involves lots of labeling.
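
For anyone unfamiliar, a rough sketch of the kind of labeled data that involves (illustrative only, not OpenAI's actual pipeline): annotators pick which of two model responses is better, and a reward model is fit to those comparisons with a Bradley-Terry style pairwise loss. A toy version in Python:

    import math

    # One record of human preference data: a prompt plus the response the
    # annotator preferred ("chosen") and the one they rejected.
    preference_data = [
        {"prompt": "Explain photosynthesis briefly.",
         "chosen": "Plants turn sunlight, water and CO2 into sugar and oxygen.",
         "rejected": "Photosynthesis is when plants eat dirt."},
    ]

    def reward(text: str) -> float:
        # Stand-in for a learned reward model; here just a word-count heuristic.
        return math.log(1 + len(text.split()))

    def pairwise_loss(chosen: str, rejected: str) -> float:
        # Bradley-Terry loss: -log(sigmoid(r_chosen - r_rejected)).
        diff = reward(chosen) - reward(rejected)
        return -math.log(1.0 / (1.0 + math.exp(-diff)))

    for ex in preference_data:
        print(round(pairwise_loss(ex["chosen"], ex["rejected"]), 3))

Collecting enough of those comparisons at scale is where the labeling cost comes in.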
8. frabcu+L5[view] [source] [discussion] 2023-11-20 06:19:17
>>buffer+k2
Whatever they've built this year presumably uses all the positive/negative feedback on ChatGPT; they have a year's worth of that data now...

Another example is the Be My Eyes data: presumably the vision part of GPT-4 was trained on the archive of data the blind-assistance app has, and that could be an exclusive deal with OpenAI.

9. alsodu+R6[view] [source] [discussion] 2023-11-20 06:26:31
>>kumarv+31
You are definitely overestimating how much time and effort it takes to build large models.

Sam will get billions of dollars if he starts a new company, so money is not an issue. In terms of data and training models, look at Anthropic: they trained a reasonable model. Heck, look at Mistral, a bunch of ex-Meta folks and their LLaMA team lead who spun up good models in months.

The only bottleneck I can think of is RLHF data, but given enough money, that's not an issue either.
