zlacker

I wonder how this will all workout in the end (and the excitement around all of this is a little reminiscent of AOL bying Time Warner).

For one, I'm not sure Sam Altman will tolerate MS bureaucracy for very long.

But secondly, the new MS-AI entity can't presumably just take from OpenAI what they did there, they need to make it again.

This takes a lot of resources (that MS has) but also a lot of time to provide feedback to the models; also, copyright issues regarding source materials are more sensitive today, and people are more attuned to them: Microsoft will have a harder time playing fast and lose with that today, than OpenAI 8 years ago.

Or, Sam at MS becomes OpenAI biggest customer? But in that case, what are all those researchers and top scientists that followed him there, going to do?

Interesting times in any case.

replies(2): >>Michae+v5 >>sander+Of

>>bambax+(OP)
I think you overestimate the technical part. Just speculating (no inside, no expert), but I would assume that the models are pretty "easy" and can be coded in few days. There are for sure some tweaks to the standard transformer architecture, but guess the tweaks are well known to sam and co.

The dataset is more challenging, but here msft can help - since they have bing and github as well. So they might be able to make few shortcuts here.

The most time consuming part is compute, but here again msft has the compute.

Will they beat chat-gpt 4 in a year? Guess no. But they will come very close to it and maybe it would not matter that much if you focus on the product.

replies(1): >>duhast+t9

>>Michae+v5
You lost me at "can be coded in few days".

replies(1): >>Michae+bX

>>bambax+(OP)
Altman reporting to Nadella is certainly going to be a fascinating political struggle!

Part of me thinks that Nadella, having already demonstrated his mastery over all his competitor CEOs with one deft move after another over the past few years, took this on because he needed a new challenge.

I'd wager Altman will either get sidelined and pushed out, or become Nadella's successor, over the course of the next decade or so.

It's an interesting time!

>>duhast+t9
Haha, agree, it would take longer for sure.

What I meant is, most likely assuming that you are using pytorch / jax you could code down the model pretty fast. Just compare it to llama, sure it is far behind, but the llama model is under 1000 lines of code and pretty good.

There is tons of work, for the training, infra, preparing the data and so on. That would result guess in millions lines of code. But the core ideas and the model are likely thin I would argue. So that is my point.