- Board is mostly independent and those independent dont have equity
- They talk about not being candid - this is legalese for “lying”
The only major thing that could warrant something like this is Sam going behind the boards back to make a decision (or make progress on a decision) that is misaligned with the Charter. Thats the only fireable offense that warrants this language.
My bet: Sam initiated some commercial agreement (like a sale) to an entity that would have violated the “open” nature of the company. Likely he pursued a sale to Microsoft without the board knowing.
Desperate times calls for desperate measures. This is a swift way for OpenAI to shield the business from something which is a PR disaster, probably something which would make Sam persona non grata in any business context.
It was abundantly obvious how he was using weasel language like "I'm very 'nervous' and a 'little bit scared' about what we've created [at OpenAI]" and other such BS. We know he was after "moat" and "regulatory capture", which we know where it all leads to — a net [long-term] loss for the society.
[1] >>35960125
Thank you. I don't see this expressed enough.
A true idealist would be committed to working on open models. Anyone who thinks Sam was in it for the good of humanity is falling for the same "I'm-rich-but-I-care" schtick pulled off by Elon, SBF, and others.
There is a perfectly sound idealistic argument for not publishing weights, and indeed most in the x-risk community take this position.
The basic idea is that AI is the opposite of software; if you publish a model with scary capabilities you can’t undo that action. Whereas with FOSS software, more eyes mean more bugs found and then everyone upgrades to a more secure version.
If OpenAI publishes GPT-5 weights, and later it turns out that a certain prompt structure unlocks capability gains to mis-aligned AGI, you can’t put that genie back in the bottle.
And indeed if you listen to Sam talk (eg on Lex’s podcast) this is the reasoning he uses.
Sure, plenty of reasons this could be a smokescreen, but wanted to push back on the idea that the position itself is somehow not compatible with idealism.
I'm not fully convinced, though...
> if you publish a model with scary capabilities you can’t undo that action.
This is true of conventional software, too! I can picture a politician or businessman from the 80s insisting that operating systems, compilers, and drivers should remain closed source because, in the wrong hands, they could be used to wreak havoc on national security. And they would be right about the second half of that! It's just that security-by-obscurity is never a solution. The bad guys will always get their hands on the tools, so the best thing to do is to give the tools to everyone and trust that there are more good guys than bad guys.
Now, I know AGI is different than convnetional software (I'm not convinced it's the "opposite", though). I accept that giving everyone access to weights may be worse than keeping them closed until they are well-aligned (whenever that is). But that would go against every instinct I have, so I'm inclined to believe that open is better :)
All that said, I think I would have less of an issue if it didn't seem like they were commandeering the term "open" from the volunteers and idealists in the FOSS world who popularized it. If a company called, idk, VirtuousAI wanted to keep their weights secret, OK. But OpenAI? Come on.
I take a less cynical view on the name; they were committed to open source in the beginning, and did open up their models IIUC. Then they realized the above, and changed path. At the same time, realizing they needed huge GPU clusters, and being purely non-profit would not enable that. Again I see why it rubs folks the wrong way, more so on this point.
But yes, if you just consider the mundane benefits and harms of AI, it looks a lot like crypto; it both benefits our economy and can be weaponized, including by our adversaries.
Anyway, the point is, obfuscation doesn't work to keep scary technology away.
And I'm not sure this allegedly-bagged cat has claws either - the current crop of LLMs are still clearly in a different category to "intelligence". It's pretty easy to see their limitations, and behave more like the fancy text predictors they are rather than something that can truly extrapolate, which is required for even the start of some AI sci-fi movie plot. Maybe continued development and research along that path will lead to more capabilities, but we're certainly not there yet, and I'd suspect not particularly close.
Maybe they actually have some super secret internal stuff that fixes those flaws, and are working on making sure it's safe before releasing it. And maybe I have a dragon in my garage.
I generally feel hyperbolic language about such things to be damaging, as it makes it so easy to roll your eyes about something that's clearly false, and that can get inertia to when things develop to where things may actually need to be considered. LLMs are clearly not currently an "existential threat", and the biggest advantage to keeping it closed appears to be financial benefits in a competitive market. So it looks like a duck and quacks like a duck, but don't you understand I'm protecting you from this evil fire breathing dragon for your own good!
It smells of some fantasy gnostic tech wizard, where only those who are smart enough to figure out the spell themselves are truly smart enough to know how to use it responsibly. And who doesn't want to think of themselves as smart? But that doesn't seem to match similar things in the real world - like the Manhattan project - many of the people developing it were rather gung-ho with proposals for various uses, and even if some publicly said it was possibly a mistake post-fact, they still did it. Meaning their "smarts" on how to use it came too late.
And as you pointed out, nuclear weapon control by limiting information has already failed. If north Korea can develop them, one of the least connected nations in the world, surely anyone with the required resources can. The only limit today seems the cost to nations, and how relatively obvious the large infrastructure around it seems to be, allowing international pressure before things get into to the "stockpiling usable weapons" stage.
I think timelines are important here; for example in 2015 there was no such thing as Transformers, and while there were AGI x-risk folks (e.g. MIRI) they were generally considered to be quite kooky. I think AGI was very credibly "cat in the bag" at this time; it doesn't happen without 1000s of man-years of focused R&D that only a few companies can even move the frontier on.
I don't think the claim should be "we could have prevented LLMs from ever being invented", just that we can perhaps delay it long enough to be safe(r). To bring it back to the original thread, Sam Altman's explicit position is that in the matrix of "slow vs fast takeoff" vs. "starting sooner vs. later", a slow takeoff starting sooner is the safest choice. The reasoning being, you would prefer a slow takeoff starting later, but the thing that is most likely to kill everyone is a fast takeoff, and if you try for a slow takeoff later, you might end up with a capability overhang and accidentally get a fast takeoff later. As we can see, it takes society (and government) years to catch up to what is going on, so we don't want anything to happen quicker than we can react to.
A great example of this overhang dynamic would be Transformers circa 2018 -- Google was working on LLMs internally, but didn't know how to use them to their full capability. With GPT (and particularly after Stable Diffusion and LLaMA) we saw a massive explosion in capability-per-compute for AI as the broader community optimized both prompting techniques (e.g. "think step by step", Chain of Thought) and underlying algorithmic/architectural approaches.
At this time it seems to me that widely releasing LLMs has both i) caused a big capability overhang to be harvested, preventing it from contributing to a fast takeoff later, and ii) caused OOMs more resources to be invested in pushing the capability frontier, making the takeoff trajectory overall faster. Both of those likely would not have happened for at least a couple years if OpenAI didn't release ChatGPT when they did. It's hard for me to calculate whether on net this brings dangerous capability levels closer, but I think there's a good argument that it makes the timeline much more predictable (we're now capped by global GPU production), and therefore reduces tail-risk of the "accidental unaligned AGI in Google's datacenter that can grab lots more compute from other datacenters" type of scenario (aka "foom").
> LLMs are clearly not currently an "existential threat"
Nobody is claiming (at least, nobody credible in the x-risk community is claiming) that GPT-4 is an existential threat. The claim is, looking at the trajectory, and predicting where we'll be in 5-10 years; GPT-10 could be very scary, so we should make sure we're prepared for it -- and slow down now if we think we don't have time to build GPT-10 safely on our current trajectory. Every exponential curve flattens into an S-curve eventually, but I don't see a particular reason to posit that this one will be exhausted before human-level intelligence, quite the opposite. And if we don't solve fundamental problems like prompt-hijacking and figure out how to actually durably convey our values to an AI, it could be very bad news when we eventually build a system that is smarter than us.
While Eliezer Yudkowsky takes the maximally-pessimistic stance that AGI is by default ruinous unless we solve alignment, there are plenty of people who take a more epistemically humble position that we simply cannot know how it'll go. I view it as a coin toss as to whether an AGI directly descended from ChatGPT would stay aligned to our interests. Some view it as Russian roulette. But the point being, would you play Russian roulette with all of humanity? Or wait until you can be sure the risk is lower?
I think it's plausible that with a bit more research we can crack Mechanistic Interpretability and get to a point where, for example, we can quantify to what extent an AI is deceiving us (ChatGPT already does this in some situations), and to what extent it is actually using reasoning that maps to our values, vs. alien logic that does not preserve things humanity cares about when you give it power.
> nuclear weapon control by limiting information has already failed.
In some sense yes, but also, note that for almost 80 years we have prevented _most_ countries from learning this tech. Russia developed it on their own, and some countries were granted tech transfers or used espionage. But for the rest of the world, the cat is still in the bag. I think you can make a good analogy here: if there is an arms race, then superpowers will build the technology to maintain their balance of power. If everybody agrees not to build it, then perhaps there won't be a race. (I'm extremely pessimistic for this level of coordination though.)
Even with the dramatic geopolitical power granted by possessing nuclear weapons, we have managed to pursue a "security through obscurity" regime, and it has worked to prevent further spread of nuclear weapons. This is why I find the software-centric "security by obscurity never works" stance to be myopic. It is usually true in the software security domain, but it's not some universal law.