zlacker

[parent] [thread] 9 comments
1. kdmcco+(OP)[view] [source] 2023-11-18 01:44:46
I appreciate your take. I didn't know that was his stated reasoning, so that's good to know.

I'm not fully convinced, though...

> if you publish a model with scary capabilities you can’t undo that action.

This is true of conventional software, too! I can picture a politician or businessman from the 80s insisting that operating systems, compilers, and drivers should remain closed source because, in the wrong hands, they could be used to wreak havoc on national security. And they would be right about the second half of that! It's just that security-by-obscurity is never a solution. The bad guys will always get their hands on the tools, so the best thing to do is to give the tools to everyone and trust that there are more good guys than bad guys.

Now, I know AGI is different from conventional software (I'm not convinced it's the "opposite", though). I accept that giving everyone access to weights may be worse than keeping them closed until they are well-aligned (whenever that is). But that would go against every instinct I have, so I'm inclined to believe that open is better :)

All that said, I think I would have less of an issue if it didn't seem like they were commandeering the term "open" from the volunteers and idealists in the FOSS world who popularized it. If a company called, idk, VirtuousAI wanted to keep their weights secret, OK. But OpenAI? Come on.

replies(1): >>thepti+Ad
2. thepti+Ad[view] [source] 2023-11-18 03:25:06
>>kdmcco+(OP)
The analogy would be publishing designs for nuclear weapons, or a bioweapon; capabilities that are effectively impossible for adversaries to obtain on their own are treated very differently from vulns that a motivated teenager can find. To be clear, we are talking about (hypothetical) civilization-ending risks, which I don't think software has ever credibly posed.

I take a less cynical view of the name; they were committed to open source in the beginning, and did open up their models IIUC. Then they realized the above and changed path. At the same time, they realized they needed huge GPU clusters, and staying purely non-profit would not enable that. Again, I see why it rubs folks the wrong way, even more so on this point.

replies(2): >>kimixa+Ir >>JohnFe+eu1
◧◩
3. kimixa+Ir[view] [source] [discussion] 2023-11-18 05:09:40
>>thepti+Ad
Another analogy would be cryptographic software - it was classed as a munition, and people said similar things about the danger of it getting out to "The Bad Guys".
replies(2): >>thepti+yy >>oooyay+rA
◧◩◪
4. thepti+yy[view] [source] [discussion] 2023-11-18 05:59:29
>>kimixa+Ir
Again, my reference class is “things that could end civilization”, which I hope we can all agree was not the claim about crypto.

But yes, if you just consider the mundane benefits and harms of AI, it looks a lot like crypto; it both benefits our economy and can be weaponized, including by our adversaries.

replies(1): >>sudosy+qA
◧◩◪◨
5. sudosy+qA[view] [source] [discussion] 2023-11-18 06:17:47
>>thepti+yy
Well, just like nuclear weapons, eventually the cat is out of the bag, and you can't really stop people from making them anymore. Except that, obviously, it's much easier to train an LLM than to enrich uranium. It's not a secret you can keep for long - after all, it only took, what, 3 years for the Soviets to catch up on fission weapons, and then only 8 months to catch up on fusion weapons (arguably beating the US to the punch with the first weaponizable fusion design).

Anyway, the point is, obfuscation doesn't work to keep scary technology away.

replies(2): >>kimixa+RU >>thepti+TA2
◧◩◪
6. oooyay+rA[view] [source] [discussion] 2023-11-18 06:17:47
>>kimixa+Ir
You used the past tense, but that is the present: embargoes from various countries still cover cryptographic capabilities, including open-source ones, for this reason. The concern isn't unfounded, but a world without personal cryptography is not sustainable as technology advances. Before computers, people were used to a level of anonymity and confidentiality that you cannot get in the modern world without cryptography.
◧◩◪◨⬒
7. kimixa+RU[view] [source] [discussion] 2023-11-18 09:28:34
>>sudosy+qA
I'm not sure the cat was ever in the bag for LLMs. Every big player has their own flavor now, and it seems the reason I don't have one myself is a matter of finances rather than secret knowledge. OpenAI's possible advantages seem to be more about scale and optimization than about doing anything really different.

And I'm not sure this allegedly-bagged cat has claws either - the current crop of LLMs are still clearly in a different category from "intelligence". It's pretty easy to see their limitations; they behave more like the fancy text predictors they are than like something that can truly extrapolate, which is what's required for even the start of some AI sci-fi movie plot. Maybe continued development and research along that path will lead to more capabilities, but we're certainly not there yet, and I suspect we're not particularly close.

Maybe they actually have some super secret internal stuff that fixes those flaws, and are working on making sure it's safe before releasing it. And maybe I have a dragon in my garage.

I generally feel hyperbolic language about such things is damaging: it makes it easy to roll your eyes at something that's clearly false, and that eye-rolling builds inertia that carries over to the point where things may actually need to be considered. LLMs are clearly not currently an "existential threat", and the biggest advantage of keeping them closed appears to be financial benefit in a competitive market. So it looks like a duck and quacks like a duck, but don't you understand, I'm protecting you from this evil fire-breathing dragon for your own good!

It smells of some fantasy gnostic tech wizard, where only those who are smart enough to figure out the spell themselves are truly smart enough to know how to use it responsibly. And who doesn't want to think of themselves as smart? But that doesn't seem to match similar things in the real world - like the Manhattan Project, where many of the people developing it were rather gung-ho with proposals for various uses, and even if some publicly said it was possibly a mistake after the fact, they still did it. Meaning their "smarts" about how to use it came too late.

And as you pointed out, nuclear weapon control by limiting information has already failed. If North Korea, one of the least connected nations in the world, can develop them, surely anyone with the required resources can. The only limit today seems to be the cost to nations, and how relatively obvious the large infrastructure around a program is, which allows international pressure before things get to the "stockpiling usable weapons" stage.

replies(1): >>thepti+Io2
◧◩
8. JohnFe+eu1[view] [source] [discussion] 2023-11-18 13:48:43
>>thepti+Ad
If you really think that what you're working on poses an existential risk to humanity, continuing to work on it puts you squarely in "supervillain" territory. Making it closed source and talking about "AI safety" doesn't change that.
◧◩◪◨⬒⬓
9. thepti+Io2[view] [source] [discussion] 2023-11-18 18:57:44
>>kimixa+RU
> I'm not sure the cat was ever in the bag for LLMs.

I think timelines are important here; for example, in 2015 there was no such thing as Transformers, and while there were AGI x-risk folks (e.g. MIRI), they were generally considered to be quite kooky. I think AGI was very credibly "cat in the bag" at that time; it doesn't happen without 1000s of man-years of focused R&D, and only a few companies can even move the frontier.

I don't think the claim should be "we could have prevented LLMs from ever being invented", just that we can perhaps delay it long enough to be safe(r). To bring it back to the original thread, Sam Altman's explicit position is that in the matrix of "slow vs fast takeoff" vs. "starting sooner vs. later", a slow takeoff starting sooner is the safest choice. The reasoning being, you would prefer a slow takeoff starting later, but the thing that is most likely to kill everyone is a fast takeoff, and if you try for a slow takeoff later, you might end up with a capability overhang and accidentally get a fast takeoff later. As we can see, it takes society (and government) years to catch up to what is going on, so we don't want anything to happen quicker than we can react to.

A great example of this overhang dynamic would be Transformers circa 2018 -- Google was working on LLMs internally, but didn't know how to use them to their full capability. With GPT (and particularly after Stable Diffusion and LLaMA) we saw a massive explosion in capability-per-compute for AI as the broader community optimized both prompting techniques (e.g. "think step by step", Chain of Thought) and underlying algorithmic/architectural approaches.

At this time it seems to me that widely releasing LLMs has both i) caused a big capability overhang to be harvested, preventing it from contributing to a fast takeoff later, and ii) caused OOMs more resources to be invested in pushing the capability frontier, making the takeoff trajectory overall faster. Both of those likely would not have happened for at least a couple years if OpenAI didn't release ChatGPT when they did. It's hard for me to calculate whether on net this brings dangerous capability levels closer, but I think there's a good argument that it makes the timeline much more predictable (we're now capped by global GPU production), and therefore reduces tail-risk of the "accidental unaligned AGI in Google's datacenter that can grab lots more compute from other datacenters" type of scenario (aka "foom").

> LLMs are clearly not currently an "existential threat"

Nobody is claiming (at least, nobody credible in the x-risk community is claiming) that GPT-4 is an existential threat. The claim is, looking at the trajectory, and predicting where we'll be in 5-10 years; GPT-10 could be very scary, so we should make sure we're prepared for it -- and slow down now if we think we don't have time to build GPT-10 safely on our current trajectory. Every exponential curve flattens into an S-curve eventually, but I don't see a particular reason to posit that this one will be exhausted before human-level intelligence, quite the opposite. And if we don't solve fundamental problems like prompt-hijacking and figure out how to actually durably convey our values to an AI, it could be very bad news when we eventually build a system that is smarter than us.

While Eliezer Yudkowsky takes the maximally-pessimistic stance that AGI is by default ruinous unless we solve alignment, there are plenty of people who take a more epistemically humble position that we simply cannot know how it'll go. I view it as a coin toss as to whether an AGI directly descended from ChatGPT would stay aligned to our interests. Some view it as Russian roulette. But the point being, would you play Russian roulette with all of humanity? Or wait until you can be sure the risk is lower?

I think it's plausible that with a bit more research we can crack Mechanistic Interpretability and get to a point where, for example, we can quantify to what extent an AI is deceiving us (ChatGPT already does this in some situations), and to what extent it is actually using reasoning that maps to our values, vs. alien logic that does not preserve things humanity cares about when you give it power.

> nuclear weapon control by limiting information has already failed.

In some sense yes, but also, note that for almost 80 years we have prevented _most_ countries from learning this tech. Russia developed it on their own, and some countries were granted tech transfers or used espionage. But for the rest of the world, the cat is still in the bag. I think you can make a good analogy here: if there is an arms race, then superpowers will build the technology to maintain their balance of power. If everybody agrees not to build it, then perhaps there won't be a race. (I'm extremely pessimistic for this level of coordination though.)

Even with the dramatic geopolitical power granted by possessing nuclear weapons, we have managed to pursue a "security through obscurity" regime, and it has worked to prevent further spread of nuclear weapons. This is why I find the software-centric "security by obscurity never works" stance to be myopic. It is usually true in the software security domain, but it's not some universal law.

◧◩◪◨⬒
10. thepti+TA2[view] [source] [discussion] 2023-11-18 20:08:49
>>sudosy+qA
> it's much easier to train an LLM than to enrich uranium.

I hadn't thought of this dichotomy before, but I'm not sure it's going to be true for long; I wouldn't be surprised if it turned out that obtaining the 50k H100s you need to train a GPT-5 (or whatever hardware investment it is) is harder for Iran than obtaining its centrifuges. If it's not true now, I expect it to be true within a hardware generation or two. (The US already has >=A100 embargoes on China, and I'd expect that to be strengthened to apply to Iran if it doesn't already, at least if they demonstrated any military interest in AI technology.)

Also, I don't think nuclear tech is an example against obfuscation; how many countries know how to make thermonuclear warheads? Seems to me that the obfuscation regime has been very effective, though certainly not perfect. It's backed with the carrot and stick of diplomacy and sanctions of course, but that same approach would also have to be used if you wanted to globally ban or restrict AI beyond a certain capability level.
