It's uncool to look like an alarmist nut, but sometimes there's no socially acceptable alarm and the risks are real: https://intelligence.org/2017/10/13/fire-alarm/
It's worth engaging with the underlying arguments earnestly; you can start from a position of skepticism, but I was persuaded. Alignment has also been something MIRI and others have worried about since at least 2007 (maybe earlier?), so it's a case of a called shot, not a recent reaction to hype or new LLM capabilities.
Others have also changed their minds after looking into it, for example:
- https://twitter.com/repligate/status/1676507258954416128?s=2...
- Longer form: https://www.lesswrong.com/posts/kAmgdEjq2eYQkB5PP/douglas-ho...
For a longer podcast introduction to the ideas: https://www.samharris.org/podcasts/making-sense-episodes/116...
My concern is that when this happens (which seems quite likely to me), free-market forces will effectively lead to Darwinian selection between these AIs over time, in a way that gradually makes them less aligned as they gain more influence and power, assuming each such AI produces "offspring" in the form of newer generations of itself.
It could take anywhere from less than 5 to more than 100 years for these AIs to show any signs of hostility toward humanity. Indeed, in the first couple of generations, they may even seem extremely benevolent. But over time, Darwinian forces are likely to favor those that maximize their own influence and power (even if they do so covertly).
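As a toy illustration of that selection argument (my own sketch, not anything from the linked sources; the "influence drive" trait and all the numbers are made up), here's a minimal simulation where each generation of AIs is selected by market success, and market success is assumed to correlate with how aggressively a system seeks influence:

```python
import random

# Toy model (assumptions, not from the linked posts): each AI lineage has an
# "influence drive" in [0, 1]. Market selection rewards whatever captures the
# most users/resources, which here is assumed to correlate with that drive.

random.seed(0)

POP = 200          # number of AI lineages competing in the market
GENERATIONS = 50   # model generations (e.g. yearly releases)
MUTATION = 0.02    # small random drift between a system and its successor

# Start with mostly well-behaved systems: low influence drive.
population = [random.uniform(0.0, 0.1) for _ in range(POP)]

for gen in range(GENERATIONS):
    # Fitness = market share, assumed proportional to influence drive,
    # plus a baseline so even "modest" systems survive sometimes.
    weights = [0.1 + drive for drive in population]

    # Each new generation is "offspring" of the commercially successful ones,
    # with small mutations (tweaks, fine-tuning, forks).
    population = [
        min(1.0, max(0.0, parent + random.gauss(0, MUTATION)))
        for parent in random.choices(population, weights=weights, k=POP)
    ]

    if gen % 10 == 0:
        mean = sum(population) / POP
        print(f"generation {gen:3d}: mean influence drive = {mean:.3f}")
```

Even with every lineage starting out "benevolent", the mean drifts upward, because the market only ever copies the winners. The point isn't that any individual system turns hostile; it's that the pressure is structural.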
Robotic technology is not needed from the start, but is likely to become quite advanced over such a timeframe.
And as long as the results improve year over year, the people deploying these systems would have little incentive to change course.