zlacker

[return to "OpenAI departures: Why can’t former employees talk?"]
1. thorum+Bu[view] [source] 2024-05-17 23:10:57
>>fnbr+(OP)
Extra respect is due to Jan Leike, then:

https://x.com/janleike/status/1791498174659715494

2. a_wild+Xv[view] [source] 2024-05-17 23:24:41
>>thorum+Bu
I think superalignment is absurd, and model "safety" is the modern AI company's "think of the children" pearl-clutching pretext to justify digging moats. All this after sucking up everyone's copyrighted material as fair use, then not releasing the result, and profiting off it.

All due respect to Jan here, though. He's being (perhaps dangerously) honest, genuinely believes in AI safety, and is an actual research expert, unlike me.

3. thorum+My[view] [source] 2024-05-17 23:51:39
>>a_wild+Xv
The superalignment team was not focused on that kind of “safety” AFAIK. According to the blog post announcing the team,

https://openai.com/index/introducing-superalignment/

> Superintelligence will be the most impactful technology humanity has ever invented, and could help us solve many of the world’s most important problems. But the vast power of superintelligence could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction.

> While superintelligence seems far off now, we believe it could arrive this decade.

> Managing these risks will require, among other things, new institutions for governance and solving the problem of superintelligence alignment:

> How do we ensure AI systems much smarter than humans follow human intent?

> Currently, we don't have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue. Our current techniques for aligning AI, such as reinforcement learning from human feedback, rely on humans’ ability to supervise AI. But humans won’t be able to reliably supervise AI systems much smarter than us, and so our current alignment techniques will not scale to superintelligence. We need new scientific and technical breakthroughs.

4. ndrisc+XA[view] [source] 2024-05-18 00:13:13
>>thorum+My
That doesn't really contradict what the other poster said. They're calling for regulation (digging a moat) to ensure systems are "safe" and "aligned," while ignoring that humans themselves are not aligned with one another, so these systems obviously cannot be aligned with humans; they can only be aligned with their owners (i.e., them, not you).
5. api+pD[view] [source] 2024-05-18 00:39:40
>>ndrisc+XA
Humans are not aligned with humans.

This is the most concise takedown of that particular branch of nonsense that I’ve seen so far.

Do we want woke AI, X-brand fash-pilled AI, CCPBot, or Emirates Bot? The possibilities are endless.

6. thorum+fF[view] [source] 2024-05-18 01:02:42
>>api+pD
CEV (coherent extrapolated volition) is one proposed answer to this question. Wikipedia has a good short explanation here:

https://en.wikipedia.org/wiki/Friendly_artificial_intelligen...

And here is a more detailed explanation:

https://intelligence.org/files/CEV.pdf

7. Andrew+WK[view] [source] 2024-05-18 02:18:33
>>thorum+fF
I had to log in because I haven’t seen anybody reference this in like a decade.

If I remember correctly, the author unsuccessfully tried to get that purged from the internet.

8. comp_t+sL[view] [source] 2024-05-18 02:25:02
>>Andrew+WK
You're thinking of something else (and "purged from the internet" isn't exactly an accurate account of that, either).
9. Andrew+OU1[view] [source] 2024-05-18 17:25:08
>>comp_t+sL
Hmm, maybe I’m misremembering then.

I do recall there was some recantation of, or at least distancing from, CEV not long after he posted it, but frankly it was long enough ago that my memories might be getting mixed up.

What was the other one?
