All due respect to Jan here, though. He's being (perhaps dangerously) honest, genuinely believes in AI safety, and is an actual research expert, unlike me.
OpenAI made a large commitment to super-alignment in the not-so-distant past. I believe mid-2023. Famously, it has always taken AI Safety™ very seriously.
Regardless of anyone's feelings on the need for a dedicated team for it, you can chalk this one up as another instance of OpenAI cough leadership cough speaking out of both sides of its mouth as is convenient. The only true north star is fame, glory, and user count, dressed up as humble "research".
To really stress this: OpenAI's still-present cofounder shared yesterday on a podcast that they expect AGI in ~2 years and ASI (surpassing human intelligence) by end of the decade.
What's his track record on promises/predictions of this sort? I wasn't paying attention until pretty recently.
Link? Is the ~2 year timeline a common estimate in the field?
https://openai.com/index/introducing-superalignment/
> Superintelligence will be the most impactful technology humanity has ever invented, and could help us solve many of the world’s most important problems. But the vast power of superintelligence could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction.
> While superintelligence seems far off now, we believe it could arrive this decade.
> Managing these risks will require, among other things, new institutions for governance and solving the problem of superintelligence alignment:
> How do we ensure AI systems much smarter than humans follow human intent?
> Currently, we don't have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue. Our current techniques for aligning AI, such as reinforcement learning from human feedback, rely on humans’ ability to supervise AI. But humans won’t be able to reliably supervise AI systems much smarter than us, and so our current alignment techniques will not scale to superintelligence. We need new scientific and technical breakthroughs.
That programme aired in the 1980's. Other than vested promises is there much to indicate it's close at all? Empty promises aside there isn't really any indication of that being likely at all.
> I don't think it's going to happen next year, it's still useful to have the conversation and maybe it's like two or three years instead.
This doesn't seem like a super definite prediction. The "two or three" might have just been a hypothetical.
A superintelligence that can always be guaranteed to have the same values and ethics as current humans is not a superintelligence, or likely even a human-level intelligence (I bet humans 100 years from now will see the world significantly differently than we do now).
Superalignment is an oxymoron.
Humans are used to ordering around other humans who would bring common sense and laziness to the table and probably not grind up humans to produce a few more paperclips.
Alignment is about getting the AGI to be aligned with the owners; ignoring it means potentially putting more and more power into the hands of a box that you aren't quite sure is going to do the thing you want it to do. Alignment in the context of AGIs was always about ensuring the owners could control the AGIs, not that the AGIs could solve philosophy and get all of humanity to agree.
AI experts who aren't riding the hype train and getting high off of its fumes acknowledge that true AI is something we'll likely not see in our lifetimes.
> Whoa whoa whoa, we can't let just anyone run these models. Only large corporations who will use them to addict children to their phones and give them eating disorders and suicidal ideation, while radicalizing adults and tearing apart society using the vast profiles they've collected on everyone through their global panopticon, all in the name of making people unhappy so that it's easier to sell them more crap they don't need (a goal which is itself a problem in the face of an impending climate crisis). After all, we wouldn't want it to end up harming humanity by using its superior capabilities to manipulate humans into doing things for it to optimize for goals that no one wants!
This is the most concise takedown of that particular branch of nonsense that I’ve seen so far.
Do we want woke AI, X brand fash-pilled AI, CCPBot, or Emirates Bot? The possibilities are endless.
I suspect there will be at least continued commercial use of the current tech, though I still suspect this crop is another dead end in the hunt for AGI.
They got completely outsmarted and outmaneuvered by Sam Altman.
And they think they will be able to align a superhuman intelligence? That it won’t outsmart and outmaneuver them more easily than Sam Altman did?
They are deluded!
https://en.wikipedia.org/wiki/Friendly_artificial_intelligen...
And here is a more detailed explanation:
https://en.wikipedia.org/wiki/Friendly_artificial_intelligen...
> our coherent extrapolated volition is "our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted" (…) The appeal to an objective through contingent human nature (perhaps expressed, for mathematical purposes, in the form of a utility function or other decision-theoretic formalism), as providing the ultimate criterion of "Friendliness", is an answer to the meta-ethical problem of defining an objective morality; extrapolated volition is intended to be what humanity objectively would want, all things considered, but it can only be defined relative to the psychological and cognitive qualities of present-day, unextrapolated humanity.
Personally I'm not seeing that the path we're on leads to whatever that is, either. But I think/hope I'll know if I'm wrong when it's in front of me.
If I remember correctly the author unsuccessfully tried to get that purged from the Internet
Care to explain? Absurd how? An internal contradiction somehow? Unimportant for some reason? Impossible for some reason?
How can I be confident you aren't committing the fallacy of collecting a bunch of events and saying that is sufficient to serve as a cohesive explanation? No offense intended, but the comment above has many of the qualities of a classic rant.
If I'm wrong, perhaps you could elaborate? If I'm not wrong, maybe you could reconsider?
Don't forget that alignment research has existed longer than OpenAI. It would be a stretch to claim that the original AI safety researchers were using the pretexts you described -- I think it is fair to say they were involved because of genuine concern, not because it was a trendy or self-serving thing to do.
Some of those researchers and people they influenced ended up at OpenAI. So it would be a mistake or at least an oversimplification to claim that AI safety is some kind of pretext at OpenAI. Could it be a pretext for some people in the organization, to some degree? Sure, it could. But is it a significant effect? One that fits your complex narrative, above? I find that unlikely.
Making sense of an organization's intentions requires a lot of analysis and care, due to the combination of actors and varying influence.
There are simpler, more likely explanations, such as: AI safety wasn't a profit center, and over time other departments in OpenAI got more staff, more influence, and so on. This is a problem, for sure, but there is no "pearl clutching pretext" needed for this explanation.
That’s neither efficient nor optimized, just a bogeyman for “doesn’t work”.
Forced myself through some parts of it and all I can get is people don’t know what they want so it would be nice to build an oracle. Yeah, I guess.
Which is why creating a new type of intelligent entity that could be more powerful than humans is a very bad idea: we don't even know how to align the humans, and we have a ton of experience with them.
TL;DR train a seed AI to guess what humans would want if they were "better" and do that.
It's better and quicker search at present for the area I specialise in.
It's not currently even close to being a 2x multiplier for me; it's possibly even a negative impact (probably not, but I'm still exploring). That feels detached from the promises. Interesting, but at present more hype than hyper. Also, it's energy inefficient and so cost heavy. I feel that will likely cripple a lot of use cases.
What's your take?
That so many people in the AI safety "community" consider him a domain expert says more about how pseudo-scientific that field is than about his actual credentials as a serious thinker.
Of course, destroying the planet to get iron from its core is not a popular AGI-doomer analogy, as that sounds a bit too much like human behaviour.
We just got sick of it because it sucks.
A genuinely sentient AI isn’t going to want some cybernetic equivalent of that shit either. Doing that is how you get angry Skynet.
I’m not sure alignment is the right goal. I’m not sure it’s even good. Monoculture is weak and stifling and sets itself against free will. Peaceful coexistence and trade under a social contract of mutual benefit is the right goal. The question is whether it’s possible to extend that beyond Homo sapiens.
If the lefties can have their pronouns and the rednecks can shoot their guns, can the basilisk build its Dyson swarm? The universe is physically large enough if we can agree to not all be the same and be fine with that.
I think we have a while to figure it out. These things are just lossy compressed blobs of queryable data so far. They have no independent will or self reflection and I’m not sure we have any idea how to do that. We’re not even sure it’s possible in a digital deterministic medium.
Are you saying these so-called simple intentions are the only factors in play? Surely not.
Are you putting forth a theory that we can test? How well do you think your theory works? Did it work for Enron? For Microsoft? For REI? Does it work for every organization? Surely not perfectly; therefore, it can't be as simple as you claim.
Making a simplification and calling it "simple" is an easy thing to do.
I do recall there was some recantation or otherwise distancing from CEV not long after he posted it, but frankly it was long enough ago that my memories might be getting mixed up.
What was the other one?
Of course, I hope to be uploaded to the WIP Dyson swarm around the sun at this point.
(Doomers are, broadly, singularitarians who went "wait, hold on actually.")
Can the Etoro practice child buggery and the Spartans infanticide and the Canadians abortion? Can the modern Germans stop siblings reared apart from having sex and the Germans of 80 years ago stop the disabled having sex? Can the Americans practice circumcision and the Somalis FGM?
Libertarianism is all well and good in theory, except no one can agree quite where the other guy's nose ends or even who counts as a person.
Incredulous reactions don't aid whatever you intend to communicate. There's a reason why everyone has known what AI is for the last 12 months; it's not made up or a monoculture. It would be very odd to expect discontinuation of commercial use without a black swan event.
It’s really a pretty narrow spectrum of behaviors: killing, imprisoning, robbing, various types of bodily autonomy violation. There are some edge cases and human specific things in there but not a lot. Most of them have to do with sex which is a peculiarly human thing anyway. I don’t think we are getting creepy perv AIs (unless we train them on 4chan and Urban Dictionary).
My point isn’t that there are no possible areas of conflict. My point is that I don’t think you need a huge amount of alignment if alignment implies sameness. You just need to deal with the points of conflict that do occur, which are actually a very small and limited subset of available behaviors.
Humans have literally billions of customs and behaviors that don’t get anywhere near any of that stuff. You don’t need to even care about the vast majority of the behavior space.