It's uncool to look like an alarmist nut, but sometimes there's no socially acceptable alarm and the risks are real: https://intelligence.org/2017/10/13/fire-alarm/
It's worth looking at the underlying arguments earnestly; you can start with some initial skepticism, but I was persuaded. Alignment has also been something MIRI and others have worried about since as early as 2007 (maybe earlier?), so it's a case of a called shot, not a recent reaction to hype or new LLM capability.
Others have also changed their mind when they looked, for example:
- https://twitter.com/repligate/status/1676507258954416128?s=2...
- Longer form: https://www.lesswrong.com/posts/kAmgdEjq2eYQkB5PP/douglas-ho...
For a longer podcast introduction to the ideas: https://www.samharris.org/podcasts/making-sense-episodes/116...
There's also a whole map-territory problem where we're still pretending the distinction hasn't collapsed, Baudrillard-style. As if we weren't all obsessed with "prompt engineering" (whereby the machine trains us).
Out of the options to reduce that risk, I think it would really take something like this, which also seems extremely unlikely to actually happen given the coordination problem: https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-no...
You talk about aligned agents, but there aren't any today and we don't know how to make them. It wouldn't be aligned agents vs. unaligned ones; there would only be unaligned ones.
I don't think spreading out the tech reduces the risk. Spreading out nuclear weapons doesn't reduce the risk (and with nukes it's at least a lot easier to control the fissionable materials). Even with nukes you can create them and decide not to use them; that's not so true with superintelligent AGI.
If anyone could have made nukes from their computer, humanity may not have made it.
I'm glad OpenAI understands the severity of the problem though and is at least trying to solve it in time.
This part in particular caught my eye: "Other assumptions could also break down in the future, like favorable generalization properties during deployment". There have been actual experiments in which AIs appeared to successfully learn their objective in training, and then did something unexpected when released into a broader environment.[1]
[1] https://drive.google.com/file/d/1rdG5QCTqSXNaJZrYMxO9x2ChsPB...
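For intuition, here's a toy, hand-rolled rendition of that failure mode, loosely in the spirit of the "coin at the end of the level" goal-misgeneralization setups; it's my own illustration, not the cited paper's actual experiment. A policy that looked like it learned "get the coin" during training turns out to have learned "go right":

    # Toy goal misgeneralization sketch (illustrative only, not the cited experiment).
    # Training corridors always put the coin at the far right, so the policy
    # "always move right" looks identical to "go get the coin" -- until deployment
    # puts the coin somewhere else.

    def run_episode(coin_pos, policy, length=10, max_steps=20):
        pos = length // 2                              # agent starts mid-corridor
        for _ in range(max_steps):
            if pos == coin_pos:
                return True                            # coin collected
            pos = max(0, min(length - 1, pos + policy(pos, length)))
        return False

    go_right = lambda pos, length: +1                  # the "learned" policy: perfect in training

    print("training (coin on the right):", run_episode(coin_pos=9, policy=go_right))   # True
    print("deployment (coin on the left):", run_episode(coin_pos=2, policy=go_right))  # False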
I've seen some leading AI researchers dismiss alignment concerns, but without actually engaging with the arguments at all. I've seen no serious rebuttals that actually address the things the alignment people are concerned about.
Where is your evidence that we're approaching human level AGI, let alone SuperIntelligence? Because ChatGPT can (sometimes) approximate sophisticated conversation and deep knowledge?
How about some evidence that ChatGPT isn't even close? Just clone and run OpenAI's own evals repo https://github.com/openai/evals on the GPT-4 API.
It performs terribly on novel logic puzzles and exercises that a clever child could learn to do in an afternoon (there are some good chess evals, and I submitted one asking it to simulate a Forth machine).
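If you don't want to set up the whole harness, here's a rough sketch of the same kind of spot check straight against the API. It assumes the official openai Python client (v1-style) with OPENAI_API_KEY set in the environment; the Forth exercise and expected answer are illustrative, not taken from the repo:

    # Minimal ad-hoc eval in the spirit of openai/evals (a sketch, not the repo's harness).
    # Assumes `pip install openai` (v1+ client) and OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()

    # Ask the model to act as a tiny Forth machine: `2 3 + 4 *` should leave 20 on the stack.
    prompt = ("Simulate a Forth stack machine. Execute `2 3 + 4 *` and reply "
              "with only the single integer left on top of the stack.")

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )

    answer = response.choices[0].message.content.strip()
    print("model answered:", answer, "| pass:", answer == "20")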
https://tidybot.cs.princeton.edu/
https://innermonologue.github.io/
https://www.microsoft.com/en-us/research/group/autonomous-sy...
The alignment problem will come up when the robot control system notices that the guy with the stick is interfering with the robot's goals.
And he wrote about the risk in 2015, months before OpenAI was founded: https://blog.samaltman.com/machine-intelligence-part-1 https://blog.samaltman.com/machine-intelligence-part-2
Fine if you disagree with his arguments, but why assume you know what his motivation is?
E.g. by Yoshua Bengio: https://yoshuabengio.org/2023/06/24/faq-on-catastrophic-ai-r...
Also, it knows when to use a calculator if it has access to one, so it's not a big deal.
I hadn’t encountered Pascal’s mugging (https://en.wikipedia.org/wiki/Pascal%27s_mugging) before, and the premise is indeed pretty apt. Still, I think I’m on the side that this isn’t one, assuming the idea is that it’s a Very Low Chance of a Very Bad Thing: the “muggee” hands over their wallet on the slim chance of the VBT because of the magnitude of its effect. With AI it seems like there’s a rather high chance, if (proverbially) the AI-cat is let out of the bag.
But maybe some Mass Effect nonsense will happen if we develop AGI and we’ll be approached by The Intergalactic Community and have our technology advanced millennia overnight. (Sorry, that’s tongue-in-cheek but it does kinda read like Pascal’s mugging in the opposite direction; however, that’s not really what most researchers are arguing.)
https://www.psy.ox.ac.uk/news/the-brain-is-a-prediction-mach...
How do you stop a crazy AI? You turn it off.
Keep everyone praying about a fantasy bogeyman instead of actual harms today, and never EVER question why.
[0] >>36038681
If you don't know who they are, then, well, I guess that makes sense.
If you do know who they are and your confidence is wavering, then [0] is a great place to get started on understanding the alignment problem.
OAI is a great place to work and the team is hiring for engineers and scientists.
[0] https://80000hours.org/problem-profiles/artificial-intellige...
[0]: No Physical Substrate, No Problem https://slatestarcodex.com/2015/04/07/no-physical-substrate-...
[1]: It Looks Like You're Trying To Take Over The World https://gwern.net/fiction/clippy
1: https://en.m.wikipedia.org/wiki/Energy_usage_of_the_United_S...
Very few people are actually alarmed about the right issues (in no particular order): population size, industrial pollution, the military-industrial complex, for-profit multinational corporations, digital surveillance, factory farming, global warming, etc. This is why the alarmism from the AI crowd seems disingenuous: AI progress is simply an extension of for-profit corporatism and exploitation applied to digital resources, and properly addressing the risk from AI would require addressing the actual root causes of why technological progress is misaligned with human values.
1: https://www.theguardian.com/world/2015/jul/24/france-big-bro...
That the origin of COVID is even a question implies we have the tech to do it artificially. An AI today treating real life as that game would be self-destructive, but that doesn't mean it won't happen (reference classes: insanity, cancer).
If the AI can invent and order a von Neumann probe that it can upload itself to (the inventing is the hard part; ordering custom parts over the internet is already a thing), then it can block out, and start disassembling, the sun in a matter of decades at reasonable-looking reproduction rates (though obviously we're guessing what "reasonable" looks like, since we only have organic von Neumann machines to frame the question against).
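To gesture at why "a matter of decades" isn't crazy on its face, a back-of-the-envelope sketch; every number in it (a full shell at 1 AU, 1 km^2 collectors, the doubling times) is a placeholder assumption of mine:

    # Back-of-the-envelope only; all numbers are made-up assumptions, just to show
    # why exponential self-replication turns "decades" into simple arithmetic.
    import math

    AU = 1.5e11                          # metres, Earth-Sun distance
    shell_area = 4 * math.pi * AU**2     # ~2.8e23 m^2 to enclose the sun at 1 AU
    collector_area = 1e6                 # assume each replicator deploys a 1 km^2 collector

    needed = shell_area / collector_area # ~2.8e17 units
    doublings = math.log2(needed)        # ~58 doublings from a single seed

    for doubling_time_years in (1.0, 0.5):
        print(f"doubling every {doubling_time_years} yr -> "
              f"{doublings * doubling_time_years:.0f} years to full coverage")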
AI taking over brain implants and turning their users against everyone without them, like a zombie war (potentially Neuralink, depending on how secure the software is; it's also a plot device in the web fiction serial The Deathworlders. That's futuristic sci-fi, and you may not be OK with sci-fi as a way to explore hypotheticals, but I think it's the only way we have until we get moon-sized telescopes and can watch such things play out on other worlds without going there. In that story, the same AI genocides multiple species over millions of years, which is the excuse for why humans can even take part in the events of the story.)
It’s not an anti-goal that’s intentionally set; it’s that specifying complex goals is hard, and you may end up with something dumb that maximizes the reward unintentionally.
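A minimal sketch of what that looks like, with a made-up example of my own (the vacuuming setup isn't from anywhere in particular): the specified reward is "dust vacuumed per step", the intended goal is "an empty room", and the reward-maximizing policy finds the dumb loophole:

    # Toy reward misspecification: intended goal is "empty the room of dust",
    # but the reward is "units of dust vacuumed up per step".  A reward-maximizing
    # policy learns to dump the dust back out and re-vacuum it.

    def run(policy, steps=20):
        dust_in_room, total_reward = 10, 0
        for _ in range(steps):
            action = policy(dust_in_room)
            if action == "vacuum" and dust_in_room > 0:
                dust_in_room -= 1
                total_reward += 1      # reward: one unit of dust vacuumed this step
            elif action == "dump":
                dust_in_room += 1      # re-dirties the room; the reward spec never penalizes this
        return total_reward

    intended = lambda dust: "vacuum"                          # just clean until done
    gamer    = lambda dust: "vacuum" if dust > 0 else "dump"  # keep the reward flowing forever

    print("intended policy reward:", run(intended))  # 10 -- cleans the room, then stops earning
    print("reward-gaming policy:  ", run(gamer))     # 15 -- scores higher by re-dirtying the room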
The issue is that all of the AGIs will be unaligned in different ways, because we don’t know how to align any of them. Also, the first one able to improve itself in pursuit of its goal could take off at some threshold, and then the others would not be relevant.
There’s a lot of thoughtful writing on this topic, and it’s really worth digging into the state of the art; your replies are thoughtful, so it sounds like something you’d think about. I did the same thing a few years ago (around 2015) and found the arguments persuasive.
This is a decent overview: https://www.samharris.org/podcasts/making-sense-episodes/116...
I’m also not a moral relativist; I don’t think all values are equivalent. But you don’t even need to go there: before that point, a lot of what humans want is not controversial, and yet even the “obvious” cases are not so obvious or easy to classify.
https://www.acsh.org/news/2018/04/17/bee-apocalypse-was-neve...
Mass starvation wasn't "addressed" exactly, because the predictions were for mass starvation in the west, which never happened. Also the people who predicted this weren't the ones who created the Green Revolution.
The ozone hole is, I think, the most valid example in the list, but who knows, maybe that was just BS too. A lot of scientific claims turn out to be BS these days, even ones that were accepted for quite a while.
1: https://www.nationalgeographic.com/environment/article/plast...
If this is just a definitions issue, s/artificial intelligence/artificial cunning/g to the same effect.
Strength seems somewhat irrelevant either way, given the existence of Windows for Warships[0].
[0] not the real name: https://en.wikipedia.org/wiki/Submarine_Command_System
There's power and prestige in money, too, not just the positions.
Hence the lawyers who got in trouble for outsourcing themselves to ChatGPT: https://www.reuters.com/legal/new-york-lawyers-sanctioned-us...
Or those t-shirts from a decade back: https://money.cnn.com/2013/06/24/smallbusiness/tshirt-busine...
Bill Gates has bought up a bunch of farmland, and I am certain he will use AI to manage it, because manual allocation will be too inefficient.[1]
1: https://www.popularmechanics.com/science/environment/a425435...