zlacker

[parent] [thread] 7 comments
1. RcouF1+(OP)[view] [source] 2024-05-18 00:22:30
> Superintelligence will be the most impactful technology humanity has ever invented, and could help us solve many of the world’s most important problems. But the vast power of superintelligence could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction.

A superintelligence that can always be guaranteed to hold the same values and ethics as present-day humans is not a superintelligence, and likely not even a human-level intelligence (I bet humans 100 years from now will see the world significantly differently than we do now).

Superalignment is an oxymoron.

replies(1): >>thorum+b4
2. thorum+b4[view] [source] 2024-05-18 01:09:07
>>RcouF1+(OP)
You might be interested in how CEV, one framework proposed for superalignment, addresses that concern:

https://en.wikipedia.org/wiki/Friendly_artificial_intelligen...

> our coherent extrapolated volition is "our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted (…) The appeal to an objective though contingent human nature (perhaps expressed, for mathematical purposes, in the form of a utility function or other decision-theoretic formalism), as providing the ultimate criterion of "Friendliness", is an answer to the meta-ethical problem of defining an objective morality; extrapolated volition is intended to be what humanity objectively would want, all things considered, but it can only be defined relative to the psychological and cognitive qualities of present-day, unextrapolated humanity.
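
To make the "converges rather than diverges" part concrete, here is a toy sketch, not anything from the actual paper: every name, number, and the extrapolation rule below are made up. It treats each person's volition as preference weights over a few issues, nudges each person toward a hand-waved "informed" version of themselves, and only acts on issues where the extrapolated preferences cohere. Computing the "informed" targets is, of course, exactly the part CEV leaves open.

    import statistics

    ISSUES = ["cure_disease", "open_borders", "ban_all_ai"]

    # Present-day volitions: preference weights in [-1, 1], made up for the sketch.
    present = {
        "alice": {"cure_disease": 0.9, "open_borders": 0.8,  "ban_all_ai": 0.1},
        "bob":   {"cure_disease": 0.7, "open_borders": -0.6, "ban_all_ai": 0.2},
        "carol": {"cure_disease": 0.8, "open_borders": 0.2,  "ban_all_ai": -0.7},
    }

    # Stand-in for "if we knew more, thought faster, were more the people we
    # wished we were": the volition each person would settle on when informed.
    # In reality nobody knows how to compute this.
    informed = {
        "alice": {"cure_disease": 1.0, "open_borders": 0.9,  "ban_all_ai": -0.2},
        "bob":   {"cure_disease": 0.9, "open_borders": -0.8, "ban_all_ai": -0.1},
        "carol": {"cure_disease": 1.0, "open_borders": 0.4,  "ban_all_ai": -0.3},
    }

    def extrapolate(present, informed, steps=20, rate=0.2):
        """Nudge each person's volition toward their informed target."""
        vols = {name: dict(prefs) for name, prefs in present.items()}
        for _ in range(steps):
            for name, prefs in vols.items():
                for issue in ISSUES:
                    prefs[issue] += rate * (informed[name][issue] - prefs[issue])
        return vols

    def coheres(vols, issue, tolerance=0.2):
        """An issue coheres if the extrapolated volitions have low spread."""
        return statistics.pstdev(v[issue] for v in vols.values()) < tolerance

    extrapolated = extrapolate(present, informed)
    for issue in ISSUES:
        verdict = "act" if coheres(extrapolated, issue) else "leave alone (wishes interfere)"
        print(f"{issue}: {verdict}")

In this toy run, "ban_all_ai" starts out contested but coheres after extrapolation, "open_borders" never does, and "cure_disease" coheres from the start, so only the first and third issues get acted on.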

replies(2): >>wruza+sj >>juped+sG
3. wruza+sj[view] [source] [discussion] 2024-05-18 05:35:43
>>thorum+b4
Is there an insightful summary of this proposal? The whole paper looks like 38 pages of non-rigorous prose with no clear procedure, and already-“aligned” LLMs will likely fail to analyze it.

Forced myself through some parts of it, and all I could get out of it is that people don’t know what they want, so it would be nice to build an oracle. Yeah, I guess.

replies(2): >>comp_t+Fk >>Likely+CI
4. comp_t+Fk[view] [source] [discussion] 2024-05-18 05:55:34
>>wruza+sj
It's not a proposal with a detailed implementation spec; it's a problem statement.
replies(1): >>wruza+Po
5. wruza+Po[view] [source] [discussion] 2024-05-18 07:02:56
>>comp_t+Fk
“One framework proposed for superalignment” sounded like it does something. Or maybe I missed the context.
6. juped+sG[view] [source] [discussion] 2024-05-18 11:28:43
>>thorum+b4
You keep posting this link to vague alignment copium from decades ago; we've come a long way in cynicism since then.
7. Likely+CI[view] [source] [discussion] 2024-05-18 11:53:32
>>wruza+sj
Yudkowsky is a human LLM: his output is semantically well-formed and appears, to a non-specialist, to fall within the subject domain, looking the way a non-specialist thinks the subject domain should look, so the non-specialist accepts it. On closer examination, though, it's all word salad from someone who clearly lacks understanding of both the technological and the philosophical concepts.

That so many people in the AI safety "community" consider him a domain expert says more about how pseudo-scientific that field is than about his actual credentials as a serious thinker.

replies(1): >>wruza+Gx1
8. wruza+Gx1[view] [source] [discussion] 2024-05-18 19:31:34
>>Likely+CI
Thanks, this explains the feeling I had after reading it (but was too shy to express).