zlacker

[parent] [thread] 4 comments
1. wruza+(OP)[view] [source] 2024-05-18 05:35:43
Is there an insightful summary of this proposal? The whole paper looks like 38 pages of non-rigorous prose with no clear procedure, and already-"aligned" LLMs will likely fail to analyze it.

Forced myself through some parts of it, and all I got is that people don't know what they want, so it would be nice to build an oracle. Yeah, I guess.

replies(2): >>comp_t+d1 >>Likely+ap
2. comp_t+d1[view] [source] 2024-05-18 05:55:34
>>wruza+(OP)
It's not a proposal with a detailed implementation spec, it's a problem statement.
replies(1): >>wruza+n5
3. wruza+n5[view] [source] [discussion] 2024-05-18 07:02:56
>>comp_t+d1
“One framework proposed for superalignment” sounded like it does something. Or maybe I missed the context.
4. Likely+ap[view] [source] 2024-05-18 11:53:32
>>wruza+(OP)
Yudkowsky is a human LLM: his output is semantically well-formed enough to appear, to a non-specialist, to fall within the subject domain as a non-specialist imagines that domain should look, so the non-specialist accepts it. But on closer examination it's all word salad from someone who clearly lacks understanding of both the technological and the philosophical concepts.

That so many people in the AI safety "community" consider him a domain expert says more about how pseudo-scientific that field is than about his actual credentials as a serious thinker.

replies(1): >>wruza+ee1
5. wruza+ee1[view] [source] [discussion] 2024-05-18 19:31:34
>>Likely+ap
Thanks, this explains the feeling I had after reading it (but was too shy to express).