Their entire alignment effort is focused on avoiding the following existential threats:
1. saying bad words
2. hurting feelings
3. giving legal or medical advice
And even there, all they're doing is censoring the interface layer, not aligning the model itself.
Nobody there gives a shit about reducing the odds of creating a paperclip maximizer or grey goo inventor.
I think the best we can hope for from OpenAI's safety effort is that the self-replicating nanobots it creates will disassemble white and asian cis-men first, because equity is a core "safety" value at OpenAI.