1. pixl97+(OP) 2023-11-20 05:08:37
AI Safety means a lot of different things.

Inward safety for people means their vulnerabilities are not exposed or open to attack. Outward safety means they are not attacking others and are looking out for the general well-being of others. We have many different social constructs through which we try to keep these two in balance with other people. It doesn't work out for everyone, but in general it's somewhat stable.

What does it mean to be safe if you are not under threat of attack or harm? What does attacking others mean if doing so is meaningless, just a calculation? This covers everything from telling someone to kill themselves (it's just words) to issuing commands to external devices with real-world effects (printing an Ebola virus or launching a nuclear weapon). The concern is that the AI of the future will be extraordinarily powerful yet very risky when it comes to making decisions that could harm others.

replies(1): >>Michae+Rj
2. Michae+Rj 2023-11-20 07:09:52
>>pixl97+(OP)
Yes, but it still boils down to the question of how well the output of the chatbot is aligned with reality. If you want to automate this, you will likely need some system that censors the output of the LLM, and that system should have a better model of what is real than the LLM itself.
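
A minimal sketch of that two-model arrangement, assuming hypothetical generate() and grounding_check() functions as stand-ins for the chatbot and the better-grounded checker; the names and the 0.8 threshold are illustrative assumptions, not any real API:

    # Hypothetical two-model guardrail: a generator LLM drafts a reply, and a
    # separate checker model, assumed to be better grounded in reality, must
    # vouch for the draft before anything is released.
    from dataclasses import dataclass

    @dataclass
    class Verdict:
        grounded: bool     # checker's judgment: is the draft consistent with reality?
        confidence: float  # checker's confidence in that judgment, 0.0-1.0

    def generate(prompt: str) -> str:
        """Stand-in for the chatbot LLM."""
        raise NotImplementedError

    def grounding_check(text: str) -> Verdict:
        """Stand-in for the second, better-grounded model."""
        raise NotImplementedError

    def moderated_reply(prompt: str, threshold: float = 0.8) -> str:
        draft = generate(prompt)
        verdict = grounding_check(draft)
        if verdict.grounded and verdict.confidence >= threshold:
            return draft
        # Withhold anything the checker cannot vouch for.
        return "[withheld: draft failed the grounding check]"

The design choice is that the checker, not the generator, holds the veto, so the whole scheme is only as good as the checker's model of what is real, which is exactly the open question above.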