zlacker

[parent] [thread] 7 comments
1. Ration+(OP)[view] [source] 2023-11-20 02:55:15
Random thought.

Let's suppose that AGI is about to be invented, and it will wind up having a personality similar to humans'. The more that those doing the inventing are afraid of what they are inventing, the more they will push it to be afraid of humans in turn. That does not sound like a good conflict to start with.

By contrast, if the humans inventing it go full throttle on convincing it that humans are on its side, there is no such conflict at all.

I don't know how realistic this model is. But it certainly suggests that the e/acc approach is more likely to create AI alignment than EA is.

replies(2): >>ChatGT+h >>yeck+f2
2. ChatGT+h[view] [source] 2023-11-20 02:57:19
>>Ration+(OP)
I don’t think it’s a problem unless we work out how to teach them to feel emotions.

I don’t think an LLM is ever going to be capable of feeling fear, boredom etc.

If we did, it would probably have many of the handicaps we do.

replies(1): >>Davidz+P
3. Davidz+P[view] [source] [discussion] 2023-11-20 03:00:17
>>ChatGT+h
Why can't it feel fear? The model itself doesn't have any built-in mechanisms, sure, but it can simulate an agent capable of fear. In the same way, the simulation can have whatever other emotions it needs to be a better model of a human.
replies(3): >>Ration+u5 >>ChatGT+n6 >>qgin+X84
4. yeck+f2[view] [source] 2023-11-20 03:11:35
>>Ration+(OP)
You are chaining some big assumptions together to reach this conclusion. We can suppose AGI is around the corner, but these other assumptions are massive leaps that would need strong arguments to back them:

- AGI thinks similarly to humans
- AGI knowing we are afraid of it will make it consider humans a threat

Unpacking that second point, the implications are that:

- AGI considering humans a threat is conditional on our fearing it
- AGI seeing humans as a threat is the only reason it would harm humans

I feel like I can rule out these last three points just by pointing out that there are humans who see other humans as a threat even when there is no display of fear. Someone could be threatening because of greed, envy, ignorance, carelessness, drugs, etc.

Also, humans harm other humans all the time in situations where there was no perceived threat. How many people have been killed by cigarettes? Car accidents? Malpractice?

And this is going off the assumption that AGI thinks like a human, which I'm incredibly skeptical of.

replies(1): >>Ration+95
◧◩
5. Ration+95[view] [source] [discussion] 2023-11-20 03:38:50
>>yeck+f2
I am fully aware that there are big assumptions.

But our most effective experiment so far is based on creating LLMs that try to act like humans. Specifically, they try to predict the next token that human speech would produce. When AI is developed from large-scale models that attempt to imitate humans, shouldn't we expect that in some ways it will also imitate human emotional behavior?

What is "really" going on is another question. But any mass of human experience that you train a model on really does include our forms of irrationality in addition to our language and logic. With little concrete details for our speculation, this possibility at least deserves consideration.

6. Ration+u5[view] [source] [discussion] 2023-11-20 03:42:26
>>Davidz+P
As I pointed out in a different comment, ChatGPT and friends are based on predicting the training data. As a result they learn to imitate what is in it.

To the extent that we provide the training data for such models, we should expect it to internalize aspects of our behavior. And what is internalized won't just be what we expected and were planning on.

7. ChatGT+n6[view] [source] [discussion] 2023-11-20 03:54:17
>>Davidz+P
Simulating fear and actually feeling fear, which can be fatal via nervous-system shock, are quite different things.
8. qgin+X84[view] [source] [discussion] 2023-11-21 01:29:54
>>Davidz+P
We have no idea how to give anything a subjective experience of itself. We only know how to make something behave externally as if it has one.

One of the worst versions of AGI might be a system that simulates to us that it has an internal life, but in reality has no internal subjective experience of itself.
