Inside The Chaos at OpenAI

>>maxuti+(OP)
Random thought.

Let's suppose that AGI is about to be invented, and that it will wind up having a personality similar to humans. The more that those who are doing the inventing are afraid of what they are inventing, the more they will push it to be afraid of humans in turn. This does not sound like a good conflict to start with.

By contrast, if the humans inventing it go full throttle on convincing it that humans are on its side, there is no such conflict at all.

I don't know how realistic this model is. But it certainly suggests that the e/acc approach is more likely to create AI alignment than EA is.

>>Ration+q4
You are chaining some big assumptions together to reach this conclusion. We can suppose AGI is around the corner, but these other assumptions are massive leaps that would need strong arguments to back them:
- AGI thinks similarly to humans
- AGI knowing we are afraid of it will make it consider humans a threat

Unpacking that second point are the implications that:
- AGI considering humans a threat is conditional on our fearing it
- AGI seeing humans as a threat is the only reason it would harm humans

I feel like I can rule out these last 3 points just by pointing out that there are humans that see other humans as a threat even though there is not a display of fear. Someone could be threatening because of greed, envy, ignorance, carelessness, drugs, etc.

Also, humans harm other humans all the time in situations where there is no perceived threat. How many people have been killed by cigarettes? Car accidents? Malpractice?

And this is going off the assumption that AGI thinks like a human, which I'm incredibly skeptical of.

>>yeck+F6
I am fully aware that there are big assumptions.

But our most effective experiment so far is based on creating LLMs that try to act like humans, specifically by trying to predict the next token that human speech would produce. When AI is developed from large-scale models that attempt to imitate humans, shouldn't we expect that in some ways it will also imitate human emotional behavior?

What is "really" going on is another question. But any mass of human experience that you train a model on really does include our forms of irrationality in addition to our language and logic. With so few concrete details to ground our speculation, this possibility at least deserves consideration.
