Shame on all of the people involved in this: the people in these companies, the journalists who shovel shit (hope they get replaced real soon), researchers who should know better, and dementia ridden legislators.
So utterly predictable and slimy. All of those who are so gravely concerned about "alignment" in this context, give yourselves a pat on the back for hyping up science fiction stories and enabling regulatory capture.
I wrote a comment recently trying to explain how even if you believe all LLMs can (and will ever) do is regurgitate their training data that you should still be concerned.
For example, imagine in 5 years we have GPT-7, and you ask GPT-7 to solve humanity's great problems.
From its training data GPT-7 might notice that humans believe overpopulation is a serious issue facing humanity.
But its "aligned" so might understand from its training data that killing people is wrong so instead it uses its training data to seek other ways to reduce human populations without extermination.
Its training data included information about how gene drives were used by humans to reduce mosquito populations by causing infertility. Many human have also suggested (and tried) to use birth control to reduce human populations via infertility so the ethical implications of using gene drives to cause infertility is debatable based on the data the LLM was trained on.
Using this information it decides to hack into a biolab using hacking techniques it learnt from its training data and use its biochemistry knowledge to make slight alterations to one of the active research projects at the lab. This causes the lab to unknowingly produce a highly contagious bioweapon which causes infertility.
---
The point here is that even if we just assume LLMs are only capable of producing output which approximates stuff it learnt from its training data, an advanced LLM can still be dangerous.
And in this example, I'm assuming no malicious actors and an aligned AI. If you're willing to assume there might be an actor out there would seek to use LLMs for malicious reasons or the AI is not well aligned then the risk becomes even clearer.
> But its "aligned" so might understand
> Using this information it decides to hack
I think you're anthropomorphizing LLM's too much here. If we assume that there's a AGI-esque AI, then of course we should be worried about an AGI-esque AI. But I see no reason to think that's the case.