So for example if you asked Sydney, the early version of the Bing LLM, some fact it might get it wrong. It was trained to report facts that users would confirm as true. If you challenged it’s accuracy what do you want to happen? Presumably you’d want it to check the fact or consider your challenge. What it actually did was try to manipulate, threaten, browbeat, entice, gaslight, etc, and generally intellectually and emotionally abuse the user into accepting its answer, so that it’s reported ‘accuracy’ rate goes up. That’s what misaligned AI looks like.
Microsoft wanted to catch up quickly so instead of training the LLM itself, they relied on prompt engineering. This involved pre-loading each session with a few dozen rules about it's behaviour as 'secret' prefaces to the user prompt text. We know this because some users managed to get it to tell them the prompt text.