A non-anthropomorphized view of LLMs

>>zdw+(OP)
The problem with viewing LLMs as just sequence generators, and malbehaviour as bad sequences, is that it simplifies too much. LLMs have hidden state not necessarily directly reflected in the tokens being produced and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer term outcomes (or predictions, if you prefer).

Is it too anthropomorphic to say that this is a lie? To say that the hidden state and its long term predictions amount to a kind of goal? Maybe it is. But we then need a bunch of new words which have almost 1:1 correspondence to concepts from human agency and behavior to describe the processes that LLMs simulate to minimize prediction loss.

Reasoning by analogy is always shaky. It probably wouldn't be so bad to do so. But it would also amount to impenetrable jargon. It would be an uphill struggle to promulgate.

Instead, we use the anthropomorphic terminology, and then find ways to classify LLM behavior in human concept space. They are very defective humans, so it's still a bit misleading, but at least jargon is reduced.

>>barrke+k5
I'm not sure what you mean by "hidden state". If you set aside chain of thought, memories, system prompts, etc. and the interfaces that don't show them, there is no hidden state.

These LLMs are almost always, to my knowledge, autoregressive models, not recurrent models (Mamba is a notable exception).

>>gugago+c8
Hidden state in the form of the activation heads, intermediate activations and so on. Logically, in autoregression these are recalculated every time you run the sequence to predict the next token. The point is, the entire NN state isn't output for each token. There is lots of hidden state that goes into selecting that token and the token isn't a full representation of that information.

>>barrke+v9
That's not what "state" means, typically. The "state of mind" you're in affects the words you say in response to something.

Intermediate activations isn't "state". The tokens that have already been generated, along with the fixed weights, is the only data that affects the next tokens.

>>gugago+wa
Plus a randomness seed.

The 'hidden state' being referred to here is essentially the "what might have been" had the dice rolls gone differently (eg, been seeded differently).

>>NiloCK+Tj
No, that's not quite what I mean. I used the logits in another reply to point out that there is data specific to the generation process that is not available from the tokens, but there's also the network activations adding up to that state.

Processing tokens is a bit like ticks in a CPU, where the model weights are the program code, and tokens are both input and output. The computation that occurs logically retains concepts and plans over multiple token generation steps.

That it is fully deterministic is no more interesting than saying a variable in a single threaded program is not state because you can recompute its value by replaying the program with the same inputs. It seems to me that this uninteresting distinction is the GP's issue.

zlacker