zlacker

It would be cool if we could find the algorithmic neurological basis for this, the analogy with LLMs being more obvious multi-layer brain circuits, the neurological analogy with symbolic reasoning must exist too.

My hunch is it emerges naturally out of the hierarchical generalization capabilities of multiple layer circuits. But then you need something to coordinate the acquired labels: a tweak on attention perhaps?

Another characteristic is probably some (limited) form of recursion, so the generalized labels emitted at the small end can be fed back in as tokens to be further processed at the big end.