Chess-GPT's Internal World Model

>>homarp+(OP)
If you take a neural network that already knows the basic rules of chess and train it on chess games, you produce a chess engine.

From the Wikipedia page on one of the strongest ever[1]: "Like Leela Zero and AlphaGo Zero, Leela Chess Zero starts with no intrinsic chess-specific knowledge other than the basic rules of the game. Leela Chess Zero then learns how to play chess by reinforcement learning from repeated self-play"

[1]: https://en.wikipedia.org/wiki/Leela_Chess_Zero

>>wavemo+tm1
As described in the OP's blog post https://adamkarvonen.github.io/machine_learning/2024/01/03/c... - one of the incredible things here is that the standard GPT architecture, trained from scratch from PGN strings alone, can intuit the rules of the game from those examples, without any notion of the rules of chess or even that it is playing a game.

Leela, by contrast, requires a specialized structure of iterative tree searching to generate move recommendations: https://lczero.org/dev/wiki/technical-explanation-of-leela-c...

Which is not to diminish the work of the Leela team at all! But I find it fascinating that an unmodified GPT architecture can build up internal neural representations that correspond closely to board states, despite not having been designed for that task. As they say, attention may indeed be all you need.

>>btown+Np1
I think that “intuit the rules” is just projecting.

More likely, the 16 million games just has most of the piece move combinations. It does not know a knight moves in an L. It knows from each square where a knight can move based on 16 million games.

>>wredue+Q32
You assert something that is a hypothesis for further research in the area. Alternative is that it in fact knows that knights move in an L-shaped fashion. The article is about testing hypotheses like that, except this particular one seems quite hard.

>>baq+J52
It'd seem surprising to me if it had really learnt the generalization that knights move in an L-shaped fashion, especially since it's model of the board position seems to be more probabilistic than exact. We don't even know if it's representation of the board is spatial or not (e.g. that columns a & b are adjacent, or that rows 1 & 3 are two rows apart).

We also don't know what internal representations of the state of play it's using other than what the author has discovered via probes... Maybe it has other representations effectively representing where pieces are (or what they may do next) other than just the board position.

I'm guessing that it's just using all of it's learned representations to recognize patterns where, for example, Nf3 and Nh3 are both statistically likely, and has no spatial understanding of the relationship of these moves.

I guess one way to explore this would be to generate a controlled training set where each knight only ever makes a different subset of it's legal (up to) 8 moves depending on which square it is on. Will the model learn a generalization that all L-shaped moves are possible from any square, or will it memorize the different subset of moves that "are possible" from each individual square?

zlacker