zlacker

[return to "Chess-GPT's Internal World Model"]
1. wavemo+tm1[view] [source] 2024-01-07 05:16:56
>>homarp+(OP)
If you take a neural network that already knows the basic rules of chess and train it on chess games, you produce a chess engine.

From the Wikipedia page on one of the strongest ever[1]: "Like Leela Zero and AlphaGo Zero, Leela Chess Zero starts with no intrinsic chess-specific knowledge other than the basic rules of the game. Leela Chess Zero then learns how to play chess by reinforcement learning from repeated self-play"

[1]: https://en.wikipedia.org/wiki/Leela_Chess_Zero
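For the unfamiliar, "reinforcement learning from repeated self-play" boils down to a simple loop: play games against yourself, then nudge the network toward the moves that led to wins. A toy sketch of that loop, using the python-chess library for move legality and a random stand-in where Leela has its network plus search:

    # Toy AlphaZero-style self-play loop (illustrative only; the real Leela
    # pipeline is distributed and vastly more sophisticated).
    import random
    import chess

    def pick_move(board):
        # Stand-in for "policy network + tree search": a random legal move.
        return random.choice(list(board.legal_moves))

    def self_play_game():
        board, positions = chess.Board(), []
        while not board.is_game_over():
            move = pick_move(board)
            positions.append((board.fen(), move.uci()))
            board.push(move)
        return positions, board.result()  # "1-0", "0-1", or "1/2-1/2"

    # In real training, (position, move, outcome) tuples update the network
    # so it assigns higher probability to moves that led to wins.
    history, outcome = self_play_game()
    print(len(history), "plies, result:", outcome)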

◧◩
2. btown+Np1[view] [source] 2024-01-07 05:59:27
>>wavemo+tm1
As described in the OP's blog post https://adamkarvonen.github.io/machine_learning/2024/01/03/c... - one of the incredible things here is that the standard GPT architecture, trained from scratch on PGN strings alone, can infer the rules of the game purely from those examples, without being given the rules of chess or even any notion that it is playing a game.
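To make "trained on PGN strings" concrete: the training data is just flat move transcripts, and the model's only objective is next-character prediction. A rough sketch of the preprocessing (the exact delimiters and tokenization in the blog's setup may differ):

    # Character-level dataset from PGN-style move strings. Nothing here
    # encodes the rules of chess; the model sees only a stream of characters.
    games = [
        "1.e4 e5 2.Nf3 Nc6 3.Bb5 a6 4.Ba4 Nf6 5.O-O Be7",
        "1.d4 d5 2.c4 e6 3.Nc3 Nf6 4.Bg5 Be7 5.e3 O-O",
    ]
    text = "\n".join(games)

    vocab = sorted(set(text))                    # a few dozen characters suffice
    stoi = {ch: i for i, ch in enumerate(vocab)}
    tokens = [stoi[ch] for ch in text]           # token ids fed to the GPT

    # Training objective: at each position t, predict tokens[t+1] from
    # tokens[:t+1]. Board state, legality, check, etc. are never supplied.
    print(len(vocab), "character vocabulary,", len(tokens), "tokens")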

Leela, by contrast, relies on a specialized iterative tree search on top of its network to generate move recommendations: https://lczero.org/dev/wiki/technical-explanation-of-leela-c...
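The heart of that search is the PUCT rule: repeatedly descend the tree, at each node picking the child that maximizes Q + U, where Q is the average value seen so far and U is an exploration bonus scaled by the network's prior. A bare-bones sketch of just the selection step (constants and details differ in the real engine):

    # PUCT child selection, the core of AlphaZero/Leela-style tree search.
    import math

    def select_child(children, c_puct=1.5):
        """children: list of dicts with prior P, visit count N, mean value Q."""
        total_visits = sum(ch["N"] for ch in children)
        def score(ch):
            u = c_puct * ch["P"] * math.sqrt(total_visits) / (1 + ch["N"])
            return ch["Q"] + u  # exploit good moves, explore high-prior ones
        return max(children, key=score)

    # A high-prior unvisited move can outrank a well-explored mediocre one:
    children = [
        {"P": 0.6, "N": 0,  "Q": 0.0},   # network loves it, never searched
        {"P": 0.2, "N": 50, "Q": 0.1},   # searched a lot, slightly positive
    ]
    print(select_child(children))        # picks the first child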

Which is not to diminish the work of the Leela team at all! But I find it fascinating that an unmodified GPT architecture can build up internal neural representations that correspond closely to board states, despite not having been designed for that task. As they say, attention may indeed be all you need.
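For anyone wondering how "correspond closely to board states" is actually measured: the post trains linear probes, i.e. simple linear classifiers that try to read the contents of each square straight out of the model's activations. A toy version with random stand-in data (on real Chess-GPT activations the post reports high probe accuracy; here you'd only get chance):

    # Linear probe: can a purely linear map from the hidden state recover
    # what piece sits on a given square? Random stand-in data, shapes only.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n_positions, d_model = 5000, 512
    acts = rng.normal(size=(n_positions, d_model))   # residual-stream stand-in
    labels = rng.integers(0, 13, size=n_positions)   # 12 piece types + empty

    probe = LogisticRegression(max_iter=1000).fit(acts[:4000], labels[:4000])
    print("held-out probe accuracy:", probe.score(acts[4000:], labels[4000:]))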

◧◩◪
3. goatlo+dv1[view] [source] 2024-01-07 07:18:09
>>btown+Np1
What's the strength of play for the GPT architecture? It's impressive that it figures out the rules, but does it play strong chess?

> As they say, attention may indeed be all you need.

I don't think drawing general conclusions about intelligence from a board game is warranted. We didn't evolve to play chess or Go.

◧◩◪◨
4. eigenk+ZD1[view] [source] 2024-01-07 09:20:00
>>goatlo+dv1
> What's the strength of play for the GPT architecture?

Pretty shit for a computer. He says his 50M-parameter model reached 1800 Elo (by the way, it's Elo, not ELO as the article incorrectly has it; the rating is named after a Hungarian-born physicist, Arpad Elo). From the bar graph it looks a bit better than Stockfish level 1 and a bit worse than Stockfish level 2.
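For a sense of scale, Elo gaps translate to expected scores via E = 1 / (1 + 10^((Rb - Ra) / 400)), so a 400-point gap is roughly a 10:1 expected score. Full-strength Stockfish is commonly estimated well above 3000 Elo, which makes the gulf enormous:

    # Elo expected score for player A (rating ra) against player B (rating rb).
    def expected_score(ra, rb):
        return 1 / (1 + 10 ** ((rb - ra) / 400))

    print(expected_score(1800, 2000))  # ~0.24: clear underdog
    print(expected_score(1800, 3400))  # ~1e-4: essentially hopeless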

Based on what we know, I think it's not surprising that these models can learn to play chess, but they get absolutely smoked by a "real" chess engine like Stockfish or Leela.

◧◩◪◨⬒
5. jhrmnn+FH1[view] [source] 2024-01-07 10:15:00
>>eigenk+ZD1
They also say “I left one training for a few more days and it reached 1500 ELO.” I find it quite likely that the observed performance is largely limited by the compute spent on training.
[go to top]