zlacker

It's not self-play. It's literally just reading sequences of moves. And it doesn't even know that they're moves, or that it's supposed to be learning a game. It's just learning to predict the next token given a sequence of previous tokens.

What's kind of amazing is that, in doing so, it actually learns to play chess! That is, the model weights naturally organize into something resembling an understanding of chess, just by trying to minimize error on next-token prediction.

It makes sense, but it's still kind of astonishing that it actually works.