zlacker

[return to "Chess-GPT's Internal World Model"]
1. sinuhe+4Z[view] [source] 2024-01-07 00:58:10
>>homarp+(OP)
"World model" might be too big a word here. When we talk of a world model (in the context of AI models), we refer to the model's understanding of the world, at least in the context it was trained in. But what I see here is just a visualization of the output, laid out like a chess board. Stronger evidence would be, for example, a map of the next-move probabilities, which would show whether the model truly understood the game's rules. If it puts probability larger than zero on illegal squares, that would show us why it sometimes makes illegal moves, and that, obviously, it didn't fully understand the rules of the game.
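Concretely, something like this sketch (using python-chess; next_move_probs is a made-up stand-in for whatever distribution the model actually outputs):

    # Sum the probability mass the model puts on illegal moves in a position.
    import chess

    def illegal_mass(board: chess.Board, next_move_probs: dict[str, float]) -> float:
        legal = {m.uci() for m in board.legal_moves}
        return sum(p for uci, p in next_move_probs.items() if uci not in legal)

    board = chess.Board()  # starting position
    probs = {"e2e4": 0.55, "d2d4": 0.40, "a1a5": 0.05}  # a1a5 is illegal here
    print(illegal_mass(board, probs))  # 0.05 of the mass sits on an illegal rook move

If that number is consistently above zero, you have your explanation for the illegal moves.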
◧◩
2. mitthr+n61[view] [source] 2024-01-07 02:09:45
>>sinuhe+4Z
> probability larger than zero

Strictly speaking, it would be a mistake to assign a probability of exactly zero to any move, even an illegal one, and especially for an AI that learns by example and self-play. It never gets taught the rules, it only gets shown games -- there's no reason it should conclude that the probability of a rook moving diagonally is exactly zero just because it has never seen that happen in the data, and gets penalized in training every time it tries it.
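This also falls straight out of how these models are built: a softmax head can never emit an exact zero, only arbitrarily small positive probabilities. A toy illustration:

    import math

    def softmax(logits: list[float]) -> list[float]:
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

    # Even a hugely penalized move keeps a strictly positive probability.
    print(softmax([10.0, 5.0, -30.0]))  # last entry is ~4e-18, not 0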

But even for a human, assigning a probability of exactly zero is too strong. It would forbid any possibility that you misunderstand some rule or have forgotten some special case. It's a good idea to always maintain at least a small amount of epistemic humility about being mistaken about the rules, so that sufficiently overwhelming evidence could convince you that a move you thought was illegal is actually legal.
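This is sometimes called Cromwell's rule: a prior of exactly zero can never be updated, no matter how strong the evidence. A two-line Bayes update makes the point (the likelihoods here are invented numbers, purely for illustration):

    def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
        # Posterior probability of hypothesis h after one piece of evidence.
        num = p_evidence_given_h * prior
        return num / (num + p_evidence_given_not_h * (1 - prior))

    # "A move I was sure was illegal gets played and the arbiter allows it":
    print(bayes_update(0.0, 0.99, 0.01))   # stays 0.0 forever
    print(bayes_update(1e-6, 0.99, 0.01))  # ~1e-4, and climbing with each repeat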

◧◩◪
3. goatlo+Av1[view] [source] 2024-01-07 07:21:29
>>mitthr+n61
The rules of chess are few and well known. For example, rooks can't move diagonally, no matter the situation. There's no need for epistemic humility here.
◧◩◪◨
4. ben_w+uF1[view] [source] 2024-01-07 09:42:53
>>goatlo+Av1
Every so often, I encounter someone saying that about some topic while also being wrong.

Also, it took me actually writing a chess game to learn about en passant capture, the draw after fifty moves without a capture or pawn move, and the threefold-repetition draw.
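For what it's worth, all three rules are queryable in python-chess, which would have saved me the trouble:

    import chess

    board = chess.Board()
    for uci in ["e2e4", "a7a6", "e4e5", "d7d5"]:  # black's d5 enables en passant
        board.push_uci(uci)
    print(board.is_en_passant(chess.Move.from_uci("e5d6")))  # True
    print(board.can_claim_fifty_moves())           # the fifty-move draw
    print(board.can_claim_threefold_repetition())  # the repetition draw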

◧◩◪◨⬒
5. goatlo+zT3[view] [source] 2024-01-08 04:55:16
>>ben_w+uF1
But the topic is chess, which does have a small number of fixed rules. You not knowing about en passant or threefold repetition just means you never bothered to read all the rules. At some point, an LLM will learn the complete rule set.
◧◩◪◨⬒⬓
6. ben_w+hq4[view] [source] 2024-01-08 11:08:14
>>goatlo+zT3
> At some point, an LLM will learn the complete rule set.

Even if it does, it doesn't know that it has. And in principle, you can't know for sure if you have or not either. It's just a question of what odds you put on having learned a simplified version for all this time without having realised that yet. Or, if you're a professional chess player, the chance that right now you're dreaming and you're about to wake up and realise you dreamed about forgetting the 𐀀𐀁𐀂𐀃:𐀄𐀅𐀆𐀇𐀈𐀉 move that everyone knows (and you should've noticed because the text was all funny and you couldn't read it, which is a well-known sign of dreaming).

That many people (including me) act like things can be known 100% is evidence that humans quantise their certainty. My gut feeling is that anything over 95% likely gets treated as certain, but this isn't something I've formally studied, and I'd assume that presentation matters to this number, because nobody's[0] going to say that a d20 "never rolls a 1". But certainty isn't the same as knowledge, it's just belief[1].
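Rounding "95% likely" up to "certain" also goes wrong faster than intuition suggests, because the roundings compound:

    # Fourteen independent beliefs, each 95% likely and each treated as certain:
    print(0.95 ** 14)       # ~0.49 -- jointly no better than a coin flip
    print((19 / 20) ** 14)  # same number: a d20 avoiding a 1 for fourteen rolls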

[0] I only noticed at the last moment that this itself is an absolute, so I'm going to add this footnote saying "almost nobody".

[1] That said, I'm not sure what "knowledge" even is: we were taught the tripartite definition of "justified true belief", but the teacher showed us its flaws as soon as it was introduced. So I now regard "knowledge" as just the subjective experience of feeling like you have a justified true belief, where all you really have backing up that feeling is a justified belief with no way to know if it's true. This obviously annoys a lot of people who want truth to be something we can actually access.
