zlacker

[parent] [thread] 13 comments
1. wredue+(OP)[view] [source] 2024-01-07 14:22:06
I think that “intuit the rules” is just projecting.

More likely, the 16 million games simply contain most of the piece-move combinations. It does not know that a knight moves in an L; it knows, from 16 million games, where a knight can move from each square.

replies(3): >>baq+T1 >>btown+nb >>edgyqu+vk
2. baq+T1[view] [source] 2024-01-07 14:34:20
>>wredue+(OP)
You're asserting something that is itself a hypothesis for further research in the area. The alternative is that it does in fact know that knights move in an L-shaped fashion. The article is about testing hypotheses like that, except this particular one seems quite hard.
replies(2): >>empath+X9 >>HarHar+oq
3. empath+X9[view] [source] [discussion] 2024-01-07 15:30:36
>>baq+T1
I think one thing the conversation around LLMs has shown is how poorly defined words like "know" are.
4. btown+nb[view] [source] 2024-01-07 15:39:41
>>wredue+(OP)
On a board with a finite number of squares, is this truly different?

The representation of the ruleset may not be optimal in the Kolmogorov-complexity sense - but for an experienced human player who can glance at a board and know what is and isn’t legal, who is to say that their mental representation of the rules is optimized for Kolmogorov complexity either?

5. edgyqu+vk[view] [source] 2024-01-07 16:43:27
>>wredue+(OP)
No, this isn’t likely. Chess has trillions of possible games[1] that could be played, and if all it took were such a small number of games to hit most piece combinations, chess would be solved. It has to have learned some fundamental aspects of the game to achieve the rating stated ITT.

1. https://en.m.wikipedia.org/wiki/Shannon_number#:~:text=After....

replies(1): >>wredue+3r
6. HarHar+oq[view] [source] [discussion] 2024-01-07 17:16:44
>>baq+T1
It'd seem surprising to me if it had really learnt the generalization that knights move in an L-shaped fashion, especially since its model of the board position seems to be more probabilistic than exact. We don't even know whether its representation of the board is spatial or not (e.g. that columns a & b are adjacent, or that rows 1 & 3 are two rows apart).

We also don't know what internal representations of the state of play it's using other than what the author has discovered via probes... Maybe it has other representations effectively representing where pieces are (or what they may do next) other than just the board position.

I'm guessing that it's just using all of its learned representations to recognize patterns where, for example, Nf3 and Nh3 are both statistically likely, and has no spatial understanding of the relationship between these moves.

I guess one way to explore this would be to generate a controlled training set where each knight only ever makes a different subset of its (up to) 8 legal moves depending on which square it is on. Will the model learn the generalization that all L-shaped moves are possible from any square, or will it memorize the different subset of moves that "are possible" from each individual square?
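That controlled training set could be sketched in a few lines of pure Python (a minimal sketch; `knight_moves`, `move_whitelist`, and the `keep` parameter are hypothetical names, not from the article): build a per-square whitelist that keeps only a random subset of each square's knight moves, then filter training games so knights only ever make whitelisted moves, holding out the rest as a generalization test.

```python
import random

FILES = "abcdefgh"
# The 8 L-shaped offsets: +/-1 file with +/-2 ranks, or +/-2 files with +/-1 rank.
OFFSETS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knight_moves(square):
    """All on-board knight destinations from an algebraic square, e.g. 'e4'."""
    f, r = FILES.index(square[0]), int(square[1]) - 1
    return [FILES[f + df] + str(r + dr + 1)
            for df, dr in OFFSETS
            if 0 <= f + df < 8 and 0 <= r + dr < 8]

def move_whitelist(keep=4, seed=0):
    """Keep only a fixed random subset of knight moves from each square.

    Training games would be filtered to whitelisted knight moves; the held-out
    moves then test whether the model learned the L-shape rule or merely
    memorized per-square move lists.
    """
    rng = random.Random(seed)
    wl = {}
    for f in FILES:
        for r in "12345678":
            sq = f + r
            moves = knight_moves(sq)
            wl[sq] = set(rng.sample(moves, min(keep, len(moves))))
    return wl
```

Training on the filtered games and then probing the model's probabilities on the held-out destinations would distinguish the two hypotheses.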

replies(1): >>pama+LZ
7. wredue+3r[view] [source] [discussion] 2024-01-07 17:22:45
>>edgyqu+vk
It doesn’t take consuming all trillions of possible game states to see a majority of the possible ways a piece can move from one square to another.

Maybe I misread something as I only skimmed, but the pretty weak Elo would most definitely suggest a failure of intuiting rules.

replies(1): >>rmorey+0P
8. rmorey+0P[view] [source] [discussion] 2024-01-07 19:52:50
>>wredue+3r
No, a weak Elo just indicates poor play. He also quantifies what percentage of the model's moves are legal - it's ~99%, meaning it must have learned the rules.
replies(1): >>wredue+UZ
9. pama+LZ[view] [source] [discussion] 2024-01-07 21:15:46
>>HarHar+oq
A minor detail here is that the analysis in the blog shows that the linear model built/trained on the activations of an internal layer has a probabilistic representation of the board. Of course the full model is also probabilistic by design, though it probably has a better internal understanding of the state of the board than the linear projection used to visualize/interpret its internals. There is no real meaning to a "spatial" representation beyond the particular connectivity of the graph of board locations, which seems to be well understood by the model, as 98% of its moves are valid - and this holds even when sampling with a probabilistic decoding algorithm that may not always return the model's best move.

A different way to test the internal state of the model would be to score all possible valid and invalid moves at every position and see how the probabilities of these moves change as a function of the player's Elo rating. One would expect invalid moves to always score poorly, independent of Elo, whereas valid moves would score monotonically with how good they are (as assessed by Stockfish), and a strong player's Elo would stretch that monotonic function to separate the best moves from the weakest ones.
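The core of that evaluation could be set up roughly as follows (a minimal sketch; `partition_scores`, `separation`, and `score_move` are hypothetical names - `score_move` stands in for whatever per-move log-probability the model assigns at a given position and Elo conditioning):

```python
def partition_scores(candidate_moves, legal_moves, score_move):
    """Split model scores for one position into legal and illegal buckets.

    `score_move` is a stand-in for the model's log-probability of a move
    at this position (under some fixed Elo conditioning).
    """
    legal = {m: score_move(m) for m in candidate_moves if m in legal_moves}
    illegal = {m: score_move(m) for m in candidate_moves if m not in legal_moves}
    return legal, illegal

def separation(legal, illegal):
    """Gap between the worst-scoring legal move and the best-scoring illegal
    move; the expectation above is that this stays positive at every Elo."""
    return min(legal.values()) - max(illegal.values())
```

Repeating this per position while varying the Elo conditioning, and correlating the legal-move scores against Stockfish evaluations, would test the monotonicity claim.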

replies(1): >>HarHar+Bb1
10. wredue+UZ[view] [source] [discussion] 2024-01-07 21:16:32
>>rmorey+0P
My kids also make 99% legal moves and don’t know much more than how the pieces move.

You’re really wishing a lot more into AI than is actually there.

replies(2): >>Terret+891 >>rmorey+vg1
11. Terret+891[view] [source] [discussion] 2024-01-07 22:23:07
>>wredue+UZ
So if not artificial, what form of intelligence have the kids reached and is it any more or less impressive?

Certainly a lot of folks wish more into kids than is really there.

12. HarHar+Bb1[view] [source] [discussion] 2024-01-07 22:38:40
>>pama+LZ
> There is no real meaning in the word "spatial" representation beyond the particular connectivity of the graph of the locations

I don't think it makes sense to talk of the model (potentially) knowing that knights make L-shaped moves (i.e. 2 squares left or right, plus 1 square up or down, or vice versa) unless it is able to add/subtract row/column numbers to be able to determine the squares it can move to on the basis of this (hypothetical) L-shaped move knowledge.

Being able to do row/column math is essentially what I mean by spatial representation - that it knows the spatial relationships between rows ("1"-"8") and columns ("a"-"h"), such that if it had a knight on e1 it could then use this L-shaped move knowledge to do coordinate math like e1 + (1,2) = f3.

I rather doubt this is the case. I expect the board representation is just a map from square name (not coordinates) to the piece on that square, and that generated moves are likely limited to those it saw the piece in question make from the same square during training - i.e. it's not calculating possible, say, knight destinations based on an L-shaped move generalization, but rather "recalling" a move it had seen during training when (among other things) it had a knight on a given square.

Somewhat useless speculation perhaps, but would seem simple and sufficient, and an easy hypothesis to test.
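Easy indeed - the coordinate math in question is only a couple of lines (a minimal sketch; `add_offset` is a hypothetical name): knowing the L-shape rule plus file/rank arithmetic is enough to derive e1 + (1,2) = f3, with no per-square memorization required.

```python
FILES = "abcdefgh"

def add_offset(square, df, dr):
    """Apply a (file, rank) offset to an algebraic square: e1 + (1,2) -> f3.
    Returns None when the offset lands off the board."""
    f, r = FILES.index(square[0]) + df, int(square[1]) + dr
    return FILES[f] + str(r) if 0 <= f < 8 and 1 <= r <= 8 else None
```

Checking whether the model's internals support anything like this arithmetic - versus a square-to-move lookup - is one way to make the hypothesis testable.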

replies(1): >>baq+SP2
13. rmorey+vg1[view] [source] [discussion] 2024-01-07 23:16:47
>>wredue+UZ
That's entirely my point: both your kids and ChessGPT know the rules, but still don't play very strongly. You say they "don’t know much more than how the pieces move", but that's exactly what the rules are: how the pieces are allowed to move, given the sequence of moves that have come before (i.e. the state of the board). I'm saying ChessGPT is a poor player and didn't learn much high-level play. But it definitely learned the rules!
14. baq+SP2[view] [source] [discussion] 2024-01-08 14:24:17
>>HarHar+Bb1
IMHO this would be a publishable result.