This looks like an arms race between world-model refinement and user cleverness, but it's really a fundamental expressive limitation: the user can always recurse, wrapping the model's last answer inside a new question, while the model can only ever predict the next token.
(There are a lot of contexts in which this distinction doesn't matter, but I would argue that it does matter for a meaningful definition of human-like understanding.)
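To make the asymmetry concrete, here's a minimal sketch. The `predict` function is a hypothetical stand-in for any fixed next-token predictor, not a real API; the point is only that the user controls the nesting depth, while the model's per-call computation never changes:

```python
def predict(prompt: str) -> str:
    """One fixed forward pass: the same bounded computation,
    regardless of how deeply nested the prompt is.
    (Hypothetical stand-in for a real model call.)"""
    return f"<continuation of {len(prompt)} chars of context>"

def recurse(question: str, depth: int) -> str:
    """The user's move: wrap the model's last answer in a new
    question, as many times as they like."""
    answer = predict(question)
    if depth == 0:
        return answer
    follow_up = f'You said: "{answer}". What would you say about that?'
    return recurse(follow_up, depth - 1)

# The user picks the depth; the model never does. Every call to
# `predict` is the same fixed computation, however deep this goes.
print(recurse("What is understanding?", depth=3))
```

The user's side is genuinely recursive (the depth parameter is unbounded), while the model's side is a single function applied over and over; that, in miniature, is the expressive gap.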