NNs cannot apply a 'concept' across different 'effect' domains, because they have only one effect domain: the training data. They are just models of how the effect shows itself in that data.
This is why they do not have world models: they are not generalising data by building an effect-neutral model of something; theyre just modelling its effects.
Compare having a model of 3D vs. a model of shadows of a fixed number of 3D objects. NNs generalise in the sense that they can still predict for shadows similar to their training set. They cannot predict 3d; and with sufficiently novel objects, fail catastrophically.
https://arxiv.org/abs/2311.00871
https://arxiv.org/abs/2309.13638