zlacker

1. empath+(OP) 2024-01-07 15:45:52
There's a _lot_ of evidence that LLMs _do_ generalize, though.
replies(2): >>mjburg+x5 >>lossol+ZN
2. mjburg+x5 2024-01-07 16:26:05
>>empath+(OP)
There are many notions of "prediction" and "generalisation" -- the relevant ones here, the ones that actually apply to NNs, are extremely limited. That's the problem with all this deceptive language -- it invites people to think NNs predict in the sense of simulate, and generalise in the sense of "apply across different effect domains".

NNs cannot apply a 'concept' across different 'effect' domains, because they have only one effect domain: the training data. They are just models of how the effect shows itself in that data.

This is why they do not have world models: they are not generalising data by building an effect-neutral model of something; they're just modelling its effects.

Compare having a model of 3D space vs. a model of the shadows cast by a fixed set of 3D objects. NNs generalise in the sense that they can still predict for shadows similar to their training set. They cannot predict 3D; and given sufficiently novel objects, they fail catastrophically.
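
To make the shadow analogy concrete, here's a toy sketch -- entirely my own construction, with the pole/sun-angle setup, the value ranges, and the choice of sklearn's MLPRegressor all picked purely for illustration, nothing from the thread or the linked papers:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)

    def shadow_length(height, sun_angle):
        # Ground truth: shadow of a vertical pole -- a 2D "effect"
        # of the underlying 3D geometry.
        return height / np.tan(sun_angle)

    # Training domain: pole heights near 1, sun angles 0.6..1.2 rad.
    angles = rng.uniform(0.6, 1.2, 2000)
    heights = rng.uniform(0.9, 1.1, 2000)
    X = np.column_stack([heights, angles])
    y = shadow_length(heights, angles)

    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000,
                       random_state=0).fit(X, y)

    # "Shadows similar to the training set": interpolation works.
    print(net.predict([[1.0, 0.9]]), shadow_length(1.0, 0.9))

    # A novel regime (low sun, long shadows): the prediction is badly
    # wrong, because the net modelled the effect in-range, not the
    # 3D scene behind it.
    print(net.predict([[1.0, 0.2]]), shadow_length(1.0, 0.2))

In-range, the net tracks the true shadow length closely; at a sun angle of 0.2 rad the true shadow is several times longer than anything it saw, and the prediction collapses back toward the training range. It modelled the shadows, not the sun.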

3. lossol+ZN 2024-01-07 21:20:19
>>empath+(OP)
There is also a lot of recent evidence that they do not generalize.

https://arxiv.org/abs/2311.00871

https://arxiv.org/abs/2309.13638

https://arxiv.org/abs/2311.09247

https://arxiv.org/abs/2305.18654
