zlacker

"How does misalignment scale with model intelligence and task complexity?"
1. Curiou+g4 2026-02-03 00:56:41
>>salkah+(OP)
This is a good line: "It found that smarter entities are subjectively judged to behave less coherently"

I think this is twofold:

1. Advanced intelligence requires the ability to traverse between domain valleys in the cognitive manifold. Whether that's done via temperature or some fancier tunneling technique, it's going to be higher-error (less coherent) in the valleys of the manifold than naive gradient-following down to a local minimum. (See the sketch after this list for the temperature version.)

2. It's hard to "punch up" when evaluating intelligence. When someone is a certain amount smarter than you, distinguishing their plausible bullshit from their deep insights is really, really hard.
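
To make the temperature point concrete, here is a minimal sketch (hypothetical logits, NumPy) of how sampling temperature spreads probability mass onto low-scoring options, i.e. trades per-step coherence for the ability to leave a local basin:

    import numpy as np

    def softmax_with_temperature(logits, T):
        # Higher T flattens the distribution: more mass lands on
        # low-scoring options, so sampling wanders further from the
        # greedy (most locally coherent) choice.
        z = np.asarray(logits, dtype=float) / T
        z -= z.max()  # subtract max for numerical stability
        p = np.exp(z)
        return p / p.sum()

    logits = [4.0, 2.0, 0.5, 0.0]  # made-up token scores
    print(softmax_with_temperature(logits, 0.5))  # peaky: sticks to the top option
    print(softmax_with_temperature(logits, 2.0))  # flat: more "valley-crossing" samples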

2. xander+v6 2026-02-03 01:10:40
>>Curiou+g4
What do 'domain valleys' and 'tunneling' mean in this context?
3. FuckBu+tz 2026-02-03 04:56:36
>>xander+v6
So, the hidden mental model that the OP is expressing, and didn't elucidate, is that LLMs can be thought of as compressing related concepts into approximately orthogonal subspaces of the vector space their weights define, i.e. features stored in superposition. Since training has the effect of compressing knowledge into subspaces, a necessary corollary is that there are now regions within the vector space that contain very little. Those are the valleys that need to be tunneled through, i.e. the model needs to activate disparate regions of its knowledge manifold simultaneously, which seems like it might be difficult to do.

I'm not sure this is a good way of looking at things, though, because inference isn't topology, and I'm not sure that abstract reasoning can be reduced down to finding ways to connect concepts that were learned in isolation.
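
For what it's worth, the "approximately orthogonal subspaces" part has an easy demo: in high dimensions, random unit vectors are nearly orthogonal with overwhelming probability, which is the geometric fact the superposition picture leans on. A minimal sketch, with arbitrary dimension and feature counts standing in for learned directions:

    import numpy as np

    rng = np.random.default_rng(0)
    d, n = 1024, 10_000  # embedding dimension, number of "feature" directions

    # Random unit vectors in R^d stand in for learned feature directions.
    V = rng.standard_normal((n, d))
    V /= np.linalg.norm(V, axis=1, keepdims=True)

    # Cosine similarity over a sample of random pairs (self-pairs dropped).
    i = rng.integers(0, n, 5000)
    j = rng.integers(0, n, 5000)
    mask = i != j
    cos = np.abs(np.einsum('ij,ij->i', V[i[mask]], V[j[mask]]))

    # Typical |cos| is on the order of 1/sqrt(d) ~= 0.03: far more
    # nearly-orthogonal directions fit than there are dimensions, so
    # "concepts" can share the space with little interference.
    print(cos.mean(), cos.max())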