I think this is twofold:
1. Advanced intelligence requires the ability to traverse between domain valleys in the cognitive manifold. Whether that happens via temperature or some fancier tunneling technique, the model is going to show higher error (less coherence) while crossing between valleys than naive gradient descent toward a local minimum does.
2. It's hard to "punch up" when evaluating intelligence. When someone is a certain amount smarter than you, distinguishing their plausible bullshit from their deep insights is really, really hard.
You can have vanishingly small error and incoherence at its maximum.
That would be evidence of perfect alignment (zero bias) but very high variance.
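A minimal sketch of that framing, with made-up numbers: treat "error" as bias (how far off the answers are on average) and "incoherence" as variance (how scattered they are). An estimator can be dead-on on average while being wildly incoherent answer to answer.

```python
import numpy as np

rng = np.random.default_rng(0)
truth = 1.0

# Low bias, low variance: answers cluster tightly on the truth.
aligned = truth + rng.normal(0.0, 0.01, size=10_000)

# Zero bias, high variance: right on average, incoherent individually.
scattered = truth + rng.normal(0.0, 5.0, size=10_000)

def bias(x):
    return abs(x.mean() - truth)

def variance(x):
    return x.var()

print(bias(aligned), variance(aligned))      # both small
print(bias(scattered), variance(scattered))  # small bias, large variance
```

Any single sample from the second distribution looks like noise, even though the ensemble is perfectly centered on the truth.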
Insights are “deep” not on their own merit, but because they reveal something profound about reality. Such a revelation is either testable or not. If it’s testable, distinguishing it from bullshit is relatively easy, and if it’s not testable even in principle, a good heuristic is to put it in the bullshit category by default.
Couldn't you have just said "know about a lot of different fields"? Was your comment sarcastic or do you actually talk like that?
Sometimes things that look very different actually are represented with similar vectors in latent space.
When that happens to us it "feels like" intuition: something you can't quite put a finger on, and that might take real work to put into a form that can be transferred to another human who has a different mental model.
The hallmark of intelligence in this scenario is not just being able to make the connections, but being able to pick the right ones.
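The "similar vectors for superficially different things" point can be sketched with cosine similarity. The embeddings below are invented purely for illustration (they are not from any real model): two concepts that look unrelated on the surface sit close together, while a pair that merely shares a word sits far apart.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: 1.0 = same direction, 0.0 = orthogonal."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 4-d embeddings, hand-made for the example.
heat_diffusion = np.array([0.9, 0.1, 0.8, 0.2])  # physics: the heat equation
option_pricing = np.array([0.8, 0.2, 0.9, 0.1])  # finance: same underlying PDE
cooking_heat   = np.array([0.1, 0.9, 0.0, 0.7])  # shares a word, not a structure

print(cosine(heat_diffusion, option_pricing))  # high: near neighbours in latent space
print(cosine(heat_diffusion, cooking_heat))    # low: surface resemblance only
```

Picking which of the many near-neighbour pairs is a genuine connection, rather than a coincidence of the embedding, is exactly the "pick the right ones" problem above.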
Which is why, just occasionally, they're right, but mostly by accident.