To really assess whether the interview ratings are effective, they'd also need to hire a random, unbiased sample of candidates who fail the interview process. There are alternative ways of slicing the data to get some insight, such as looking only at those who barely passed the interview, or only at the bottom 10% of performers. But when you're working with such a heavily biased sample (only the small-ish percentage of people actually hired), it's hard to say what the correlation would be across the entire applicant population.
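To make the range-restriction problem concrete, here's a minimal simulation (my own sketch, not based on any real hiring data) assuming a true score/performance correlation of 0.5 and a hiring bar at the 80th percentile of interview scores:

    import numpy as np

    rng = np.random.default_rng(0)
    n, true_r = 100_000, 0.5

    # Simulate interview score and job performance with a known correlation.
    score = rng.standard_normal(n)
    performance = true_r * score + np.sqrt(1 - true_r**2) * rng.standard_normal(n)

    # Hire only the top ~20% of interview scores, as a company actually would.
    bar = np.quantile(score, 0.80)
    hired = score >= bar

    full_r = np.corrcoef(score, performance)[0, 1]
    hired_r = np.corrcoef(score[hired], performance[hired])[0, 1]
    print(f"correlation, all applicants: {full_r:.2f}")   # ~0.50
    print(f"correlation, hired only:     {hired_r:.2f}")  # ~0.26, badly attenuated

The signal is genuinely there, but measured only over the hired subsample it looks roughly half as strong.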
At the risk of repeating myself, we don't particularly care about the predictive power of the scores across the whole range, only about their predictive power among candidates who are neither obvious no-hires nor obvious hires. That's the range where interview scores matter most as a decision-making tool.
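Continuing the sketch above (reusing the same score and performance arrays), the attenuation is even stronger if you restrict to a hypothetical "maybe" band, say the 60th-90th percentile of interview scores:

    # Reuses score/performance from the previous snippet. The 60th-90th
    # percentile band is an arbitrary stand-in for "neither an obvious
    # no-hire nor an obvious hire".
    lo, hi = np.quantile(score, [0.60, 0.90])
    maybe = (score >= lo) & (score <= hi)
    maybe_r = np.corrcoef(score[maybe], performance[maybe])[0, 1]
    print(f"correlation, 'maybe' band only: {maybe_r:.2f}")  # ~0.16

So even a genuinely predictive signal looks weak when you only ever observe it over a narrow slice of its range.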
Also, if two metrics disagree, it's not clear which one is problematic. It's possible that a poor correlation indicates a problem with the performance rating system rather than with the interview scores.
You haven't; Google's interviews are correlated with job performance. They have data on it internally, which people who work there can look at. What you probably read was that brain teasers like "why are manhole covers round?" don't correlate with job performance.