They can't possibly know that. What they know is that their guesses are very significantly better than the previous best, and that they could make them for the widest range of proteins in history. Now, verifying the guess for a single protein (of the hundreds of millions in the db) is an expensive project that can take up to two years. Inevitably some will show discrepancies. These will be fed back into training, giving us a new generation of even better guesses at some point in the future. That's what I believe to be standard operating practice.
A more important question is: is today's db good enough to be a breakthrough for something useful, e.g. pharma or agriculture? I have no intuition here, but the reporting claims it will be.