https://www.youtube.com/watch?v=3wMKoSRbGVs&pp=ygUabGV4IGZya...
He said they learned after ~5 years that their original estimate (~1M) was an order of magnitude off -- it's more like ~10M things.
Is there any literature about this? Did they publish?
To me, the obvious questions are:
- how do they know it's not 100M things?
- how do they know it's even bounded? Why isn't there a combinatorial explosion?
I mean, I guess they were evaluating the system all along -- you don't go for 38 years without having some clear metrics. But I have some problems with the logic; I'd be interested in links to references / criticism.
I'd also be interested in any arguments for and against ~10M. Naively speaking, the argument seems a bit flawed to me.
FWIW, I heard of Cyc back in the '90s, but I had no idea it was still alive. It's impressive that he kept it going for so long.
---
Actually, the Wikipedia article is pretty good:
https://en.wikipedia.org/wiki/Cyc#Criticisms
Though I'm still interested in the ~1M or ~10M claim. It seems like a strong claim to hold onto for decades, unless they had really strong metrics backing it up.
> how do they know it's even bounded? Why isn't there a combinatorial explosion?
I don't know -- I'm in the middle of watching the interview too, but he's already moved on from that topic. I'd guess the 10M vs 1M (or 100M) estimate comes from the curve of total "assertions" vs. time leveling off towards some asymptotic limit.
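Purely as an illustration of how that kind of asymptote estimate could be made (the numbers below are made up, not Cyc's actual counts), you could fit a saturating growth curve to the cumulative assertion count and read off its limit:

    import numpy as np
    from scipy.optimize import curve_fit

    # Hypothetical cumulative assertion counts (in millions) over project years.
    years = np.array([1, 5, 10, 15, 20, 25, 30, 35], dtype=float)
    assertions = np.array([0.3, 1.5, 3.5, 5.5, 7.0, 8.2, 9.0, 9.5])  # made-up data

    # Saturating growth: approaches the asymptote A as t -> infinity.
    def saturating(t, A, tau):
        return A * (1.0 - np.exp(-t / tau))

    (A, tau), _ = curve_fit(saturating, years, assertions, p0=(10.0, 10.0))
    print(f"estimated asymptote: ~{A:.1f}M assertions")

Whether Cyc's actual curve looks anything like that, I have no idea -- this is just the shape of argument I'd expect behind the claim.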
I suppose the reason there's no combinatorial explosion is that they're entering these assertions in the most general form possible, so adding new objects doesn't necessarily mean adding new assertions, since they may already be covered by the superclasses the objects belong to (e.g. few assertions are specific to apples, since most will apply to all fruit).
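To make that concrete, here's a toy sketch (hypothetical classes and facts, nothing like Cyc's actual representation) of how attaching assertions to the most general class keeps the count from exploding as you add objects:

    # Toy ontology: each assertion lives on the most general class it holds for.
    # (Hypothetical example; Cyc's real representation is far richer.)
    ontology = {
        "Apple": "Fruit",
        "Banana": "Fruit",
        "Fruit": "Food",
        "Food": "PhysicalObject",
        "PhysicalObject": None,
    }

    assertions = {
        "PhysicalObject": ["has mass", "occupies space"],
        "Food": ["can be eaten", "spoils over time"],
        "Fruit": ["grows on plants", "contains seeds"],
        "Apple": ["stems come off easily"],  # only truly apple-specific facts live here
    }

    def assertions_for(concept):
        # Collect assertions inherited up the superclass chain.
        facts = []
        while concept is not None:
            facts.extend(assertions.get(concept, []))
            concept = ontology.get(concept)
        return facts

    print(assertions_for("Apple"))
    # Adding "Banana" costs zero new assertions: everything is inherited from Fruit upward.
    print(assertions_for("Banana"))

So the assertion count would grow with the number of genuinely new generalizations, not with the number of objects you can think of.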