I'm old enough to have lived through the hope and ultimate failure of Lenat's baby, CYC. The CYC project was initiated in 1984, in the heyday of expert systems, which had been successful in many domains. The idea of an expert system was to capture the knowledge and reasoning power of a subject matter expert in a system of declarative logic and rules.
CYC was going to be the ultimate expert system that captured human common sense knowledge about the world via a MASSIVE knowledge/rule set (initially estimated as a 1000 man-year project) of how everyday objects behaved. The hope was that through sheer scale and completeness it would be able to reason about the world in the same way as a human who had gained the same knowledge through embodiment and interaction.
The CYC project continued for decades with a massive team of people encoding rules according to its own complex ontology, but ultimately never met its goals. In retrospect it seems the idea was doomed to failure from the beginning, but nonetheless it was an important project that needed to be tried. The problem with any expert system reasoning over a fixed knowledge set is that it's always going to be "brittle" - it may perform well for cases wholly within what it knows about, but then fail when asked to reason about things where common sense knowledge and associated extrapolation of behavior are required; CYC hoped to avoid this via scale, by being so complete that there were no important knowledge gaps.
I have to wonder if LLM-based "AIs" like GPT-4 aren't in some ways very similar to CYC in that they are ultimately also giant expert systems, but with the twist that they learnt their knowledge, rules and representations/reasoning mechanisms from a training set rather than having it laboriously hand entered. The end result is much the same though - an ultimately brittle system whose Achilles' heel is that it is based on a fixed set of knowledge rather than being able to learn from its own mistakes and interact with the domain it is attempting to gain knowledge over. It seems there's a similar hope to CYC's of scaling these LLMs up to the point that they know everything and the brittleness disappears, but I suspect that will ultimately prove a false hope and real AIs will need to learn through experimentation just as we do.
RIP Doug Lenat. A pioneer of the computer age and of artificial intelligence.
My thinking is that the next generation of computing will rely on the human bridging that brittleness gap.
Database query is of course ubiquitous, but not generally thought of as 'AI'.
I think CYC is a great cautionary tale for LLMs in terms of hope vs reality, but I think it's worse than that. I don't think LLMs have knowledge; they just mimic the ways we're used to expressing knowledge.
1. Recognizing that AI was a scale problem.
2. Understanding that common sense was the core problem to solve.
Although you say Cyc couldn't do common sense reasoning, wasn't that actually a major feature they liked to advertise? IIRC a lot of Cyc demos were various forms of common sense reasoning.
I once played around with OpenCyc back when that was a thing. It was interesting because they'd had to solve a lot of problems that smaller, more theoretical systems never did. One of their core features is called microtheories. The idea of a knowledge base is that it's internally consistent, so that formal logic can be performed on it, but real-world knowledge isn't like that. Microtheories let you encode contradictory knowledge about the world in such a way that it can layer on top of the more consistent foundation.
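To make the idea concrete, here's a toy sketch of context-scoped assertions in Python - my own illustration, not Cyc's actual representation or API. Each microtheory is just a named bucket of facts that is internally consistent, and a query is answered relative to one microtheory plus whatever more general contexts it inherits from:

    # Toy illustration of microtheory-style scoping (not Cyc's real API).
    class KnowledgeBase:
        def __init__(self):
            # microtheory name -> set of (subject, predicate, value) triples
            self.mts = {}

        def assert_fact(self, mt, fact):
            self.mts.setdefault(mt, set()).add(fact)

        def ask(self, fact, mt, inherits=()):
            # A query is answered within one microtheory plus any more
            # general contexts it explicitly inherits from.
            visible = set(self.mts.get(mt, set()))
            for parent in inherits:
                visible |= self.mts.get(parent, set())
            return fact in visible

    kb = KnowledgeBase()
    kb.assert_fact("NaivePhysicsMt", ("water", "stateAtRoomTemp", "liquid"))
    kb.assert_fact("ChemistryMt", ("water", "stateAtRoomTemp", "moleculesInMotion"))
    # Contradictory-looking facts coexist; each query picks its context.
    print(kb.ask(("water", "stateAtRoomTemp", "liquid"), mt="NaivePhysicsMt"))  # True
    print(kb.ask(("water", "stateAtRoomTemp", "liquid"), mt="ChemistryMt"))     # False

The real system obviously layers inference on top of this, but the point is that two incompatible descriptions of the world can both live in the KB without making the logic explode.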
A very major and fundamental problem with the Cyc approach was that the core algorithms don't scale well to large sizes. Microtheories were also a way to constrain the computational complexity. LLMs work partly because people found ways to make them scale using GPUs. There's no equivalent for Cyc's predicate logic algorithms.
It's still going! I agree it's become clear that it probably isn't the road to AGI, but it still employs people who are encoding rules and making the inference engine faster, paying the bills mostly with contracts from companies that want someone to make sense of their data warehouses.
Discussed >>37354000 (172 comments)
I always had the impression that Cycorp was sustained by government funding (especially military) -- and that, frankly, it was always premised more on what such software could theoretically do, rather than what it actually did.
The contracts at the time were mostly skunkworks/internal to the client companies, so not usually highly publicized. A couple examples are mentioned on their website: https://cyc.com/
I never got to try it myself, but no doubt it worked fine in those cases where correct inferences could be made based on the knowledge/rules it had! Similarly GPT-4 is extremely impressive when it's not bullshitting!
The brittleness in either case (CYC or LLMs) comes mainly from incomplete knowledge (unknown unknowns), causing an invalid inference which the system has no way to detect and correct. The fix is a closed loop system where incorrect outputs (predictions) are detected - prompting exploration and learning.
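A minimal sketch of what I mean by closed loop, purely illustrative - the model/environment objects and their methods here are hypothetical, not any real framework:

    def closed_loop(model, environment, steps):
        # Each cycle: predict, act, compare the prediction with what actually
        # happened, and learn from any mismatch instead of silently passing
        # a confident but wrong answer downstream.
        for _ in range(steps):
            situation = environment.observe()
            prediction = model.predict(situation)
            outcome = environment.act(prediction)
            if outcome != prediction:
                # Invalid inference detected: trigger exploration/learning.
                model.update(situation, outcome)
        return model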
I don't know if CYC tried to do it, but one potential speed up for a system of that nature might be chunking, which is a strategy that another GOFAI system, SOAR, used successfully. A bit like using memoization (remembering results of work already done) as a way to optimize dynamic programming solutions.
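For what it's worth, the memoization analogy in plain Python - this is just standard functools caching, not SOAR's actual chunking mechanism, which operates over production rules:

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def fib(n):
        # Each subproblem is solved once and its result reused, so the
        # naive exponential recursion collapses to roughly linear time.
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    print(fib(200))  # instant with the cache; effectively never finishes without it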
Cyc has been a commercial project for a long time and is still alive. The more limited Open and Research distributions have been discontinued, though.
For example, I could have a hundred different people ask you where you're from. Ask you in many different ways with many different setups, affects, hints. For most people, there will be consistencies across the answers that correspond to what we might call "fact", or at least "belief". But the LLMs, being fancy autocomplete, will produce things that are only textually plausible, showing much shallower consistency, and more relationship with their prompts.
And that's just in the question and answer space. It becomes even more obvious when we do things that involve real-world objects, physical behavior, etc.
https://www.youtube.com/watch?v=3wMKoSRbGVs
Lenat thought CYC and neural nets could be complementary, with neural nets providing right brain/fast thinking capability, and CYC left brain/slow (analytic/reflective) thinking capability.
It's odd to see Lenat discuss CYC the way he does - as if 40 years on everything was still going well despite it having dropped off the public radar twenty years ago.
There's also a lengthy Lex Fridman interview with Doug Lenat, from just a year ago, here:
https://www.youtube.com/watch?v=3wMKoSRbGVs
It seems as if the "common sense expert system" foundation of CYC (the mostly unstated common knowledge behind all human communication) was basically completed, but what has failed to materialize is any higher-level comprehensive knowledge base and reasoning system (i.e. some form of AGI) built on top of it.
It's not clear from the outside whether anyone working at Cycorp still really believes there is a CYC-based path to AGI, but regardless it seems not to be something that's being funded and worked on, and 40 years on it's probably fair to say it's not going to happen. It seems that Cycorp stays alive by selling the hype and winning contracts to develop domain-specific expert systems, based on the CYC methodology and toolset, that have little real reliance on the "common sense" foundations they are nominally built on top of.