[0] https://writings.stephenwolfram.com/2023/09/remembering-doug...
But they have never truly exploited logic-based inference, except for some small academic efforts.
https://github.com/orgs/stardog-union/
Looks like "knowledge graph" and "semantic reasoner" are the search terms du jour; I haven't tracked these things since OpenCyc stopped being active.
Humans may not be able to effectively trudge through the creation of trillions of little rules and facts needed for an explicit and coherent expert world model, but LLMs definitely can be used for this.
> Perhaps their time will come again.
That seems quite likely, once the hype about LLMs has calmed down. I hope that Cyc's data will still be available then, ideally open-source.
> https://muse.jhu.edu/pub/87/article/853382/pdf
Unfortunately paywalled; does anyone have a downloadable copy?
I was one of the first hires on the Cyc project when it started at MCC and was at first responsible for the decision to abandon the Interlisp-D implementation and replace it with one I wrote on Symbolics machines.
Yes, back then one person could write the code base, which has long since grown and been ported off those machines. The KB is what matters anyway. I built it so different people could work on the KB simultaneously, which was unusual in those days, even though cloud computing was ubiquitous at PARC (where Doug had been working, and I had too).
Neurosymbolic approaches are pretty important and there’s good work going on in that area. I was back in that field myself until I got dragged away to work on the climate. But I’m not sure that manually curated KBs will make much of a difference beyond bootstrapping.
Of course the underlying storage can be (and often is) a bunch of specially prepared relational tables.
But the strength of graph databases comes from restating the problem in a different way, with query languages targeting the specific problem space.
Similarly there are tasks where SQL will be plainly better.
The SQL standard now includes syntactic sugar for 'Property Graph Query'. Implementations are still in the works AIUI, but can be expected in the reasonably near future.
And for efficient implementation the database underneath still needs extended graph support. (In fact, I find it hilarious that Oracle seems to be spearheading it, as they previously cancelled their graph support around 2012, enough that I wrote about how it was deprecated and removed from support in my thesis in 2014.)
Is there a typo in that question? It does not parse as a sentence in any way for me, and if that's part of the point, I don't understand how. How would a toddler answer this question (except by looking confused, like an adult would, or maybe by making up some nonsense to go with it)?
Happy to be told how I missed the obvious!
(EDIT: By the way, I don't know why you got downvoted, I certainly didn't.)
My bet, judging mostly from my failed attempts at playing with OpenCyc around 2009, is that Cyc has always been too closed and too complex to tinker with. That doesn't play nicely with academic work. When people finish their PhDs and start working for OpenAI, they simply don't have Cyc in their toolbox.
[1] https://www.sciencedirect.com/science/article/pii/S089360802...
My approach, Cyc's, and others are fundamentally flawed for the same reason. There's a low-level reason why deep nets work and symbolic engines perform so badly.
The frames, slots, and values that were integrated were learned via an RNN for specific applications.
We even created a library for it called keyframe (the programmer specifies the bot's action states and the model figures out the dialog in a structured way), similar to how keyframes in animation work.
It would be interesting to resurrect that in the age of LLMs!
Now it's clear that knowledge graphs are far inferior to deep neural nets, but even still few people can explain the _root_ reason why.
I don't think Lenat's bet was a waste. I think it was sensible based on the information at the time.
The decision to research it largely in secret, closed source, I think was a mistake.
No. It depends. In general, two technologies can’t be assessed independently of the application.
_The_ (one) root reason? Ok, I’ll bite.
But you need to define your claim. What application?
[1] https://voidfarer.livejournal.com/623.html
You can label it "bad idea" but you can't bring LLMs back in time.
In fact, if you have a graph and a path-weighting model (RNN, TDCNN, or Transformer), you can use beam search to evaluate paths through the graph.
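A minimal sketch of that idea, assuming the path-weighting model is wrapped as a generic scoring function (all names here are made up, not from any particular library):

```python
import heapq

def beam_search_paths(graph, score_path, start, max_depth, beam_width=5):
    """Keep the `beam_width` highest-scoring partial paths at each step.

    graph:      dict mapping a node to its neighbor nodes
    score_path: any model (RNN, TDCNN, Transformer, ...) exposed as a
                function that maps a path (list of nodes) to a score
    """
    beam = [(score_path([start]), [start])]
    for _ in range(max_depth):
        candidates = []
        for _, path in beam:
            for nxt in graph.get(path[-1], []):
                new_path = path + [nxt]
                candidates.append((score_path(new_path), new_path))
        if not candidates:
            break
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return beam

# Toy usage with a stand-in scorer (shorter paths score higher here):
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(beam_search_paths(toy_graph, lambda p: -len(p), "a", max_depth=3))
```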
I wonder if they've adopted ML yet.
Bonus points if that is combined with modern differentiable methods and SAT/SMT, i.e. neurosymbolic AI.
The hype of LLMs is not the reason the likes of Cyc have been abandoned.
There isn't any class of problems deep nets can't handle. Will they always be the most efficient or best-performing solution? No, but a deep-net solution will be possible.
"I wonder what is the closest thing to Cyc we have in the open source realm right now?".
See:
https://github.com/therohk/opencyc-kb
https://github.com/bovlb/opencyc
https://github.com/asanchez75/opencyc
Outside of that, you have the entire world of Semantic Web projects, especially things like UMBEL[1], SUMO[2], YAMATO[3], and other "upper ontologies"[4] etc.
[1]: https://en.wikipedia.org/wiki/UMBEL
[2]: https://en.wikipedia.org/wiki/Suggested_Upper_Merged_Ontolog...
I think the issue in this area is mostly to convince and sell to bureaucratic institutions.
Stardog is not FOSS; as I understand it, that GitHub org hosts various utilities around their proprietary package, and the actual engine code is not open source.
It's pretty interesting to see comments like this, as if deep nets weren't the underdog for decades. You think they were the first choice? The creator of Cyc spent decades on it, and he's dead. We use modern NNs today because they just work that much better.
GOFAI was abandoned in NLP long before the likes of GPT because non-deep-net alternatives just sucked that much. It has nothing to do with any recent LLM hype.
If the problem space lacks clear definitions and unambiguous axioms, then non-deep-net alternatives fall apart.
And we don't call it hallucinations, but GOFAI mispredicts plenty.
However I think they have a good excuse for 'Why didn't it ever have the impact that LLMs are having now?': lack of data and lack of compute.
And it's the same excuse that neural networks themselves have: back in those days, we just didn't have enough data, and we didn't have enough compute, even if we had the data.
(Of course, we learned in the meantime that neural networks benefit a lot from extra data and extra compute. Whether that can be brought to bear on Cyc-style symbolic approaches is another question.)
Cyc also has the equivalent of hallucinations, when their definitions don't cleanly apply to the real world.
I'm not sure deep nets are the key here. I see the key as being lots of data and statistical modeling, instead of trying to fit what's happening into nice and clean black-and-white categories.
Btw, I don't even think GOFAI is all that good at domains with clear definitions and unambiguous axioms: it took neural nets to beat the best people at the very clearly defined game of Go. And neural net approaches have also soundly beaten the best traditional chess engines. (Traditional chess engines have caught up a lot since then. Competition is good for development, of course.)
I suspect part of the problem for GOFAI is that all the techniques that work are re-labelled to be just 'normal algorithms', like A* or dynamic programming etc, and no longer bear the (Gof)AI label.
(Tangent: that's very similar to philosophy. Where every time we turn anything into a proper science, we relabel it from 'natural philosophy' to something like 'physics'. John von Neumann was one of these recent geniuses who liberated large swaths of knowledge from the dark kludges of the philosophy ghetto.)
I am really pleased they continue to work on this. It is a lot of work, but it needs to be done and checked manually; once done, the base stuff shouldn't change much, and it will be a great common-sense check for generated content.
> Tangent: that's very similar to philosophy.
This doesn't click with me. Could you elaborate a bit, or provide an example, please?
Have some vector for a concept match a KB entry etc, IDK :).
I don't want to rob you of your literary freedom, but that threw me off. You meant mainframes, yes?
Was trying to find it the other day and AI searches suggested Cyc; I feel like that's not it, but maybe it was? (It definitely wasn't Everything2.)
Cyc was able to produce an impact; I keep pointing to MathCraft [1], which, as of 2017, did not have a rival in neural AI.
[1] https://www.width.ai/post/what-is-beam-search
It is possible even for a 3-gram model to output better text predictions if you combine it with beam search.
FYI: here are the release notes of the recently released Allegro CL 11.0: https://franz.com/support/documentation/current/release-note...
IIRC, Cyc gets delivered on other platforms and languages (C, JVM, ... ?). Would be interesting to know what they use for deployment/delivery.
So looks like Cyc did have to fall back on a neural net after all (Lenat's).
The lead author on [1] is Kathy Panton, who has no publications after that and zero internet presence as far as I can tell.
[1] Common Sense Reasoning – From Cyc to Intelligent Assistant https://iral.cs.umbc.edu/Pubs/FromCycToIntelligentAssistant-...
Reactome is a graph, because that is the domain. But technically it does little with that fact (in my disappointed opinion).
Given that GO and Reactome are also relatively small academic efforts in general...
I'll tell you how I know /
I read it in the paper /
Fifteen years ago -
(John Prine)
I mostly wanted to know of any technical obstacles so SBCL could be improved. If I had to wildly guess, maybe GC performance? SBCL was behind ACL on that many years ago (on both speed and physical memory requirements) the last time I made a comparison.
If that is so, then symbolic AI does not easily scale, because you cannot feed inconsistent information into it. Compare this to how humans and LLMs learn: they both have no problem with inconsistent information. Yet, statistically speaking, humans can easily produce "useful" information.
No, not at all. We're talking early-to-mid 1980s, so people in the research community (at least at the leading institutions) were by then pretty used to what's called cloud computing these days. In fact, the term "cloud" for independent resources you could call upon without knowing the underlying architecture came from the original Internet papers (talking originally about routing, and then the DNS) in the late 70s.
So for example the mail or file or other services at PARC just lived in the network; you did the equivalent of an anycast to check your mail or look for a file. These had standardized APIs, so it didn't matter if you were running Smalltalk, Interlisp-D, or Cedar/Mesa; you just had a local window into a general computing space, just as you do today.
Most was on the LAN, of course, as the ARPANET was pretty slow. But when we switched to TCP/IP the LAN/WAN boundaries became transparent and instead of manually bouncing through different machines I could casually check my mail at MIT from my desk at PARC.
Lisp machines were slightly less flexible in this regard back then, but then again, Ethernet started at PARC. But even in the late 70s it wasn't weird to have part of your computation run on a remote machine you weren't logged into interactively.
The Unix guys at Berkeley eventually caught up with this (just look at the original sockets interface, very un-Unixy) but they didn't quite get it: I always laughed when I saw a Sun machine running sendmail rather than trusting the network to do the right thing on its behalf. By the time Sun was founded that felt paleolithic to me.
Because I didn’t start computing until the late 70s I pretty much missed the whole removable media thing and was pretty much always network connected.
And there were descriptions of Eurisko (with claims that it not only "won some game" but also "invented a new structure of NAND gate in silicon, now used by industry") and other expert systems.
One of the mentioned expert systems (described without technical details) was said to be twice as good at diagnosing cancer as the best human diagnostician at some university hospital.
And after that... Silence.
I always wondered: why was this expert system not deployed in all US hospitals, for example, if it was so good?
Now we have LLMs, but they are LANGUAGE models, not WORLD models. They predict distribution of possible next words. Same with images — pixels, not world concepts.
Looks like such systems are good for generating marketing texts, but cannot be used as diagnosticians by definition.
Why are all these (slice-of) world-model approaches dead? Except Cyc, I think. Why do we have good text generators and image generators but not diagnosticians 40 years later? What happened?
Ultimately it failed, although people's opinions may differ. The company is still around, but from what people who've worked there have said, it seems as if the original goal is all but abandoned (although Lenat might have disagreed, and seemed eternally optimistic, at least in public). It seems they survive on private contracts for custom systems premised on the power of Cyc being brought to bear, when in reality these projects could be accomplished in simpler ways.
I can't help but see somewhat of a parallel between Cyc - an expert system scaling experiment, and today's LLMs - a language model scaling experiment. It seems that at heart LLMs are also rule-based expert systems of sorts, but with the massive convenience factor of learning the rules from data rather than needing to have the rules hand-entered. They both have/had the same promise of "scale it up and it'll achieve AGI", and "add more rules/data and it'll have common sense" and stop being brittle (having dumb failure modes, based on missing knowledge/experience).
While the underlying world model and reasoning power of LLMs might be compared to an expert system like Cyc, they do of course also have the critical ability to input and output language as a way to interface to this underlying capability (as well as perhaps fool us a bit with the ability to regurgitate human-derived surface forms of language). I wonder what Cyc would feel like in terms of intelligence and reasoning power if one somehow added an equally powerful natural language interface to it?
As LLMs continue to evolve, they are not just being scaled up; new functionality such as short-term memory is also being added, so perhaps they go beyond expert systems in that regard, although there is/was also more to Cyc than just the massive knowledge base - a multitude of inference engines as well. Still, I can't help but wonder if the progress of LLMs won't also peter out, unless there are some fairly fundamental changes/additions to their pre-trained transformer basis. Are we just replicating the scaling experiment of Cyc, just with a fancy natural language interface?
The language and image models weren't built by people but by observing an obscene amount of people going about their daily lives producing text and images.
Expert systems were so massively oversold... and it's not at all clear that any of the "super fantastic expert" systems ever did what was claimed of them.
We definitely found out that they were, in practice, extremely difficult to build and make do anything reasonable.
The original paper on Eurisko, for instance, mentioned how the author (and founder of Cyc!) Douglas Lenat, during a run, went ahead and just hand-inserted some knowledge/results of inferences (it's been a long while since I read the paper, sorry), asserting, "Well, it would have figured these things out eventually!"
Later on, he wrote a paper titled, "Why AM and Eurisko appear to work" [0].
0: https://aaai.org/papers/00236-aaai83-059-why-am-and-eurisko-...
Does anybody have any insights into where things stand at Cycorp and any expected fallout from the world losing Doug?
Yes. It's something I've been working on, so there's at least 1 such effort. And I'm reasonably sure there are others. The idea is too obvious for there to not be other people pursuing it.
That's not true
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10425828/
> Why are all these (slice-of) world-model approaches dead?
Because they don't work
Not yet. It's still early days.
> What other approaches exist?
Loosely speaking, I'd say this entire discussion falls into the general rubric of what people are calling "neuro-symbolic AI". Now within that there are a lot of different ways to try and combine different modalities. There are things like DeepProbLog, LogicTensorNetworks, etc.
For anybody who wants to learn more, consider starting with:
https://en.wikipedia.org/wiki/Neuro-symbolic_AI
and the videos from the previous two "Neurosymbolic Summer School" events:
Eventually the approach would be rediscovered (but not recuperated) by the database field, desperate for 'new' research topics.
We might see a revival now that transformers can front- and back-end the hard edges of knowledge-based tech, but it remains to be seen whether scaled monolithic systems like Cyc are the right way to pair.
> Yes, back then one person could write the code base
A coworker of mine who used to work at Symbolics told me that this was endemic with Lisp development back in the day. Some customers would think there was a team of 300 doing the OS software at Symbolics. It was just 10 programmers.
Or for quality checks during training?
FWIW, KGs don't have to be brittle. Or, at least, they don't have to be as brittle as they've historically been. There are approaches (like PROWL[1]) to making graphs probabilistic, so that they assert subjective beliefs about statements instead of absolute statements. The strength of those beliefs can then increase or decrease in response to new evidence (per Bayes' theorem). Probably the biggest problem with this stuff is that it tends to be crazy computationally expensive.
Still, there's always the chance of an algorithmic breakthrough or just hardware improvements bringing some of this stuff into the realm of the practical.
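As a toy sketch of the belief-update idea (this is plain Bayes' theorem applied to a single triple, not the PROWL machinery itself; the numbers and names are invented):

```python
def update_belief(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' theorem: P(statement | evidence) from P(statement) and the
    probability of seeing the evidence when the statement is true vs. false."""
    numerator = likelihood_if_true * prior
    return numerator / (numerator + likelihood_if_false * (1.0 - prior))

# A KG edge carrying a degree of belief instead of an absolute assertion:
belief = {"triple": ("Tweety", "canFly", "True"), "p": 0.9}

# New evidence ("Tweety is a penguin") that is far more likely if the
# statement is false than if it is true weakens the belief:
belief["p"] = update_belief(belief["p"], likelihood_if_true=0.05,
                            likelihood_if_false=0.8)
print(belief)   # p drops from 0.9 to roughly 0.36
```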
Google has its own Knowledge Graph, with billions of daily views, which is a wider but shallower version of Cyc. It is unclear whether LLMs' user-facing impact has surpassed that project.
https://en.wikipedia.org/wiki/Open_Mind_Common_Sense
https://en.wikipedia.org/wiki/Mindpixel
The leaders of both these projects committed suicide.
"Mathematical framework and rules of paraconsistent logic have been proposed as the activation function of an artificial neuron in order to build a neural network"
One of the immediate things I'm working on is a text to knowledge graph system. Yohei (creator of BabyAGI) is also working on text to knowledge graphs: https://twitter.com/yoheinakajima/status/1769019899245158648. LlamaIndex has a basic implementation.
This isn't quite connecting the system to an automated reasoner though. There is some research in this area, like: >>35735375
Cyc + LLMs is vaguely related to more advanced "cognitive architectures" for AI, for instance see the world model in Davidad's architecture, which LLMs can be used to help build: https://www.lesswrong.com/posts/jRf4WENQnhssCb6mJ/davidad-s-...
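For what it's worth, the core of a text-to-knowledge-graph pass can be quite small. Here is a rough sketch, assuming some LLM client is available behind a generic complete() callable; everything named here is hypothetical, not taken from the BabyAGI or LlamaIndex code:

```python
import json

# Hypothetical prompt; any instruction-following LLM should do.
PROMPT = """Extract knowledge-graph triples from the text below.
Return ONLY a JSON list of [subject, predicate, object] triples.

Text: {text}
"""

def text_to_triples(text, complete):
    """`complete` is any text-in/text-out LLM call supplied by the caller."""
    raw = complete(PROMPT.format(text=text))
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return []  # the model didn't return valid JSON; skip this chunk
    return [tuple(t) for t in parsed if isinstance(t, list) and len(t) == 3]

def build_graph(chunks, complete):
    graph = {}  # subject -> list of (predicate, object) edges
    for chunk in chunks:
        for s, p, o in text_to_triples(chunk, complete):
            graph.setdefault(s, []).append((p, o))
    return graph
```

Plugging such a graph into an automated reasoner is, as noted, the harder and more open part.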
I had learned about "AI" in the '80s. The promise was that Lisp and expert systems and Prolog and more would get us there.
The article said Cyc was reading the newspaper every day.
I thought, wow, any day now computers will leap forward. The Japanese 5th-generation computing project will be left in the dust. :)
What is obvious to you is not obvious to others. I recommend explaining and clarifying if you care about persuasion.
You've overstated/exaggerated the claim. A narrower version of the claim is more interesting and more informative. History is almost never as simple as you imply.
This assumes that all classes of problems reduce to functions which can be approximated, right, per the universal approximation theorems?
Even for cases where the UAT applies (which is not everywhere, as I show next), your caveat understates the case. There are dramatically better and worse algorithms for differing problems.
But I think a lot of people (including the comment above) misunderstand or misapply the UATs. Think about the assumptions! The UATs assume a fixed-length input, do they not? This breaks a correspondence with many classes of algorithms.*
## Example
Let's make a DNN that sorts a list of numbers, shall we? But we can't cheat and only have it do pairwise comparisons -- that is not the full sorting problem. We have to input the list of numbers and output the list of sorted numbers. At run-time. With a variable-length list of inputs.
So no single DNN will do! For every input length, we would need a different DNN, would we not? Training this collection of DNNs will be a whole lot of fun! It will make Bitcoin mining look like a poster-child of energy conservation. /s
* Or am I wrong? Is there a theoretical result I don't know about?
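To make the fixed-input-length point in the example concrete, here is a tiny, untrained sketch (only the weight shapes matter; this is an illustration of the argument, not anyone's actual model):

```python
import numpy as np

N = 8                                  # the input length is baked into the weight shapes
rng = np.random.default_rng(0)
W1 = rng.normal(size=(N, 64))          # (N, hidden): only accepts N numbers
W2 = rng.normal(size=(64, N))          # (hidden, N): only emits N numbers

def mlp_sort(xs):
    """Untrained stand-in for a 'sorting network'; only the shapes matter here."""
    assert len(xs) == N, "this architecture only handles lists of length N"
    h = np.maximum(np.asarray(xs, dtype=float) @ W1, 0.0)   # ReLU hidden layer
    return h @ W2

mlp_sort(list(range(8)))     # shape-compatible
# mlp_sort(list(range(9)))   # fails: a 9-element list needs a different network
```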
I don't have time to fully refute this claim, but it is very problematic.
1. Even a very narrow framing of how neural networks deal with inconsistent training data would perhaps warrant a paper, if not a Ph.D. thesis. Maybe this has already been done? Here is the problem statement: given a DNN with a given topology trained with SGD and a given error function, what happens when you present flatly contradictory training examples? What happens when the contradiction doesn't emerge until deeper levels of a network? Can we detect this? How? (A toy illustration follows after this list.)
2. Do we really _want_ systems that passively tolerate inconsistent information? When I think of an ideal learning agent, I want one that would engage in the learning process and seek to resolve any apparent contradictions. I haven't actively researched this area, but I'm confident that some have, if only because Tom Mitchell at CMU emphasizes different learning paradigms in his well-known ML book. So hopefully enough people reading that think "yeah, the usual training methods for NNs aren't really that interesting ... we can do better."
3. Just because humans 'tolerate' inconsistent information in some cases doesn't mean they do so well, as compared to ideal Bayesian agents.
4. There are "GOFAI" algorithms for probabilistic reasoning that are in many cases better than DNNs.
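As the toy illustration promised in point 1: a single-parameter model with squared error and flatly contradictory labels for the same input (purely illustrative; plain gradient descent instead of SGD for brevity):

```python
# Two flatly contradictory examples: the same input labeled 0.0 and 1.0.
data = [(1.0, 0.0), (1.0, 1.0)]
w = 0.0            # one-parameter "network": y_hat = w * x
lr = 0.1
for _ in range(500):
    # full-batch gradient of mean squared error
    grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad
print(w)   # ~0.5: the model quietly averages the contradiction rather than flagging it
```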
The grand goal of AI is a general learner that can at least tackle any kind of problem we care about. Are DNNs the best-performing solution for every problem? No, and I agree on that. But they are applicable to a far wider range of problems. There is no question which is the better general-learner paradigm.
>* Or am I wrong? Is there a theoretical result I don't know about?
Thankfully, we don't need to get into the theory. Go ask GPT-4 to sort an arbitrary list of numbers. Change the length and try again.
Pardon the cliffhanger style.
I have begun crafting an explanation, but not sure when it will be ready.
But when you recognize that thinking predates symbolic language, and start thinking about what thinking needs, you get closer to the answer.
I was one of the developers/knowledge engineers of the SpinPro™ Ultracentrifugation Expert System at Beckman Instruments, Inc. This was released in 1986, developed over about two years. It ran on an IBM PC (DOS)! It was a technical success, but not a commercial one. (The sales force was unfamiliar with promoting a software product, which had little impact on their commissions vs. selling multi-thousand-dollar equipment.) https://pubs.acs.org/doi/abs/10.1021/bk-1986-0306.ch023 (behind ACS paywall)
Our second Expert System was PepPro™, which designed procedures for the chemical synthesis of peptides (essentially very small proteins). This was completed and to be released in 1989, but Beckman discontinued their peptide synthesis instrument product line just two months before. This system was able to integrate end-user knowledge with the built-in domain knowledge. PepPro was recognized in the first AAAI Conference on Innovative Applications of Artificial Intelligence in 1989. https://www.aaai.org/Papers/IAAI/1989/IAAI89-010.pdf
Both of these were developed in Interlisp-D on Xerox 1108/1186 workstations, using an in-house expert system development environment, and deployed in Gold Hill Common Lisp for the PC.