My bet, judging mostly from my failed attempts at playing with OpenCyc around 2009, is that Cyc has always been too closed and too complex to tinker with. That doesn't play nicely with academic work. When people finish their PhDs and start working for OpenAI, they simply don't have Cyc in their toolbox.
[1] https://www.sciencedirect.com/science/article/pii/S089360802...
Now it's clear that knowledge graphs are far inferior to deep neural nets, but even so, few people can explain the _root_ reason why.
I don't think Lenat's bet was a waste. I think it was sensible based on the information at the time.
The decision to research it largely in secret, closed source, was, I think, a mistake.
No. It depends. In general, two technologies can’t be assessed independently of the application.
_The_ (one) root reason? Ok, I’ll bite.
But you need to define your claim. What application?
[1] https://voidfarer.livejournal.com/623.html
You can label it "bad idea" but you can't bring LLMs back in time.
In fact, if you have a graph and a path-weighting model (an RNN, a TDCNN, or a Transformer), you can use beam search to evaluate paths through the graph.
There isn't any class of problems deep nets can't handle. Will they always be the most efficient or best-performing solution? No, but a deep-net solution will be possible.
And we don't call them hallucinations, but GOFAI mispredicts plenty.
However, I think they have a good excuse for 'Why didn't it ever have the impact that LLMs are having now?': lack of data and lack of compute.
And it's the same excuse that neural networks themselves have: back in those days, we just didn't have enough data, and we didn't have enough compute, even if we had the data.
(Of course, we learned in the meantime that neural networks benefit a lot from extra data and extra compute. Whether that can be brought to bear on Cyc-style symbolic approaches is another question.)
Cyc also has the equivalent of hallucinations, when its definitions don't cleanly apply to the real world.
Cyc was able to produce an impact; I keep pointing to MathCraft [1], which, in 2017, did not have a rival in neural AI.
[1] https://www.width.ai/post/what-is-beam-search
It is even possible to have a 3-gram model output better text predictions if you combine it with beam search.
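For the curious, here is a minimal sketch (my own, not from the linked article) of what that looks like: beam search over a toy graph with a pluggable path-scoring function standing in for the n-gram/RNN/Transformer model.

```python
import heapq

# Toy sketch: beam search over a small graph. The "model" here just sums
# made-up edge weights; an n-gram LM, RNN, or Transformer score could be
# dropped in as the path-scoring function instead.
GRAPH = {
    "a": [("b", 0.6), ("c", 0.4)],
    "b": [("d", 0.9)],
    "c": [("d", 0.3), ("e", 0.7)],
    "d": [],
    "e": [],
}

def beam_search(start, beam_width=2, max_steps=3):
    beams = [(0.0, [start])]                       # (path score, path)
    for _ in range(max_steps):
        candidates = []
        for score, path in beams:
            successors = GRAPH[path[-1]]
            if not successors:                     # dead end: carry the path forward
                candidates.append((score, path))
                continue
            for node, weight in successors:
                candidates.append((score + weight, path + [node]))
        # keep only the beam_width highest-scoring partial paths
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return beams

print(beam_search("a"))   # [(1.5, ['a', 'b', 'd']), (1.1, ['a', 'c', 'e'])]
```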
I'll tell you how I know /
I read it in the paper /
Fifteen years ago -
(John Prine)
If that is so, then symbolic AI does not scale easily, because you cannot feed inconsistent information into it. Compare this to how humans and LLMs learn: both have no problem with inconsistent information. Yet, statistically speaking, humans can easily produce "useful" information.
FWIW, KGs don't have to be brittle. Or, at least, they don't have to be as brittle as they've historically been. There are approaches (like PROWL [1]) to making graphs probabilistic, so that they assert subjective beliefs about statements instead of absolute statements. The strength of those beliefs can then increase or decrease in response to new evidence (per Bayes' theorem). Probably the biggest problem with this stuff is that it tends to be crazy computationally expensive.
Still, there's always the chance of an algorithmic breakthrough or just hardware improvements bringing some of this stuff into the realm of the practical.
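As a rough illustration of the idea (my own toy example, not PROWL's actual machinery), a statement in the graph can be held as a degree of belief and nudged by Bayes' theorem as evidence arrives:

```python
# Minimal sketch: a KG statement held as a degree of belief and updated with
# Bayes' theorem, instead of being asserted as an absolute fact.
def bayes_update(prior, p_evidence_given_true, p_evidence_given_false):
    """Posterior P(statement | evidence) via Bayes' theorem."""
    numerator = p_evidence_given_true * prior
    denominator = numerator + p_evidence_given_false * (1.0 - prior)
    return numerator / denominator

# Hypothetical triple ("Tweety", "canFly", true) with a prior belief of 0.9.
belief = 0.9
# Evidence that is unlikely if the statement is true (0.2) and likely if false (0.8).
belief = bayes_update(belief, 0.2, 0.8)
print(round(belief, 3))   # ~0.692: the belief weakens rather than flipping to a hard "false"
```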
Google has its own Knowledge Graph, with billions of daily views, which is a wider but shallower version of Cyc. It is unclear whether LLMs' user-facing impact has surpassed that project's.
"Mathematical framework and rules of paraconsistent logic have been proposed as the activation function of an artificial neuron in order to build a neural network"
What is obvious to you is not obvious to others. I recommend explaining and clarifying if you care about persuasion.
You've overstated/exaggerated the claim. A narrower version of the claim is more interesting and more informative. History is almost never as simple as you imply.
This assumes that all classes of problems reduce to functions which can be approximated, right, per the universal approximation theorems?
Even for cases where the UAT applies (which is not everywhere, as I show next), your caveat understates the case. There are dramatically better and worse algorithms for differing problems.
But I think a lot of people (including the comment above) misunderstand or misapply the UATs. Think about the assumptions! UATs assume a fixed length input, do they not? This breaks a correspondence with many classes of algorithms.*
## Example
Let's make a DNN that sorts a list of numbers, shall we? But we can't cheat and only have it do pairwise comparisons -- that is not the full sorting problem. We have to input the list of numbers and output the list of sorted numbers. At run-time. With a variable-length list of inputs.
So no single DNN will do! For every input length, we would need a different DNN, would we not? Training this collection of DNNs will be a whole lot of fun! It will make Bitcoin mining look like a poster-child of energy conservation. /s
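To make the fixed-input-dimension point concrete, here is a minimal sketch (mine, assuming PyTorch, which the comment never mentions): the network's shape is baked in at construction time, so each list length gets its own model.

```python
import torch
import torch.nn as nn

# Sketch: a plain MLP whose input and output dimensions are fixed when it is
# built. "Sorting" lists of a different length would require a separately
# constructed (and separately trained) network.
def make_fixed_length_sorter(n: int) -> nn.Module:
    return nn.Sequential(
        nn.Linear(n, 4 * n),
        nn.ReLU(),
        nn.Linear(4 * n, n),   # n outputs, hopefully the sorted list after training
    )

sorter_5 = make_fixed_length_sorter(5)    # only ever accepts length-5 inputs
sorter_6 = make_fixed_length_sorter(6)    # an entirely different network
print(sorter_5(torch.rand(1, 5)).shape)   # torch.Size([1, 5]) -- untrained output
```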
* Or am I wrong? Is there a theoretical result I don't know about?
I don't have time to fully refute this claim, but it is very problematic.
1. Even a very narrow framing of how neural networks deal with inconsistent training data would perhaps warrant a paper, if not a Ph.D. thesis. Maybe this has already been done? Here is the problem statement: given a DNN with a given topology, trained with SGD and a given error function, what happens when you present flatly contradictory training examples? What happens when the contradiction doesn't emerge until deeper levels of the network? Can we detect this? How? (A toy sketch follows this list.)
2. Do we really _want_ systems that passively tolerate inconsistent information? When I think of an ideal learning agent, I want one that would engage in the learning process and seek to resolve any apparent contradictions. I haven't actively researched this area, but I'm confident that some have, if only because Tom Mitchell at CMU emphasizes different learning paradigms in his well-known ML book. So hopefully enough people reading that think "yeah, the usual training methods for NNs aren't really that interesting ... we can do better."
3. Just because humans 'tolerate' inconsistent information in some cases doesn't mean they do so well, as compared to ideal Bayesian agents.
4. There are "GOFAI" algorithms for probabilistic reasoning that are in many cases better than DNNs.
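To make point 1 above concrete, here is a toy sketch (mine, assuming PyTorch) of the simplest possible case: the same input presented with two flatly contradictory targets under MSE and SGD.

```python
import torch
import torch.nn as nn

# Toy sketch: train a one-parameter model on the same input labeled two
# contradictory ways and see what MSE + SGD converge to.
torch.manual_seed(0)
model = nn.Linear(1, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

x = torch.tensor([[1.0], [1.0]])   # identical inputs
y = torch.tensor([[0.0], [1.0]])   # flatly contradictory targets

for _ in range(500):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

print(model(torch.tensor([[1.0]])).item())   # ~0.5: the contradiction is silently averaged
```

The network quietly averages the contradiction rather than flagging it, which is exactly the kind of passive tolerance point 2 argues we should want to do better than.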
The grand goal of AI is a general learner that can at least tackle any kind of problem we care about. Are DNNs the best-performing solution for every problem? No, and I agree on that. But they are applicable to a far wider range of problems. There is no question which is the better general-learner paradigm.
>* Or am I wrong? Is there a theoretical result I don't know about?
Thankfully, we don't need to get into theoretical arguments. Go ask GPT-4 to sort an arbitrary list of numbers. Change the length and try again.
Pardon the cliffhanger style.
I have begun crafting an explanation, but I'm not sure when it will be ready.
But when you recognize that thinking predates symbolic language, and start thinking about what thinking needs, you get closer to the answer.