I (like vannevar, apparently) didn't feel Cyc was going anywhere useful. There were ideas there, but they weren't coherent enough to form a credible basis for even a hypothesis of how a system embodying them could be constructed.
I was pretty impressed by McCarthy's blocks world demo. Later he and a student formalized some of the rules for creating 'context'[1] for AI to operate within, and I continue to think that work will be crucial to solving some of the mess that LLMs create.
For example, the early LLM failure of suggesting you could make a salad crunchy by adding rocks was a classic context failure: data from the context of 'humor' and data from the context of 'recipes' got intertwined. Because existing models have no notion of context during training, there is nothing in the model that 'tunes' the output based on context. And you get rocks in your salad.
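To make the idea concrete, here's a toy sketch of what context-tagged training data might look like. The tag format and the examples are invented for illustration; current LLMs are not trained this way:

    # Hypothetical sketch: tag each training text with a McCarthy-style
    # context so a model could condition on it. The <|ctx:...|> token
    # format is made up, by analogy with real special tokens.
    training_data = [
        ("recipe", "Add croutons or toasted nuts to make a salad crunchy."),
        ("humor",  "Add rocks to make a salad crunchy."),
    ]

    def make_training_example(context, text):
        # Prepend an explicit context token to the raw text.
        return f"<|ctx:{context}|> {text}"

    for ctx, text in training_data:
        print(make_training_example(ctx, text))

With tags like that in the training stream, 'humor' and 'recipes' data would at least be distinguishable to the model instead of blended together.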
[1] https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d...
This seems like a high bar to reach.
We all know that symbolic AI didn't scale as well as LLMs trained on huge amounts of data. However, as you note, it also tried to address many things that LLMs still don't do well.
I suspect that McCarthy was on to something with the context thing. Organic intelligence certainly fails in creative ways without context; it would not be disqualifying to have AI fail in similarly spectacular ways.
[1] I made a bit of progress on this by treating the weight as a kind of permeability, such that the higher the weight, the easier it was to 'pass through' a particular neuron, but the cyclic nature of the graph makes a purely topological explanation pretty obtuse :-).
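A toy illustration of that last point, with invented node names and weights: an acyclic graph can be explained in one topological pass, but the moment there's a cycle you're stuck iterating to a fixed point:

    # Weights read as "permeability": higher weight, easier passage.
    # The edge c -> a closes a cycle, so no topological order exists.
    from graphlib import TopologicalSorter, CycleError

    weights = {("a", "b"): 0.9, ("b", "c"): 0.5, ("c", "a"): 0.2}
    preds = {"a": {"c"}, "b": {"a"}, "c": {"b"}}

    try:
        print(list(TopologicalSorter(preds).static_order()))
    except CycleError:
        # No single pass; relax iteratively until nothing changes.
        signal = {"a": 1.0, "b": 0.0, "c": 0.0}
        for _ in range(50):
            for (src, dst), w in weights.items():
                signal[dst] = max(signal[dst], signal[src] * w)
        print("fixed point:", signal)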
Neural networks, not LLMs in particular, were just about the simplest thing that could scale: they scaled, and everything else has been fine-tuning. Symbolic AI basically begins with existing mathematical models of reality and of human reason, and indeed it didn't scale.
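To put something behind "simplest thing that could scale": the core of a neural net is a couple of matrix multiplies and a max, all of which parallelize trivially. The sizes below are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(784, 256))
    W2 = rng.normal(size=(256, 10))

    def forward(x):
        h = np.maximum(0.0, x @ W1)  # ReLU; this is the whole mechanism
        return h @ W2

    print(forward(rng.normal(size=(32, 784))).shape)  # (32, 10)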
The problem, imo, is this: the standard way mathematical modeling works[2] is that you have a triple of <data, model-of-data, math-formalism>. The math formalism characterizes what the data could be, how the data diverge from reality, etc. The trouble is that the math formalism really doesn't scale even if a given model scales[3]. So even if you were to start plugging numbers into some other math model and get a reality-approximation like an LLM, it would be a black box like an LLM, because the meta-information would be just as opaque.
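A minimal example of the triple, using least squares on synthetic numbers. The first two members fit in a few lines; the third doesn't appear in the code at all, which is exactly the opacity problem:

    import numpy as np

    # data
    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 1.0, 100)
    y = 2.0 * x + 0.5 + rng.normal(0.0, 0.1, 100)

    # model-of-data: two fitted numbers
    slope, intercept = np.polyfit(x, y, deg=1)
    print(slope, intercept)

    # math-formalism: why least squares is justified here (linear mean,
    # independent Gaussian noise, Gauss-Markov, ...) lives in the
    # literature, not in the code, and it doesn't transfer to a model
    # with billions of parameters.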
Consider the way Judea Pearl rejected certainty factors and argued that probabilities were needed as the building blocks for approximate reasoning systems. But a look at human beings, animals, or LLMs shows that things that "deal with reality" don't have, and couldn't have, access to "real" probabilities.
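Here's that contrast in miniature, assuming the MYCIN-style certainty-factor rule as the foil (the numbers are invented):

    def combine_cf(cf1, cf2):
        # MYCIN combination rule for two positive certainty factors:
        # context-free arithmetic on scores, no prior required.
        return cf1 + cf2 * (1 - cf1)

    def bayes_update(prior, likelihood_ratio):
        # Bayesian update in odds form; hinges on a "real" prior.
        odds = prior / (1 - prior) * likelihood_ratio
        return odds / (1 + odds)

    print(combine_cf(0.6, 0.6))    # 0.84, whatever the prior
    print(bayes_update(0.1, 9.0))  # 0.5, driven by the prior

Pearl's point was that only the second is coherent; my point is that nothing organic actually has the prior.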
I'd just offer that I believe that for a model to scale, the vast majority of its parameters would have to be mathematically meaningless to us, for the reasons above.
[1] Really key point, imo.
[2] That includes symbolic and probabilistic models "at the end of the day".
[3] Contrast the simplicity of plugging data into a regression model versus the multitudes of approaches explaining regression, loss/error functions, etc.