However I think they have a good excuse for 'Why didn't it ever have the impact that LLMs are having now?': lack of data and lack of compute.
And it's the same excuse that neural networks themselves have: back in those days, we just didn't have enough data, and we didn't have enough compute, even if we had the data.
(Of course, we learned in the meantime that neural networks benefit a lot from extra data and extra compute. Whether that can be brought to bear on Cyc-style symbolic approaches is another question.)
[1] https://www.width.ai/post/what-is-beam-search
It is possible to even have 3-gram model to output better text predictions if you combine it with the beam search.