zlacker

[return to "Doug Lenat has died"]
1. dredmo+83[view] [source] 2023-09-01 18:00:35
>>snewma+(OP)
Cyc ("Syke") is one of those projects I've long found vaguely fascinating though I've never had the time / spoons to look into it significantly. It's an AI project based on a comprehensive ontology and knowledgebase.

Wikipedia's overview: <https://en.wikipedia.org/wiki/Cyc>

Project / company homepage: <https://cyc.com/>

2. jfenge+B7[view] [source] 2023-09-01 18:23:24
>>dredmo+83
I worked with Cyc. It was an impressive attempt at what it set out to do, but it didn't work out. It was the last great attempt to do AI in the "neat" fashion, and its failure helped bring about the current, wildly successful "scruffy" approaches to AI.

Its failure is no shade against Doug. Somebody had to try it, and I'm glad it was one of the brightest guys around. I think he clung to it long after it was clear that it wasn't going to work out, but breakthroughs do happen. (The current round of machine learning is itself a revival of a technique that had been abandoned, but the people who stuck with it anyway discovered the tricks that made it go.)

3. dredmo+Od[view] [source] 2023-09-01 19:01:00
>>jfenge+B7
"Neat" vs. "scruffy" syncs well with my general take on Cyc. Thanks for that.

I do suspect that well-curated and hand-tuned corpora, possibly including Cyc's, are of significant use to LLM AI, and will likely become more so as the feedback / autophagy problem worsens.

4. pwilli+do[view] [source] 2023-09-01 19:56:20
>>dredmo+Od
Wow -- I hadn't thought of this, but it makes total sense. We'll need giant, definitely-human-curated databases of information for AIs to consume as more and more of the available information is generated by the AIs themselves.

5. dredmo+At[view] [source] 2023-09-01 20:27:59
>>pwilli+do
There's a long history of information classification, going back to Aristotle ("Categories") and earlier. See especially Melvil Dewey, the US Library of Congress Classification, and the work of Paul Otlet. All are based on exogenous classification, that is, catalogues of subjects and/or works that are maintained independently of the works being classified.

Natural-language, content-based classification as done by Google and other Web text search relies effectively on documents' self-descriptions (that is, their own content) to classify and search works, with a ranking scheme (e.g., PageRank) typically layered on top. What distinguished early Google from prior full-text search was that the latter had no ranking criteria beyond the document text itself, which invited keyword stuffing.

An alternative approach was Yahoo, originally "Yet Another Hierarchical Officious Oracle", a curated, ontological classification of websites. That approach was already proving infeasible at Web scale by 1997/98, though such directories might prove useful as training data for machine classification.
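
For anyone curious what "layering a ranking scheme on top" means in practice, here's a minimal power-iteration PageRank sketch in Python. The toy graph, damping factor, and tolerance are illustrative assumptions on my part, not anything from Google's actual system.

    # Toy power-iteration PageRank -- illustrative values only, not Google's.
    def pagerank(links, damping=0.85, tol=1e-6, max_iter=100):
        """links: dict mapping every page to the pages it links to.
        Every linked page must also appear as a key."""
        pages = list(links)
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}
        for _ in range(max_iter):
            # Start each page with the "teleportation" share of rank.
            new_rank = {p: (1.0 - damping) / n for p in pages}
            for page, outlinks in links.items():
                if outlinks:
                    # Split this page's rank evenly among the pages it links to.
                    share = damping * rank[page] / len(outlinks)
                    for target in outlinks:
                        new_rank[target] += share
                else:
                    # A page with no outlinks spreads its rank over all pages.
                    for p in pages:
                        new_rank[p] += damping * rank[page] / n
            if sum(abs(new_rank[p] - rank[p]) for p in pages) < tol:
                return new_rank
            rank = new_rank
        return rank

    # A made-up four-page web: "c" is linked by every other page, so it ranks highest.
    print(pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}))

The point of the sketch is just that the score comes from the link graph, not from the document text, which is what made keyword stuffing much less effective against early Google.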
