zlacker

1. catpol+Zz[view] [source] 2019-12-13 18:04:58
>>mdszy+(OP)
I worked for Cycorp for a few years recently. AMA, I guess? I obviously won't give away any secrets (e.g. business partners, finer-grained details of how the inference engine works), but I can talk about the company culture, some high-level technical things, and the interpretations of the project that different people at the company hold that make it seem more viable than you might guess from the outside.

There were some big positives. Everyone there is very smart and, depending on your tastes, it can be pretty fun to be in meetings where you try to explain Davidsonian ontology to perplexed business people. I suspect a decent fraction of the technical staff are reading this comment thread. There are also some genuine technical advances (which I wish were more publicly shared) in inference-engine architecture, generally stemming from treating symbolic reasoning as a practical engineering project and giving up on things like completeness in favor of being able to get an answer most of the time.

There were also some big negatives, mostly structural ones. Within Cycorp different people have very different pictures of what the ultimate goals of the project are, what true AI is, and how (and whether) Cyc is going to make strides along the path to true AI. The company has been around for a long time and these disagreements never really resolve - they just sort of hang around and affect how different segments of the company work. There's also a very flat organizational structure which makes for a very anarchic and shifting map of who is responsible or accountable for what. And there's a huge disconnect between what the higher ups understand the company and technology to be doing, the projects they actually work on, and the low-level day-to-day work done by programmers and ontologists there.

I was initially pretty skeptical of the continued feasibility of symbolic AI when I went in to interview, but Doug Lenat gave me a pitch that essentially assured me that the project had found a way around many of the concerns I had. In particular, they were doing deep reasoning from common sense principles using heuristics and not just doing the thing Prolog often devolved into where you end up basically writing a logical system to emulate a procedural algorithm to solve problems.

It turns out there's a kind of reality distortion field around the management there, despite their best intentions - partially maintained by the management's own steadfast belief in the idea that what Cyc does is what it ought to be doing, but partially maintained by a layer of people that actively isolate the management from understanding the dirty work that goes into actually making projects work or appear to. So while a certain amount of "common sense" knowledge factors into the reasoning processes, a great amount of Cyc's output at the project level really comes from hand-crafted algorithms implemented either in the inference engine or the ontology.

Also, the codebase is the biggest mess I have ever seen, by an order of magnitude. I spent entire days just scrolling through different versions of entire systems that duplicate massive chunks of functionality, written 20 years apart, with no indication of which (if any) still worked or was the preferred way to do things.

2. TheDon+kR[view] [source] 2019-12-13 20:06:37
>>catpol+Zz
What can Cyc do that other tech can't do? And, more importantly, is that stuff useful?
3. catpol+LX[view] [source] 2019-12-13 20:48:51
>>TheDon+kR
If there are current employees reading, they might be able to give a better answer than me. Basically, the project is to build a huge knowledge base of basic facts and "common sense" knowledge and an inference engine that could use a lot of different heuristics (including ones derived from semantic implications of contents of the knowledge base) to do efficient inference on queries related to its knowledge. One way of looking at Cyc from a business point of view is that it's a kind of artificial analyst sitting between you and a database. The database has a bunch of numbers and strings and stuff in a schema to represent facts. You can query the database. But you can ask an analyst much broader questions that require outside knowledge and deeper semantic understanding of the implications of the kinds of facts in the database, and then they go figure out what queries to make in order to answer your question - Cyc sort of does that job.
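To make the "artificial analyst" idea concrete, here is a toy sketch of forward-chaining inference over a small knowledge base of triples. This is purely illustrative and assumes nothing about Cyc's actual engine (which uses CycL and far richer heuristics); all the facts, rules, and names below are invented for the example.

```python
# Toy forward-chaining inference over a tiny "knowledge base" of triples.
# Illustrative only -- Cyc's real engine is vastly more sophisticated.

facts = {
    ("Austin", "locatedIn", "Texas"),
    ("Texas", "locatedIn", "USA"),
    ("Cycorp", "headquarteredIn", "Austin"),
}

# Rules: (premises, conclusion), with "?x"-style variables.
rules = [
    # locatedIn is transitive
    ([("?a", "locatedIn", "?b"), ("?b", "locatedIn", "?c")],
     ("?a", "locatedIn", "?c")),
    # being headquartered somewhere implies being located there
    ([("?org", "headquarteredIn", "?p")],
     ("?org", "locatedIn", "?p")),
]

def unify(pattern, fact, bindings):
    """Match one triple pattern against a fact, extending bindings."""
    b = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if p in b and b[p] != f:
                return None
            b[p] = f
        elif p != f:
            return None
    return b

def forward_chain(facts, rules):
    """Apply rules until no new facts appear (naive fixpoint)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # collect every binding set that satisfies all premises
            bindings_list = [{}]
            for prem in premises:
                bindings_list = [b2 for b in bindings_list
                                 for f in derived
                                 if (b2 := unify(prem, f, b)) is not None]
            for b in bindings_list:
                new = tuple(b.get(t, t) for t in conclusion)
                if new not in derived:
                    derived.add(new)
                    changed = True
    return derived

kb = forward_chain(facts, rules)
# A plain database lookup for ("Cycorp", "locatedIn", "USA") would fail;
# the inference layer derives it from the rules above.
print(("Cycorp", "locatedIn", "USA") in kb)  # True
```

The point of the sketch is the gap it illustrates: the raw "database" never stores the fact you asked about, but a layer with background rules can derive it, which is the analyst-between-you-and-the-database role described above.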

The degree to which it's effective seemed to me to be a case-by-case thing. While working there I tended to suspect that Cyc people underestimated the degree to which you could get a large fraction of their results using something like Datomic, and it was an open question (to me at least) whether the extra 10% or whatever was worth the massive additional complexity of working with Cyc. I might be wrong though; I kind of isolated myself from working directly with customers.

One issue is just that "useful" always invites the question "useful to whom?"

Part of the tension of the company was the distinction between its long-term project and the work it did to pay the bills. The long-term goal was, roughly, to eventually accumulate enough knowledge to create something that could be the basis for a human-ish AI. Whether that's useful, or whether their approach to it was useful, is a matter for another comment. But let's just say businesses rarely show up wanting to pay you for doing that directly, so part of the business model was finding particular problems they were good at (lots of data, lots of basic inference over common-sense knowledge) that other companies weren't prepared to handle. Some clients found Cyc enormously useful in that regard; others were frustrated by the complexity of the system.
