Blog: Dialing for Dialog


02/26/2008

AI

What is Artificial Intelligence? Is it really that magic wand, that “ghost in the machine” that can “intelligently” solve all the problems that common “dumb” engineering can’t? In more than 45 years of history, AI’s attempts to build intelligent machines haven’t always met expectations, causing the so-called “AI winter”. In fact, traditional knowledge-based AI always suffered from severe scalability and flexibility limitations that impaired its effective application to complex real-life problems, and it often escaped benchmark comparison with other, often more effective, techniques. By contrast, data-driven technology and modern software engineering, though less glamorous than AI, have brought solid results, especially in the field of spoken interaction with machines. Although some serious attempts to bring traditional AI notions back to life from its last winter are under way in some academic research establishments, only a fair benchmark comparison against mainstream technology, with measurable results, can prove its superior performance.

Recently I have heard the term AI, as in Artificial Intelligence, considerably more often than during the past twenty years. I find that a little odd; I thought that no one with a historical perspective on science and technology would use the term as nonchalantly as we did in the nineteen-seventies and eighties. I tend to associate the term AI with other terms of the past, like “electronic brain”, “thinking computer”, “cybernetics”, and “the information superhighway”. Today you don’t hear “Look for my page on the information superhighway”… it is so 1990s… None of my friends who were somehow connected with AI in the good old days call themselves AI experts anymore. Rather, they talk about disciplines like “cognitive sciences” or “machine learning”, use techniques like “Support Vector Machines” or “Markov Random Fields”, or unglamorously say “I use statistics” or “I do computer science” during party chats. What happened to AI? Where has it gone?

Back in the days when computers were mostly doing arithmetic calculations (it was the summer of 1956), there was a workshop at Dartmouth College where almost all of the computer pioneers in the known world (only a handful of them at the time) met for two months (good old times… when one could actually go away for two months) to discuss advanced and unconventional computer programs that were able to prove theorems, play chess, and recognize all kinds of patterns. The workshop came to be known as the Dartmouth Summer Research Project on Artificial Intelligence; the term was coined for the occasion, and it stuck. Since then Artificial Intelligence, or simply AI, has been used to denote a way for a machine to approach the solution of problems “similar” to how we believe intelligent creatures, like most of us humans, do. And I stress the term “believe”, because we don’t know for sure how we, or better our brains, solve such problems.

After we humans have reached the solution of a problem, let’s say proven a theorem or solved a murder mystery, we are often, but not always, able to consciously reverse engineer the solution process that we (believe we have) applied, step by step, like a detective at the end of the movie; think of Monsieur Poirot or Columbo. Sometimes we think we solve problems by consciously applying rules and using reasoning and inference. But other times we don’t (Malcolm Gladwell’s Blink supports that). And this is especially true for the things for which AI never worked really well, or never proved to outperform other, non-AI-sh approaches.

Take chess, for instance: one of the reference problems of classical AI. IBM’s Deep Blue, the first computer to win a match, in 1997, against a reigning human world champion, Garry Kasparov, did not use the elegant inference techniques that nostalgic AI aficionados would refer to as AI. Rather, Deep Blue used the “brute force” of computational power able to evaluate 200 million positions per second, together with a database of 700,000 grandmaster games. Probably, if you tried to rationalize after the fact, in the traditional AI style, why Deep Blue won, you wouldn’t find a “step-by-step” conscious process of inference. Most likely Deep Blue won because it was faster, had more memorized games readily available than its human opponent, and could go deeper in analyzing all the possible effects of a move. And I am sure that if you had asked Garry Kasparov, right after the game, why he lost, he would probably have tried to rationalize and given you a step-by-step explanation, but certainly not a set of rules that one could apply in an AI-like system to build a better chess player than him.

Language and speech interpretation is traditionally another field where AI (and I mean the traditional rule-based, knowledge-based AI) never proved to achieve better results than techniques with less glamorous names like Hidden Markov Models, N-gram statistics, or Finite State Transducers. Since the first speech recognition machine was built at Bell Laboratories in 1952, hundreds of researchers across the globe have tried to use traditional knowledge-based AI techniques to interpret the content of the voice signal. All the others tried less elegant statistical, data-driven approaches, and built systems that actually worked. Knowledge-based speech recognition never succeeded, and although a few serious scientists are trying today to revive it and marry it with the statistical approach (a long-term research topic), I have heard no one talking about it at any of the most prestigious international academic conferences in the field.

So… is AI gone? I don’t think so. AI, as the attempt to create inside a machine a deeper understanding of problems on the way to their solution, is not gone. On the contrary, there are many serious scientists who are relentlessly working on what in the 1980s would have been called an AI-sh type of solution. What is gone, at least we hope, is the popular belief that AI is a sort of magic wand which, by virtue of the intelligence hidden in its guts, can solve problems as humans do (do they?) and will provide better, faster, and cheaper solution development and maintenance. In a popular belief corroborated by science fiction and superficial press stories, the term AI used to represent a panacea for all automation problems, mainly because of the word intelligence in the name. That created expectations that could never be met and caused what is known today as the AI winter. We hope we have finally grown out of this popular belief.

The problems with traditional, classical AI are many. Classical AI proved to be non-scalable, since knowledge had to be put into the system by hand (see the main criticisms of the ambitious Cyc project at http://en.wikipedia.org/wiki/Cyc). Statistical machine learning, instead, gains knowledge automatically, from data. Classical AI also traditionally escaped any type of meaningful benchmark comparison against other techniques. Science can be called science only when it is driven by data and measurements. In the absence of that, what is left is anecdotal evidence: Yes, it works! … but does it work better? How much better? Is it cheaper? How much cheaper? Measurements and common benchmarks are at the basis of today’s machine automation.

On top of the lack of scalability and the absence of measurable performance on common benchmarks (and I am still thinking of knowledge-based, rule-based, reasoning, inference-based, introspective, old AI), one of the main drawbacks that hindered, and still hinders, the penetration of the AI philosophy into areas like speech applications is the fact that it traditionally trades procedural expressivity for built-in behavior. Let me be more explicit. One of the claims of the AI knowledge-based approach is that “you just express the knowledge, and the inference engine uses it in an intelligent way”, so building complex applications may seem less costly, at least on the surface. One may claim the same for using AI in spoken dialog applications. No coding, no call-flows: just write down the knowledge, and the rest will be done by the engine. That’s it! Unfortunately the behavior of intelligent inference engines, like old Prolog’s inference engine, is often not easy to grasp except for those who designed them. Training developers to use AI-like engines can be quite difficult, since today’s software developers are fluent in procedural or object-oriented programming and not in inference-based programming. This is especially so considering that, in situations where knowledge is vast and not always consistent (rules can contradict each other, they may need to be invoked in a certain temporal order, they may be incomplete, etc.), the behavior of inference engines may not be predictable. That goes against the VUI Completeness principle, which requires that all possible outcomes of a Voice User Interface be predictable before a spoken dialog application goes into production. And what about last-minute changes requested by the customer? For instance, changing the order in which questions are asked, or changing a procedure according to the company’s best practices?
With knowledge-based AI-sh systems a simple change like that can easily become a non-reusable fix or a development nightmare, because one has to bypass the built-in engine behavior with some ad-hoc procedure.
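
To make the contrast concrete, here is a minimal Python sketch of the procedural alternative, assuming a hypothetical three-question call-flow. The step names, the prompts, and the helpers `run_call_flow` and `move_to_front` are invented for illustration and do not come from any real dialog platform. Because the order of the questions is ordinary data that the code simply walks through, the sequence of prompts can be read off before deployment, and a request to change the order becomes a small, local edit.

```python
from typing import Callable, Dict, List

# Hypothetical call-flow: the order of the questions is plain, explicit data.
CALL_FLOW: List[str] = ["ask_account_number", "ask_pin", "ask_request_type"]

# Hypothetical prompts, one per step.
PROMPTS: Dict[str, str] = {
    "ask_account_number": "Please say your account number.",
    "ask_pin": "Please say your PIN.",
    "ask_request_type": "How can I help you today?",
}


def run_call_flow(flow: List[str], answer: Callable[[str], str]) -> Dict[str, str]:
    """Ask each question in the listed order and collect the caller's answers.

    `answer` stands in for the speech recognizer: it receives a prompt
    and returns what the caller said.
    """
    collected: Dict[str, str] = {}
    for step in flow:
        collected[step] = answer(PROMPTS[step])
    return collected


def move_to_front(flow: List[str], first: str) -> List[str]:
    """Return a new flow with step `first` asked before all the others."""
    return [first] + [step for step in flow if step != first]
```

A customer asking for the request type to be collected first would be served by `move_to_front(CALL_FLOW, "ask_request_type")`; nothing else in the sketch has to change.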

As a consequence of the above considerations, while serious and illustrious researchers have tried for decades to apply inference techniques to spoken dialog systems, and with considerable academic success (for instance plan-based dialog at the University of Rochester, or agenda-based dialog at CMU), the industry still holds on to procedural techniques such as the call-flow representation. Call-flow abstractions, because they are procedural, are easily and naturally grasped by VUI designers and developers, who can build sophisticated applications to solve customer problems. And after all, as anyone who has built real spoken dialog applications knows very well, dialog design and development is only one of the elements that determine the success of an application. Integration, platform robustness, speech accuracy, and a myriad of other little, and not so little, things need to be in place for a system to work, to be cost effective, and to provide a quality customer experience.

So what’s the future of complex spoken dialog applications? How will they evolve? Will they follow the path of traditional AI, with an intelligent engine in the background able to reason on a database of well-structured knowledge? Well, the evidence is against that. Complex applications, so far, have not evolved towards the AI-sh inference way of solving problems. I doubt that sophisticated Web sites which front complex applications and show some level of intelligence have AI-sh inference engines behind them. Yes, knowledge needs to be separated from its usage, but that’s a fundamental rule that every good procedural programmer learns early enough. Do you want to call it Artificial Intelligence? Or maybe we can call it model-view-controller (MVC) style? But without an intelligent engine… how do you handle complexity and cost of development? Software (and call-flows are software) found its own way to handle complexity with modularity, encapsulation, inheritance, polymorphism, and other programmer’s tricks. And that’s not AI.

Classical AI, of the inference-reasoning-knowledge-based variety, might come back at some point from its winter hibernation (we do hope so), and it may confront other approaches using comparison benchmarks in a scientific manner, and it may even win. But until then, we have to settle for the “unglamorous” technologies.

Posted by Roberto on Feb 26, 2008 8:47:21 AM

