What is Artificial Intelligence? Is it
really that magic wand, that “ghost in the machine” that can “intelligently”
solve all the problems that common “dumb” engineering can’t? In more than 45
years of history, AI’s attempts to build intelligent machines have not always met
expectations, causing the so-called “AI winter”. In fact, traditional knowledge-based
AI has always suffered from severe scalability and flexibility limitations that impaired
its effective application to complex real-life problems, and it often escaped
benchmark comparison with other, often more effective, techniques. By contrast, data-driven technology and
modern software engineering, though less glamorous than AI, have brought solid
results, especially in the field of spoken interaction with machines. Although
some serious attempts to bring traditional AI notions back to life from its
last winter are under way in some academic research establishments, only a fair
benchmark comparison against mainstream technology with measurable results can
prove its superior performance.
Recently I have been hearing the term AI (as in Artificial
Intelligence) far more often than at any time in the past twenty years. I found that a
little bit odd; I thought that no one with a historical perspective of science
and technology would be using the term as nonchalantly as we did in the
nineteen-seventies and eighties. I tend to associate the term AI with other
terms of the past, like “electronic brain”, “thinking computer”, “cybernetics”,
and “the information superhighway”. Today you don’t hear “Look for my page on the
information superhighway”; it is so 1990s. None of my friends who were somehow connected
with AI in the good old days call themselves AI experts anymore. Rather, they
talk about disciplines like “cognitive sciences” or “machine learning”, use
techniques like “Support Vector Machines” or “Markov Random Fields”, or
unglamorously say “I use statistics” or “I do computer science” during party
chats. What happened to AI? Where has it gone?
Back in the days when computers were mostly doing arithmetic
calculations (it was the summer of 1956) there was a workshop at Dartmouth
College where almost all of the computer pioneers in the known world (only
a handful of them at the time) met for two months to discuss advanced and unconventional computer
programs that were able to prove theorems, play chess, and recognize all kinds
of patterns. Those were the good old times, when one could actually go away for two months.
The workshop came to be known as the Dartmouth Summer Research Project on Artificial Intelligence; it
was the first time that the term was used, and it stuck. Since then
Artificial Intelligence, or simply AI, has been used to denote a way to approach the
solution of problems by a machine “similar” to how we believe intelligent
creatures, like most of us humans, do. And I stress the term “believe”, because
we don’t know for sure how we, or rather our brains, solve such problems.
After we humans have reached the solution of a problem—let’s
say proven a theorem or solved a murder mystery—we are often, but not always,
able to consciously reverse engineer the solution process that we (believe we
have) applied step by step like a detective at the end of the movie—think of
Monsieur Poirot or Columbo. So sometimes we think we solve problems by consciously
applying rules and using reasoning and inference towards the solution of the
problem. But other times we don’t (Malcolm Gladwell’s Blink supports that). And this is especially
true of the things for which AI never worked really well, or never proved to
outperform other non-AI-sh approaches.
Take chess, for instance: one of the reference problems of classical
AI. IBM’s Deep Blue, the first computer that won, in 1997, against a human
world champion, Garry Kasparov, did not use the elegant inference techniques that
nostalgic AI aficionados would refer to as AI. Rather, Deep Blue used the “brute
force” of computational power able to evaluate 200 million positions per
second, together with a database of 700,000 grandmaster games. Probably, if you tried to
rationalize after the fact why Deep Blue won, in the traditional AI style,
you would not find a “step-by-step” conscious process of inference. Most likely
Deep Blue won because it was faster, had more memorized games readily available
than its human opponent, and could go deeper
in analyzing all the possible effects of a move. And I am sure that if you had asked
Garry Kasparov, right after the game, why he lost, he would probably have tried
to rationalize and given you a step-by-step explanation, but certainly not a
set of rules that one can apply in an AI-like system to build a better chess
player than him.
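To make “brute force” concrete, here is a minimal sketch of exhaustive game-tree search on a toy game. It is not Deep Blue’s method (which, as far as I know, relied on heavily tuned search running on special-purpose hardware); the game, a simple take-away game where the player who removes the last stone wins, and the names can_win and best_move are my own assumptions for illustration. The point is that the program contains no rules of strategy at all: it just tries every move.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def can_win(stones: int) -> bool:
    """Return True if the player to move can force a win.

    Toy game (an assumption for illustration): players alternately
    remove 1, 2 or 3 stones; whoever takes the last stone wins.
    """
    # Exhaustive search: a position is winning if at least one legal
    # move leaves the opponent in a losing position.
    return any(
        not can_win(stones - take)
        for take in (1, 2, 3)
        if take <= stones
    )


def best_move(stones: int):
    """Return a winning number of stones to take, or None if none exists."""
    for take in (1, 2, 3):
        if take <= stones and not can_win(stones - take):
            return take
    return None


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, can_win(n), best_move(n))
```

Nothing in this code “knows” that multiples of four are losing positions; that pattern simply falls out of searching every line of play, which is the kind of answer that is hard to turn back into a set of expert rules.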
Language and speech interpretation is traditionally another
field where AI (and I mean the traditional rule-based, knowledge-based AI) never
proved to achieve better results than techniques with less glamorous names
like Hidden Markov Models, N-gram statistics, or Finite State Transducers.
Since the first speech recognition machine was built at Bell Laboratories in
1952, hundreds of researchers across the globe have tried to use traditional
knowledge-based AI techniques to interpret the content of the voice signal.
Others tried less elegant statistical, data-driven approaches, and built
systems that actually worked. Knowledge-based speech recognition never
succeeded, and although a few serious scientists are trying today to revive it
and marry it with the statistical approach (a long-term research topic), I
have heard no one talking about it at any of the most prestigious international
academic conferences in the field.
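For readers who have never seen how unglamorous “N-gram statistics” really are, here is a minimal sketch of a bigram model estimated by simple counting. The three-sentence corpus and the function names train_bigrams and bigram_prob are made up for illustration; a real recognizer would train on vastly more text and smooth the estimates.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, cur in zip(words, words[1:]):
            counts[prev][cur] += 1
    return counts


def bigram_prob(counts, prev, cur):
    """Maximum-likelihood estimate of P(cur | prev); 0.0 if prev is unseen."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0


# Hypothetical toy corpus, not real application data.
corpus = ["check my balance", "check my account", "pay my bill"]
model = train_bigrams(corpus)

if __name__ == "__main__":
    print(bigram_prob(model, "my", "balance"))
    print(bigram_prob(model, "check", "my"))
```

No knowledge about banking or grammar was written down by hand: whatever the model “knows” comes from the data, and more data makes it better.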
So... is AI gone? I don’t think so. AI, as the attempt to create
a deeper understanding of problems towards their solution inside a machine, is not
gone. On the contrary, there are many serious scientists who are relentlessly
working on what in the 1980s would have been called AI-sh types of solutions.
What is gone, at least we hope, is the popular belief that AI is a sort of magic
wand which, by virtue of the intelligence
hidden in its guts, can solve problems as humans do (do they?) and will provide
better, faster, and cheaper solution development and maintenance. In the popular
belief, corroborated by science fiction and superficial newspaper stories, the term AI
used to represent the panacea for all automation
problems, mainly because of the word “intelligence”
in the name. That created expectations that could never be met and caused
what is known today as the AI
winter. We hope we have finally grown out of this popular belief.
The problems with traditional, classical AI are many. Classical
AI proved to be non-scalable, since knowledge had to be put into the system by
hand (see the main criticisms of the ambitious Cyc project at http://en.wikipedia.org/wiki/Cyc). Statistical
machine learning, instead, gains knowledge automatically, from data. Classical AI
also traditionally escaped any type of meaningful benchmark comparison against other
techniques. Science deserves the name only when it is driven by data and
measurements. In the absence of that, what is left is anecdotal evidence: Yes,
it works! But does it work better? How much better? Is it cheaper? How much cheaper?
Measurements and common benchmarks are at the basis of today’s machine
automation.
On top of the lack of scalability and the absence of
measurable performance on common benchmarks (and I am still thinking of
knowledge-based, rule-based, reasoning, inference-based, introspective, old AI),
one of the main drawbacks that hindered, and still hinders, the penetration of AI
philosophy into areas like speech applications is the fact that it traditionally
trades procedural expressivity for built-in behavior. Let me be more explicit.
One of the claims of the AI knowledge-based approach is that “you just express
the knowledge, and the inference engine uses it in an intelligent way”, so building complex applications may seem less costly,
at least on the surface. One may claim the same for using AI in spoken dialog
applications. No coding, no call-flows: just write down the knowledge, and the
rest will be done by the engine. That’s it! Unfortunately, the behavior of intelligent inference
engines, like old Prolog’s inference engine, is often not easy to grasp except
for those who designed it. Training developers to use AI-like engines can be quite
difficult, since most of today’s software developers are
fluent in procedural or object-oriented programming and not in inference-based
programming. This is especially so
considering that, in situations where knowledge is vast and not always
consistent (rules can contradict each other, they may need to be invoked in
some temporal order, they may be incomplete, etc.), the behavior of inference
engines may not be predictable. That goes against the VUI
Completeness principle which requires that all possible outcomes of a Voice
User Interface should be predictable before a spoken dialog application goes
into production. And what about last-minute changes requested by the customer?
For instance, changing the order in which questions are asked, or changing a
procedure according to the company’s best practices? With knowledge-based AI-sh
systems a simple change like that can easily become a non-reusable fix or a development
nightmare, because one has to bypass the built-in engine behavior with some
ad-hoc procedure.
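The predictability problem can be seen even in a toy. The sketch below is a deliberately naive forward-chaining loop, far simpler than Prolog’s engine or any real dialog manager, and every name in it (the rules ask_account and offer_agent, the prompt fact) is hypothetical. The same two rules, declared in a different order, make the application say different things to the same caller.

```python
from typing import Callable, Dict, List, Tuple

Facts = Dict[str, object]
# A rule is (name, condition, action); both callables receive the facts.
Rule = Tuple[str, Callable[[Facts], bool], Callable[[Facts], None]]


def run(rules: List[Rule], facts: Facts, max_steps: int = 10) -> List[str]:
    """Repeatedly fire the first not-yet-fired rule whose condition holds."""
    fired: List[str] = []
    for _ in range(max_steps):
        for name, condition, action in rules:
            if name not in fired and condition(facts):
                action(facts)
                fired.append(name)
                break
        else:
            break  # no rule was applicable: stop
    return fired


# Two hypothetical rules that both want to choose the next prompt.
ask_account: Rule = (
    "ask_account",
    lambda f: "account" not in f and "prompt" not in f,
    lambda f: f.update(prompt="ask account number"),
)
offer_agent: Rule = (
    "offer_agent",
    lambda f: bool(f.get("frustrated")) and "prompt" not in f,
    lambda f: f.update(prompt="transfer to agent"),
)


def next_prompt(rules: List[Rule]) -> object:
    facts: Facts = {"frustrated": True}  # same caller state every time
    run(rules, facts)
    return facts["prompt"]


if __name__ == "__main__":
    print(next_prompt([ask_account, offer_agent]))
    print(next_prompt([offer_agent, ask_account]))
```

With two rules the conflict is easy to spot. With hundreds of rules written by different people, knowing in advance which one wins for every caller state is exactly the kind of guarantee that is hard to give before going into production.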
As a consequence of the above considerations, while serious and
illustrious researchers have tried for decades to apply inference techniques to
spoken dialog systems, and with considerable academic success (for instance
Plan-based dialog at the University of Rochester, or Agenda-based dialog at
CMU), the industry still holds on to procedural techniques such as the
call-flow representation. Call-flow abstractions, because they are procedural,
are easily and naturally grasped by VUI designers and developers, who can build
sophisticated applications to solve customer problems. And after all, as anyone who has built real
spoken dialog applications knows very well, dialog design and development is
only one of the elements that determine the success of an application.
Integration, platform robustness, speech accuracy, and a myriad of other little,
and not so little, things need to be in place for a system to work, to be
cost-effective, and to provide a quality customer experience.
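To illustrate why the procedural call-flow is so easy to reason about, here is a minimal sketch of a call-flow written as an explicit table of states and transitions. The states, prompts, and outcomes are invented for the example, and the flow is assumed to have no loops (real call-flows have retries and would need a bounded traversal). Because every transition is spelled out, all possible paths through the dialog can be listed before deployment, which is the spirit of the VUI Completeness principle mentioned earlier.

```python
# Hypothetical call-flow: each state has a prompt and a map from the
# outcome of that step to the next state. Terminal states have no exits.
CALL_FLOW = {
    "greeting": {"prompt": "Welcome.", "next": {"ok": "ask_account"}},
    "ask_account": {
        "prompt": "What is your account number?",
        "next": {"valid": "ask_amount", "invalid": "agent"},
    },
    "ask_amount": {
        "prompt": "How much would you like to pay?",
        "next": {"valid": "confirm", "invalid": "agent"},
    },
    "confirm": {"prompt": "Your payment is done.", "next": {}},
    "agent": {"prompt": "Transferring you to an agent.", "next": {}},
}


def all_paths(flow, state="greeting", path=()):
    """Enumerate every path from `state` to a terminal state.

    Assumes the flow is acyclic; a flow with retry loops would need
    a visit limit to terminate.
    """
    path = path + (state,)
    transitions = flow[state]["next"]
    if not transitions:
        return [path]
    paths = []
    for target in transitions.values():
        paths.extend(all_paths(flow, target, path))
    return paths


if __name__ == "__main__":
    for p in all_paths(CALL_FLOW):
        print(" -> ".join(p))
```

A last-minute request to ask for the amount before the account number is a local edit to two entries of the table, and rerunning all_paths shows exactly what changed.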
So what’s the future of complex spoken dialog applications?
How will they evolve? Will they proceed following the path of traditional AI,
with an intelligent engine in the
background able to reason on a database of well-structured knowledge? Well, the
evidence is against that. Complex applications, so far, have not evolved towards the
AI-sh inference way of solving problems. I doubt that sophisticated Web sites
that front complex applications and show some level of intelligence have
AI-sh inference engines behind them. Yes, knowledge needs to be separated from its
use, but that’s a fundamental rule that every good procedural programmer
learns early enough. Do you want to call it Artificial Intelligence? Or maybe we
can call it model-view-controller (MVC) style? But without an intelligent
engine, how do you handle complexity and cost of development? Software (and
call-flows are software) found its
own way to handle complexity with modularity, encapsulation, inheritance, polymorphism, and other
programmers’ tricks. And that’s not AI.
Classical AI, of the inference-reasoning-knowledge-based variety,
might come back at some point from its winter hibernation (we do hope so), and it
may confront other approaches using benchmark comparisons in a scientific
manner, and it may even win. But until then, we have to settle for the “unglamorous”
technologies.