"Watson -- come here -- I want to see you."

There was a nice Discover Magazine article Monday on the IBM Jeopardy-playing computer Watson. The author was kind enough to interview me and quote me extensively and accurately on the topic; I thought the piece as a whole was quite well written and extremely informative. (I also got a silly one-sentence quote in an AP piece that was carried widely today.)

I thought maybe I'd post a few comments on Watson and the just-concluded Jeopardy match (spoiler alert) here.

"Watson" is a brilliant name for this machine. Ostensibly, of course, the name refers to IBM founder Thomas Watson. In the public mind, the name refers to the great detective Sherlock Holmes and his famous assistant and narrator Dr. John Watson. For me, though, there's a clear reference to another Thomas Watson: the assistant of Alexander Graham Bell so famously summoned in the first telephone conversation. Just as electrical hardware revolutionized human communication, modern natural language text retrieval and analysis software is likely to revolutionize human information management; Watson is now the symbol of this work in the same way that the Bell telephone was.

The impressive nature of Watson's three-day performance does not obscure the fact that Watson has some serious weaknesses left to tackle. It also is marred, to some extent, by the inevitable compromises between computer and human play.

I have to wonder how Watson would have done in a contest where quick button-pressing was not a concern. Building a fast button-pressing computer has its challenges, but I think we should expect a little more. Indeed, Jennings expressed as much before the final day of the match.

I also wonder how Watson would have done in a heads-up contest with either of the human champions. Having contestants Jennings and Rutter split the answers it found difficult was definitely in Watson's favor.

The human champions had to read and listen to the clues. Watson was fed them as text. Certainly, the state of computer text and speech recognition is strong enough that the IBM team could have made Watson use a camera and a microphone. I have to wonder whether the decision not to was just an attempt to avoid work, or a time compromise: it undoubtedly also helped a lot with getting the buzzer timing right.

Hosting Watson on a Power supercomputer was quite possibly gratuitous, and arguably uneven. Perhaps only contestants who can sit (in their entirety) in the chair should be allowed to play.

Finally, note that Watson never did and never could fail to respond in the form of a question.

Fairness objections sound like nitpicking, and perhaps they are. More interesting is the question of how well Watson really played. Answer: very well indeed, but certainly troubled in ways that showed its pedigree.

The answer of "Toronto????" in Final Jeopardy shouldn't really have bothered anyone, including Alex. Players do that all the time on that show; Watson knew it had a wrong "question", but decided a guess was better than nothing.

Perhaps the Watson miss that surprised me most was on the first day. It was clear from its failed candidates that Watson understood that the "answer" about You-Know-Who referred to the Harry Potter stories. It was commendable that its confidence estimates were low enough that it avoided its ridiculous top answer, "Harry Potter" himself. But really, Lord Voldemort is a pretty famous, easy choice. If you analyze the "answer", what you see is that the grammar was pretty confusing. I suspect Watson's natural-language parser just wasn't up to the task of figuring the language out. A lot of the problems posed to it during the tournament could be solved by identifying key phrases in the question and doing standard IR. This one was harder.
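For readers who haven't seen it, here is roughly what "identify key phrases and do standard IR" can look like in miniature, together with a confidence cutoff that decides whether to answer at all. This is only a toy sketch in Python and says nothing about how Watson is actually built: the stop-word list, the two-entry corpus, the overlap score, and the 0.5 threshold are all made up for illustration.

```python
import re

# Hypothetical stop-word list; a real system would use something far richer.
STOP_WORDS = {"the", "a", "an", "of", "in", "is", "was", "this",
              "to", "and", "by", "for", "his", "its"}


def key_terms(text):
    """Lowercase the text, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def score_candidates(clue, corpus):
    """Score each candidate by the fraction of clue terms found in its passage."""
    clue_terms = set(key_terms(clue))
    scores = {}
    for candidate, passage in corpus.items():
        overlap = clue_terms & set(key_terms(passage))
        scores[candidate] = len(overlap) / len(clue_terms) if clue_terms else 0.0
    return scores


def respond(clue, corpus, threshold=0.5):
    """Return the best candidate phrased as a question, or None to stay silent."""
    scores = score_candidates(clue, corpus)
    if not scores:
        return None
    best = max(scores, key=scores.get)
    if scores[best] < threshold:  # assumed cutoff: too unsure, so don't answer
        return None
    return f"Who is {best}?"


# Made-up two-entry corpus, purely for illustration.
TOY_CORPUS = {
    "Alexander Graham Bell": "Inventor who made the first telephone call to his assistant Watson",
    "Garry Kasparov": "Chess champion who lost a match to Deep Blue",
}

print(respond("This inventor made the first telephone call", TOY_CORPUS))
print(respond("Capital city on a lake", TOY_CORPUS))
```

A clue whose key terms show up in a passage gets answered; one that matches nothing scores below the cutoff and is skipped. Nothing here parses grammar, which is exactly why a clue with confusing phrasing is harder than a keyword lookup.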

In the final round today, Watson really fell apart. It never did figure out one of the categories, perhaps because it couldn't listen to the human competitors to understand what the correct answers were. It was often completely lost.

The famous chess match in which IBM's Deep Blue defeated Kasparov some years ago had a lot of the same features. The match was short, and the playing field was arguably not level. Kasparov ultimately lost by making a single glaring mistake in one game, something that no competent chess program will ever do. Nonetheless, the victory was symbolic and important for IBM and for computer chess.

At the time of the Deep Blue match, Kasparov accused the IBM team of cheating by having their human grandmaster feed Deep Blue moves. In 2006, Vladimir Kramnik was accused by his opponent Veselin Topalov of cheating in a World Championship chess match, by sneaking out to the restroom to get moves from a laptop running Deep Fritz. While the allegation was never proven, it at least seemed plausible to all concerned. That was the moment I came to believe that computers were now the chess champions.

When a great Jeopardy champion is plausibly accused of cheating by sneaking a peek at the Watson in his or her pocket, it will be time to concede Jeopardy to the computers. Until then, the recently-concluded match represents a spectacular peek at the future of natural language information retrieval and query processing.