Yesterday, I wrote about an article in the Wall Street Journal by Yale computer science Professor David Gelernter on the approach used by Watson, IBM’s computer system that plays Jeopardy!. The Journal also has an excerpt from Stephen Baker’s new book, Final Jeopardy, which discusses some of the project’s background and its early trials. (I mentioned Mr. Baker’s book, and his blog, in an earlier post.) As he reports, Watson is by no means infallible; the system sometimes misses an allusion that a person would catch, and sometimes it just comes up with wacky answers. His account of some of Watson’s history and “growing pains” is also interesting.
The New York Times also has an article on Watson by the novelist Richard Powers. Mr. Powers observes that, although the upcoming match between Watson and Jeopardy! champions Ken Jennings and Brad Rutter is undoubtedly a stunt designed to capture the public’s interest (it certainly seems to have captured the media’s), it nonetheless represents work at the forefront of artificial intelligence research. He writes:
Open-domain question answering has long been one of the great holy grails of artificial intelligence. It is considerably harder to formalize than chess. It goes well beyond what search engines like Google do when they comb data for keywords. Google can give you 300,000 page matches for a search of the terms “greyhound,” “origin” and “African country,” which you can then comb through at your leisure to find what you need.
As I discussed in the post yesterday, Watson uses a massively parallel approach, on both the hardware and software levels, to try to come up with an answer. The system decides whether to “buzz in” with an answer based on a statistical measure of confidence that its answer is right (a toy sketch of that sort of decision rule appears below). Mr. Powers says that some people may discount this as a sort of gimmick:
This raises the question of whether Watson is really answering questions at all or is just noticing statistical correlations in vast amounts of data.
As he goes on to suggest, the real question being raised is: what do we mean by really answering questions? I don’t believe any of us is introspective enough to determine how our own decision-making process works, certainly not down to the “hardware” level. Similar objections have been raised about the Turing test, proposed by the English mathematician Alan Turing as an operational test of a machine’s intelligence. I have always found these objections vague in the extreme: if intelligence at playing Jeopardy!, or at passing the Turing test, involves some quality that transcends the ability to answer convincingly, I have never seen that quality described.
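For readers who want the flavor of that “confidence” mechanism, here is a minimal sketch, in Python, of a threshold-based buzz decision. To be clear, this is not Watson’s code: the scorers, the scores, and the threshold are all invented for illustration, and the real system, as I understand it, weighs evidence from a great many components rather than averaging two of them.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy sketch only -- not IBM's code. Two stand-in "evidence scorers"
# run in parallel (a nod to Watson's parallelism), their outputs are
# averaged into one confidence per candidate answer, and the system
# "buzzes in" only if the best candidate clears a threshold. All
# names, scores, and the threshold are invented for illustration.

def scorer_a(candidate):
    # Stand-in for one evidence scorer (say, passage-match strength).
    return {"Botswana": 0.9, "Kenya": 0.2}.get(candidate, 0.0)

def scorer_b(candidate):
    # Stand-in for an independent scorer (say, answer-type agreement).
    return {"Botswana": 0.7, "Kenya": 0.4}.get(candidate, 0.0)

def confidence(candidate, pool):
    # Run every scorer on the candidate concurrently, then average.
    futures = [pool.submit(s, candidate) for s in (scorer_a, scorer_b)]
    return sum(f.result() for f in futures) / 2

def decide(candidates, threshold=0.6):
    with ThreadPoolExecutor() as pool:
        scored = [(confidence(c, pool), c) for c in candidates]
    best_score, best_answer = max(scored)
    # Buzz only when the top answer's confidence clears the threshold.
    return best_score >= threshold, best_answer, best_score

buzz, answer, score = decide(["Botswana", "Kenya"])
print(buzz, answer, round(score, 2))  # -> True Botswana 0.8
```

Even at this cartoon scale, the interesting part is not any single scorer but the decision rule layered on top, which lets the system stay silent when its evidence is weak.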
Regardless of whether Watson is a total flop or wipes the floor with Ken and Brad, I think the project that built it can teach us a good deal about the problem of interpreting natural language.