It will use a system based on Question Answering, a field of computer science that aims to build software capable of giving accurate, useful answers to questions people pose in natural language.
IBM researchers have been working for two years on the system, code-named “Watson.”
Not So Elementary, Watson
Watson will use semantics and massively parallel processing to understand complex questions, decide how confident it is in its answers, and provide links to supporting evidence.
“Being able to disambiguate meaning is at the core of this technology,” IBM researcher Eric Brown told TechNewsWorld. “The system must be able to decide, when you input the word ‘bank,’ whether you mean a financial institution or an aircraft turning or the bank of a river.”
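To make the “bank” example concrete, here is a minimal sketch of one classic disambiguation approach: count the words a sentence shares with a short description of each candidate meaning and pick the best match (a simplified Lesk-style overlap). It is an illustration only, not IBM's method; the sense labels, the keyword lists and the function names are all assumptions made up for this example.

```python
import re

# Hypothetical mini sense inventory for "bank". The keyword lists are
# illustrative assumptions, not taken from any real dictionary or from Watson.
SENSES = {
    "financial institution": "money deposit loan account savings cash teller",
    "aircraft turn": "aircraft plane pilot wing turn tilt flight",
    "river edge": "river water shore stream fishing mud",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(sentence, senses=SENSES):
    """Return (best_sense, scores) using simple word overlap.

    Each sense is scored by how many of its keywords appear in the
    sentence. Ties go to the sense listed first in the inventory.
    """
    context = tokenize(sentence)
    scores = {
        sense: len(context & tokenize(keywords))
        for sense, keywords in senses.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    for text in (
        "She opened a savings account at the bank",
        "The pilot put the plane into a steep bank",
        "We went fishing on the bank of the river",
    ):
        sense, _ = disambiguate(text)
        print(f"{text!r} -> {sense}")
```

A production system would draw on far richer evidence than keyword overlap, but the sketch shows the basic idea: the surrounding words decide which meaning wins.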
IBM may use one of its Blue Gene supercomputers, running Linux, to compete on “Jeopardy.” Blue Gene supercomputers are used in higher education and government. (On a sweeter note, they’re also used by candy maker Mars in collaboration with the U.S. government to study the genetic code of cocoa trees in order to safeguard the world’s supply of chocolate.)
IBM will probably participate in “Jeopardy” sometime next year, according to Brown.
Still More On Technology
IBM is evaluating various data sources, including encyclopedias and dictionaries, to create a database for the “Jeopardy” challenge.
Its researchers are still working through a number of technical problems as they prepare for the “Jeopardy” appearance.
“There are lots of data scale issues — how to partition the data to best support parallel processing and how to manage that so you avoid disk I/O (input/output) issues as much as possible,” Brown said.
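The partitioning problem Brown describes can be illustrated with a small sketch: split a document collection into shards by hashing each document ID, then search the shards concurrently and merge the results. This is a toy under stated assumptions, not a description of IBM's design; the function names, the shard count and the sample documents are hypothetical, and Python threads stand in here for the separate processors a real system would use.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def partition(docs, num_shards):
    """Assign each (doc_id, text) pair to a shard by hashing the doc ID.

    zlib.crc32 is used because it gives the same value on every run,
    so a document always lands in the same shard.
    """
    shards = [[] for _ in range(num_shards)]
    for doc_id, text in docs.items():
        index = zlib.crc32(doc_id.encode("utf-8")) % num_shards
        shards[index].append((doc_id, text))
    return shards


def search_shard(shard, term):
    """Return the IDs of documents in one shard that contain the term."""
    term = term.lower()
    return [doc_id for doc_id, text in shard if term in text.lower()]


def parallel_search(shards, term):
    """Search every shard concurrently and merge the sorted results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(search_shard, shards, [term] * len(shards)))
    return sorted(doc_id for part in partials for doc_id in part)


if __name__ == "__main__":
    # Hypothetical sample data for illustration.
    docs = {
        "d1": "River bank erosion after the flood",
        "d2": "Bank loans and interest rates",
        "d3": "Genetic code of cocoa trees",
    }
    shards = partition(docs, num_shards=2)
    print(parallel_search(shards, "bank"))
```

Because each shard is held in memory and searched independently, no worker waits on another's data, which is the property that helps keep disk I/O out of the critical path.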
The researchers are also trying to ensure that the raw data they collect is useful and accurate. “If all you have in your underlying data is garbage, that’s another problem, and it’s still a challenge we’re working on,” Brown said.
Getting Ready for Business
The technology behind Watson could be used by enterprises to better analyze their data.
Most importantly, it could enable machine-to-machine communication, letting businesses directly query and analyze the raw data residing on their computers.
“We’re now looking at the computer as more than a calculating machine,” Rob Enderle, principal analyst at the Enderle Group, told TechNewsWorld.
“When you can analyze what lies under the numbers and make educated guesses at what the causes are, that’s a powerful tool. That capability is [critical] in order to get a response from an artificial intelligence system without having to go through a human,” he said.
What Is Truth?
The Question Answering technology at the heart of Watson is still in a state of flux. The language processing community has yet to develop a clearly articulated and commonly accepted grading framework and research methodology, according to the organizers of Open Advancement of QA (OAQA).
OAQA is a workshop that IBM launched in 2008 jointly with several universities, including Carnegie Mellon University.
The workshop will be part of the 21st International Joint Conference on Artificial Intelligence, to be held in Pasadena, Calif., in July.