Powerset, a new Internet search engine company, has launched a proof-of-concept salvo in the search engine war: a natural language processing engine that attempts to understand the meaning behind a search phrase and the text it searches. Instead of returning a long list of results that simply contain the keywords a user enters, Powerset attempts to unlock the meaning encoded in ordinary human language.
Understanding that meaning is a difficult task. Any parent with a teenager knows how easy it is to speak different languages, never mind what happens when a MySpace page enters the picture. The inherent difficulty of understanding a shifting language used across unstructured Web pages is one reason Powerset differs from a universal search engine like Google. Right now, Powerset searches only Wikipedia and Freebase, which present information in a reasonably consistent way.
“Our first product has only touched the surface of what our technology will allow,” commented Lorenzo Thione, cofounder and product architect at Powerset. “Our team of computational linguists, computer scientists and engineers, together with the PARC technology we licensed, has allowed us to develop a solid platform to begin to change the way people consume content.”
Powerset shows promise for changing the way people search, according to Greg Sterling, an analyst for Sterling Market Intelligence.
“They are doing some interesting things on a number of levels. They’ve created a technology that does seem capable of understanding concepts in some larger context than simply keyword density, and that’s impressive,” he told TechNewsWorld.
“I also like some of the things they are doing with the interface — grouping related searches and structured content, in addition to sort of general results — also, the way you can preview content in a meaningful way,” he added.
For users who’ve rarely ventured beyond the search results presented by the likes of Google, Yahoo and MSN, the simple attempt to present information in a variety of ways can seem startling yet intuitive. For instance, Powerset returns a section of information that it believes answers a search, followed by a list of “Factz” gleaned from Wikipedia. A search for “George Bush” offers three different tabs of results, the first two of which cover the two presidents of the United States by that name. The Factz then let a user click on additional terms to drill down into key facts, or at least assertions on Wikipedia that appear to be facts. Some examples include which bills Bush signed, declarations of war, whom he defeated and the like.
Users can simply follow links to the primary Wikipedia pages that cover the basic topic, of course, but Powerset goes a bit further. It can pull together information from a multitude of Wikipedia and Freebase pages.
“It’s interesting that it actually comprehends the words that it reads. That’s got potential, though it is probably overkill for the way we use search now,” Danny Sullivan, editor-in-chief of Search Engine Land, told TechNewsWorld.
Much of the power is determined by the scope of what’s available for the search. For instance, searching for “who killed Goliath” on Powerset brings up a comic book reference that says Goliath was killed by a clone of Thor. Plus, there’s the distinction between the literal question (who did the killing?) and the figurative usage, in which the phrase has little to do with the Bible story and much to do with the metaphor of a small guy taking on a big guy and winning.
Enter the same search term into Google, and the top 10 results cover whether it was David or Elhanan, depending on the Bible translation. A clone of Thor is nowhere to be found in Google’s first-page results. So which was the searcher looking for? Comic book Goliath or Bible Goliath?
Powerset doesn’t appear likely to be suitable for searching the entire Web any time soon, but it could expand to search other types of high-value Web sites with structured content. One obvious example would be running Powerset against government databases, Sterling said.
Right now, much of the speculation surrounding Powerset’s future centers on whether a larger company will gobble it up. Microsoft is widely seen as a likely buyer, followed by Google and Yahoo, as well as any organization that could turn Powerset into an enterprise search offering focused on large businesses.
“It’s like a cool car at an auto show, a prototype car that you’re not sure is going to go into production — there’s some uncertainty if they are going to bring it to the mainstream,” Sterling explained.
“Someone is going to do something — it’s going to have a life — but what it’s going to be is still open,” he added.