By Susan B. Shor TechNewsWorld
08/08/05 1:22 PM PT
The ability to hunt through corporate data that is not stored within easily searched databases, called unstructured data, is becoming more and more important as employees communicate and conduct business through e-mail, word processing, Excel and PowerPoint.
IBM (NYSE: IBM) Research will turn over its data search technology to the open source community, the company said today. The Unstructured Information Management Architecture (UIMA) searches stored data not through keywords, but by analyzing the data within documents to see whether they fit the concepts and facts the user is researching.
It will be made available through SourceForge, a repository for open-source code, by the end of the year, IBM said.
Which Rock?
Nelson Mattos, IBM distinguished engineer and vice president of strategy, WebSphere Information Integration Solutions, used the example of the word "rock," which can mean a stone, a type of music or to move back and forth. Searching for the keyword "rock" will yield documents with all those definitions, but the UIMA search will be able to sort out the irrelevant data.
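The difference Mattos describes can be illustrated with a toy sketch. This is not UIMA's API or IBM's method: the sample documents, the hand-written cue words and the function names below are all hypothetical, and a real analysis engine would use far richer language processing than a word list. The sketch only contrasts a plain keyword match with a search that first labels which sense of "rock" each document uses.

```python
import re

# Hypothetical context cues for each sense of "rock" (an assumption made
# for illustration; not taken from UIMA or any IBM product).
SENSE_CUES = {
    "stone": {"granite", "quarry", "geology", "boulder"},
    "music": {"band", "guitar", "concert", "album"},
    "motion": {"cradle", "chair", "sway", "gently"},
}

# Hypothetical sample documents.
DOCS = [
    "The quarry ships granite rock to builders.",
    "The rock band sold out its concert.",
    "Rock the cradle gently.",
    "Quarterly sales rose in Europe.",
]


def tokens(text):
    """Return the set of lowercase words in a document."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(docs, keyword):
    """Plain keyword match: any document containing the word qualifies."""
    return [d for d in docs if keyword in tokens(d)]


def annotate_sense(doc):
    """Label the sense of "rock" by counting overlapping cue words."""
    words = tokens(doc)
    if "rock" not in words:
        return None
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def concept_search(docs, sense):
    """Return only the documents whose "rock" carries the requested sense."""
    return [d for d in docs if annotate_sense(d) == sense]


if __name__ == "__main__":
    print(keyword_search(DOCS, "rock"))   # every document mentioning "rock"
    print(concept_search(DOCS, "music"))  # only the music-related document
```

In this sketch the keyword search returns all three documents that mention "rock," while the sense-aware search for the music meaning returns only the one about the band.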
"Employees spend about one-third of their time looking for relevant information to get their job done," Mattos told TechNewsWorld. "Eight-five percent of data stored in corporate repositories today is unstructured. Only 15 percent is things you can represent as rows and columns and it is that 15 percent that companies use business intelligence to analyze."
Many Practical Uses
Gathering and analyzing the vast majority of business data can drastically change how companies relate to their clients, because, for instance, they will be able to extract and analyze call center information much more quickly, Mattos said.
The technology has applications beyond enterprises. For example, government agencies could search through all available data, and medical researchers might be able to aggregate information on patients and/or medications and spot patterns earlier.
UIMA, which was four years in development, is incorporated into IBM's WebSphere Information Integrator OmniFind Edition, WebSphere Portal Server and Lotus Workplace. IBM also has the support of Attensity, ClearForest, Cognos (Nasdaq: COGN), Endeca, Factiva, Kana, Inquira, iPhrase, Inxight, nStein, QL2, SAS, Schemalogic, Semagix, SPSS (Nasdaq: SPSS) and Temis, making UIMA a standard framework for searching and analyzing unstructured data.
"The framework will have broad applicability once you have companies building applications on it," Mattos said about the decision to open source. Google (Nasdaq: GOOG), Microsoft (Nasdaq: MSFT) and Yahoo (Nasdaq: YHOO) -- the major search engine competitors -- all offer a desktop search feature, but they are driven by keywords. However, the potential is there, with UIMA being open-sourced, that any one of these companies could take the framework and build new search strategies onto it.