Organizations of all types are increasingly turning to business intelligence (BI) applications — a combination of data mining, statistical, and numerical analysis and data visualization — to better manage and make sense of the ever-growing amounts of data they are collecting.
Continuing the mantra of Part 1 of this two-part series, BI solutions are attractive to firms that want to make better, more informed decisions based on those mountains of data.
Because these data sets often contain granular levels of detail and may be spread across disparate databases and other systems, analysts, managers and executives are increasingly turning to data visualization tools and methods to represent data more effectively and to improve their ability to identify key variables and interrelationships.
Concurrently, BI systems providers such as Advizor Solutions are turning to efficient data and memory management techniques, such as in-memory data management, to make this kind of interactive visual exploration practical.
In-Memory Data Management
An essential and distinguishing feature of BI systems is their ability to manage and manipulate the vast amounts of data stored in data warehouses. The Advizor Solutions BI system, for example, approaches this problem by making use of in-memory data management.
“Click stream data expands dramatically the size of each respondent’s record. Our samples tend to run into the thousands or even tens of thousands,” Usability Sciences’ CEO Jeff Schueler said.
“This would make true ad hoc investigation of the data very slow on a regular desktop or laptop, but Advizor reduces the data to flat files, which allows for in-memory processing and workable response time for the analyst. Without these flat files, we would likely need computers with far greater processing power which, of course, cost a lot more,” he added.
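As a rough illustration of the flat-file, in-memory approach Schueler describes, the Python sketch below reads a small flat file entirely into memory and then runs an ad hoc filter against it. The column names and sample rows are invented for the example, and nothing here reflects how Advizor itself stores or scans data.

```python
import csv
import io

# Hypothetical flat-file extract; the column names are illustrative only.
FLAT_FILE = """respondent_id,pages_viewed,satisfaction
r1,12,4
r2,3,2
r3,25,5
r4,7,2
"""

def load_flat_file(text):
    """Read the whole flat file into memory as a list of typed rows."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "respondent_id": rec["respondent_id"],
            "pages_viewed": int(rec["pages_viewed"]),
            "satisfaction": int(rec["satisfaction"]),
        })
    return rows

def ad_hoc_filter(rows, predicate):
    """Scan the in-memory rows; no database round trip is involved."""
    return [row for row in rows if predicate(row)]

rows = load_flat_file(FLAT_FILE)
low_satisfaction = ad_hoc_filter(rows, lambda r: r["satisfaction"] <= 2)
print([r["respondent_id"] for r in low_satisfaction])
```

Because every row already sits in memory, each new question is just another pass over the same list, which is what keeps ad hoc queries responsive on modest hardware.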
“Reporting and dashboard applications had been successful in providing consistent visibility into operations and timely performance monitoring, but were not forward looking. The ability to more successfully plot the future course of business has increased demand for predictive analytics in BI implementations,” claimed Steve Trammell, head of business intelligence alliances at ESRI.
“Customer requests for a more complete operational picture and the ability to be more proactive have led to the combination of these two technologies. Regulatory requirements have also raised the visibility of both technologies within many organizations,” he added.
BI Goes to College
BI applications and predictive analytics can be put to good use in universities, according to Advizor Solutions CEO Doug Cogswell. “Universities have rich databases with all kinds of information on their students. Management is continually trying to understand this information to make better decisions. This data is particularly wide — often 150 or more columns for each student.
“Most people can visually explore 10 to 20 dimensions. It is almost impossible to visually interpret 150 — some kind of mathematics is required to cut the focus list down to something reasonable. This is the core of our IP (information processing),” Cogswell said.
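To make the idea of cutting a wide table down to a short focus list concrete, here is a minimal Python sketch. It ranks columns by how much the rate of leaving varies across each column's values and keeps the top few. This simple spread measure is a stand-in chosen for illustration, not the mathematics Advizor uses, and the student records and column names are made up.

```python
from collections import defaultdict

# Hypothetical student records; a real table might have 150 or more columns.
students = [
    {"income": "low",  "housing": "on",  "major": "arts", "left": True},
    {"income": "low",  "housing": "off", "major": "sci",  "left": True},
    {"income": "mid",  "housing": "on",  "major": "arts", "left": False},
    {"income": "mid",  "housing": "off", "major": "sci",  "left": True},
    {"income": "high", "housing": "on",  "major": "arts", "left": False},
    {"income": "high", "housing": "off", "major": "sci",  "left": False},
]

def rate_spread(rows, column, target="left"):
    """Difference between the highest and lowest target rate across the column's values."""
    counts = defaultdict(lambda: [0, 0])  # value -> [target hits, total rows]
    for row in rows:
        counts[row[column]][0] += row[target]
        counts[row[column]][1] += 1
    rates = [hits / total for hits, total in counts.values()]
    return max(rates) - min(rates)

def top_columns(rows, k, target="left"):
    """Return the k columns whose values separate the target group most strongly."""
    columns = [c for c in rows[0] if c != target]
    ranked = sorted(columns, key=lambda c: rate_spread(rows, c, target), reverse=True)
    return ranked[:k]

print(top_columns(students, 1))
```

In this toy data, income separates leavers from stayers far more sharply than housing or major, so it would be the first dimension to examine visually.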
Cogswell illustrated how Advizor works with a typical query that university administrators might pose: “What is the profile of students who leave after their first year?” The specified data set (students who left after their sophomore year) is loaded from an Oracle database, or from a Cognos or Business Objects reporting system, into Advizor’s in-memory data pool.
Users can then build analytics dashboard pages with various charts as a first attempt at understanding the composition of the data set.
“Is it complete? Where are there holes? What seems to make a difference?” are the types of questions typically asked at this stage of the inquiry process, Cogswell explained.
Mining Data Subsets
Subsets of intuitively and statistically significant data can be selected and explored further by pointing and clicking on graphical objects; say, the part of a graph representing last year’s sophomore class. In the same manner, the query can then be refined by drilling down and zooming in on those students who left the school.
In order to build and assess a statistical model, analysts can make use of Advizor’s mining wizard, which can set a target field to a visually selected population. The model returns “a delineation of which factors (columns) explain what percentage of this selected group,” according to Cogswell.
Each factor is then broken into positive influence and negative influence categories.
“For example, maybe family income came out as a key factor, and ‘less than $75,000’ was a strong influence on leaving; ‘$75,000 to $125,000’ was neutral and ‘more than $125,000’ was a negative influence on leaving,” Cogswell explained.
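One common way to express this kind of positive, neutral or negative influence is lift: the rate of leaving within a bucket divided by the overall rate. The Python sketch below applies that idea to the income brackets from Cogswell's example. The counts and the 10 percent tolerance are invented, and lift is an assumption made for illustration rather than a description of Advizor's model.

```python
def influence_by_bucket(counts, tolerance=0.1):
    """Label each bucket by lift; counts maps bucket -> (students who left, total students)."""
    total_left = sum(left for left, _ in counts.values())
    total = sum(n for _, n in counts.values())
    base_rate = total_left / total
    result = {}
    for bucket, (left, n) in counts.items():
        lift = (left / n) / base_rate
        if lift > 1 + tolerance:
            label = "positive"   # bucket leaves more often than average
        elif lift < 1 - tolerance:
            label = "negative"   # bucket leaves less often than average
        else:
            label = "neutral"
        result[bucket] = (round(lift, 2), label)
    return result

# Made-up counts for illustration only.
income_counts = {
    "less than $75,000": (30, 100),
    "$75,000 to $125,000": (20, 100),
    "more than $125,000": (10, 100),
}

influence = influence_by_bucket(income_counts)
for bucket, (lift, label) in influence.items():
    print(bucket, lift, label)
```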
“This mining input provides guidance for which 8 to 10 dimensions to examine visually. The model can also be saved, and then run against this year’s incoming class to score who is at risk for not making it through, and the information can be used to establish early on corrective action,” he said.
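The save-then-score workflow can be sketched in a few lines of Python. Here the "model" is nothing more than a hypothetical table of leave rates per income bucket, saved as JSON and later applied to a new class. The rates, the 0.25 threshold and the field names are all assumptions made for the example.

```python
import json

# Hypothetical "model": leave rate per income bucket, learned from last year's data.
model = {"factor": "income", "rates": {"low": 0.30, "mid": 0.20, "high": 0.10}}

# Persist the model; a JSON string stands in for a saved model file.
saved = json.dumps(model)

def score(saved_model, students, threshold=0.25):
    """Return the ids of students whose estimated risk meets the threshold."""
    loaded = json.loads(saved_model)
    factor, rates = loaded["factor"], loaded["rates"]
    at_risk = []
    for student in students:
        # Buckets the model has never seen are treated as zero risk in this sketch.
        risk = rates.get(student[factor], 0.0)
        if risk >= threshold:
            at_risk.append(student["id"])
    return at_risk

incoming = [
    {"id": "s1", "income": "low"},
    {"id": "s2", "income": "high"},
    {"id": "s3", "income": "low"},
]
print(score(saved, incoming))
```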
It’s the efficient integration of the system’s data mining, visualization charts and in-memory data pool that imbues the Advizor system with fast, intuitive ad hoc querying and data presentation capabilities.
“This means the mining is integrated to the charts through the data pool,” Cogswell continued. “A selection on a chart will go to the mining, and output from the mining models will be reflected in the charts. We are not aware of anybody else who has integrated these three technologies (predictive mining, data visualization and in-memory-data-management) into one application.”
Advizor’s BI tools enable Usability Sciences analysts to graphically display and manipulate visual objects (what they call “buckets”) of behavioral data as easily as they manipulate attitudinal and metrics data, Usability Sciences’ Schueler continued.
“Since all three data sets are integrated, we segment the data from any perspective, and those selections are propagated across each other ‘view’ into the data,” he said.
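The propagation Schueler describes resembles a shared selection held in a common data pool, with every attached view refreshing when the selection changes. The Python sketch below shows that pattern in miniature. The class names, columns and sample rows are hypothetical and are not taken from Advizor's product.

```python
class DataPool:
    """Holds the rows plus one shared selection that every view reads."""

    def __init__(self, rows):
        self.rows = rows
        self.selected = set(range(len(rows)))  # start with everything selected
        self.views = []

    def attach(self, view):
        self.views.append(view)
        view.refresh(self)

    def select(self, predicate):
        """Change the shared selection and notify every attached view."""
        self.selected = {i for i, row in enumerate(self.rows) if predicate(row)}
        for view in self.views:
            view.refresh(self)


class CountByView:
    """A stand-in for a chart: counts selected rows per value of one column."""

    def __init__(self, column):
        self.column = column
        self.counts = {}

    def refresh(self, pool):
        counts = {}
        for i in sorted(pool.selected):
            key = pool.rows[i][self.column]
            counts[key] = counts.get(key, 0) + 1
        self.counts = counts


rows = [
    {"segment": "new", "rating": "low", "clicks": 12},
    {"segment": "new", "rating": "high", "clicks": 3},
    {"segment": "returning", "rating": "high", "clicks": 15},
    {"segment": "returning", "rating": "low", "clicks": 2},
]
pool = DataPool(rows)
by_segment, by_rating = CountByView("segment"), CountByView("rating")
pool.attach(by_segment)
pool.attach(by_rating)

# A selection made on behavioral data (clicks) updates both other views.
pool.select(lambda r: r["clicks"] >= 10)
print(by_segment.counts, by_rating.counts)
```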
Form and Explore
“The value of this to an analyst is immense,” he continued. “Ad hoc data analysis is all about the forming and investigation of hypotheses. The Advizor tools allow our analysts to form and explore — validate or reject — more hypotheses in a given amount of time than would be possible using other BI tools.”
The system’s interactive, graphically driven features include linked coloring, selection and filtering. A set of predictive statistical and numerical analytic methods complements these “visual discovery” features.
“We say four things here: display, interaction, sharing and authoring,” Cogswell elaborated. “The key aspects of display are having a range of views to capture all common business requirements. We would argue that our 15 chart objects represent the widest range on the market. It is critical that charts cover detail (actual rows and columns), summary (tabular, bar chart, pie chart), hierarchy (heatmap, constellation), geographical (map), statistical (parabox, box plot), comparison (scatterplot, timetable), etc.”
To share and extend the results of models for analytical, predictive and forecasting purposes, users can export filtered data to other applications and database systems, including Web applications, Excel spreadsheets, Oracle databases, reporting tools and CRM engines.
“It gets a bit more complicated if there are multiple linked tables, especially if there are one-to-many relationships across the tables, but the idea is the same,” Cogswell concluded.