Sorting Out the Big Data Myths
Debunking the myths around Big Data should be a first step toward making better business decisions about improving a company's data analysis and data management capabilities.
As the volume and purpose of data and business intelligence have dramatically shifted, older notions and misconceptions -- what amount to myths about data infrastructure -- need to be updated and corrected, too.
Posing better questions about data can yield better answers for running data-driven businesses that can efficiently and repeatedly predict dynamic market trends and customer wants in real time.
As the volume and types of data brought to bear on business analytics grow, the means to manage and exploit that sea of data must be neither too costly nor too complex for mid-size companies to master. There are better ways than employing traditional data architectures.
In this podcast, Darin Bartik, executive director of products in the Information Management Group at Dell Software, identifies what works best in modern Big Data management. The interview is conducted by Dana Gardner, principal analyst at Interarbor Solutions.
Listen to the podcast (43:24 minutes).
Following are some excerpts:
Dana Gardner: Are people losing sight of the business value by getting lost in speeds and feeds and technical jargon around Big Data? Is there some sort of a disconnect between the providers and consumers of Big Data?
Darin Bartik: You hit the nail on the head with the first question. We are experiencing a disconnect between the technical side of Big Data and the business value of Big Data, and that's happening because we're digging too deeply into the technology.
With a term like Big Data, or any one of the trends that the information technology industry talks about so much, we tend to think about the technical side of it. But with analytics, and with the whole conversation around Big Data, what we've been stressing with many of our customers is that it starts with a business discussion. It starts with the questions that you're trying to answer about the business; not the technology, the tools, or the architecture of solving those problems. It has to start with the business discussion.
That's a pretty big flip. The traditional approach to BI and reporting has been one of technology frameworks, and a lot of things that were owned more by the IT group. This is part of the reason why a lot of the BI projects of the past struggled, because there was a disconnect between the business goals and the IT methods.
So you're right. There has been a disconnect, and that's what I've been trying to talk a lot about with customers -- how to refocus on the business issues you need to think about, especially in the mid-market, where you maybe don't have as many resources at hand. It can be pretty confusing.
I've been a part of Dell Software since the acquisition of Quest Software. I was a part of that organization for close to 10 years. I've been in technology coming up on 20 years now. I spent a lot of time in enterprise resource planning, supply chain and monitoring, performance management, and infrastructure management, especially on the Microsoft side of the world.
Most recently, as part of Quest, I was running the database management area -- a business very well-known for its products around Oracle, especially Toad, as well as our SQL Server management capabilities. We leveraged that expertise when we started to evolve into BI and analytics.
I started working with Hadoop back in 2008-2009, when it was still very foreign to most people. When Dell acquired Quest, I came in and had the opportunity to take over the Products Group in the ever-expanding world of information management. We're part of the Dell Software Group, which is a big piece of the strategy for Dell overall, and I'm excited to be here.
Without disparaging vendors like us, or anyone else, the current confusion is part of the problem of any hype cycle. Many people jumped on the bandwagon of Big Data, just as everyone was talking about cloud, virtualization, Bring Your Own Device, and so forth.
Everyone jumps on these big trends. So it's very confusing for customers, because there are many different ways to come at the problem. This is why I keep bringing people back to staying focused on what the real opportunity is. It's a business opportunity, not a technical problem or a technical challenge that we start with.
Gardner: Even the name Big Data stirs up myths right from the get-go, with big being a very relative term. Should we only be concerned about this when we have more data than we can manage? What is the relative position of Big Data, and what are some of the myths around the size issue?
Bartik: That's the perfect one to start with. The first word in the definition is actually part of the problem. Big. What does big mean? Is there a certain threshold of petabytes that you have to get to? Or, if you're dealing with petabytes, is it not a problem until you get to exabytes?
It's not a size issue. When I think about Big Data, it's really a trend that has happened as a result of digitizing so much more of the information that we all have already and that we all produce. Machine data, sensor data, all the social media activities and mobile devices are all contributing to the proliferation of data.
It's added a lot more data to our universe, but the real opportunity is to look for small elements of small datasets and look for combinations and patterns within the data that help answer those business questions that I was referencing earlier.
It's not necessarily a scale issue. What is a scale issue is when you get into some of the more complicated analytical processes and you need a certain data volume to make it statistically relevant. But what customers first want to think about is the business problems that they have. Then they have to think about the datasets that they need in order to address those problems.
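As a rough illustration of the point about needing "a certain data volume to make it statistically relevant," the Python sketch below estimates how many records are needed to measure a simple proportion (for example, the share of customers who respond to an offer) within a chosen margin of error. It is not from the interview. It assumes a simple random sample and the standard normal-approximation formula n = z^2 * p * (1 - p) / e^2, and the function name `required_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(margin_of_error, confidence=0.95, proportion=0.5):
    """Estimate the records needed to measure a proportion within +/- margin_of_error.

    Assumes a simple random sample and the normal approximation.
    proportion=0.5 is the most conservative (largest) case.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    return math.ceil(n)


if __name__ == "__main__":
    print(required_sample_size(0.05))  # 385 records for +/- 5 points
    print(required_sample_size(0.01))  # 9604 records for +/- 1 point
```

Under these assumptions, a few hundred to a few thousand well-chosen records can answer a simple business question, which is consistent with the argument that the starting point is the question and the dataset it requires rather than sheer size. More complicated analyses generally need more data than this simple case.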