Bringing Order to Data Backup Chaos
Enterprise backup is broken, but there are methods being developed to fix it.
Nowadays, methods for backing up and protecting enterprise data are fragmented, complex and inefficient. However, some new approaches are helping to simplify the process, keep costs in check, and improve recovery speed and confidence.
How did data protection become such a mess? What new techniques are helping to gain comprehensive and standard control over the data lifecycle?
For answers to these questions, and more, listen to a podcast featuring John Maxwell, vice president of product management for data protection at Quest Software (now part of Dell), and George Crump, founder and lead analyst at Storage Switzerland, an analyst firm focused on the storage market. The chat is moderated by Dana Gardner.
Download the podcast (36:07 minutes) or use the player:
Here are some excerpts:
Dana Gardner: Why has something seemingly as straightforward as backup become so fragmented and disorganized?
John Maxwell: Dana, I think it's a perfect storm, to use an overused cliche. If you look back 20 years ago, we had heterogeneous environments, but they were much simpler. There were NetWare and Unix, and there was this new thing called Windows. Virtualization didn't even really exist. We backed up data to tape, and a lot of data was in terabytes, not petabytes.
Flash forward to 2012, and there's more heterogeneity than ever. You have stalwart databases like Microsoft SQL Server and Oracle, but then you have new apps being built on MySQL. You now have virtualization, and, in fact, we're at the point this year where we're surpassing the 50 percent mark on the number of servers worldwide that are virtualized.
Now we're even starting to see people running multiple hypervisors, so it's not even just one virtualization platform anymore, either. So the environment has gotten bigger, much bigger than we ever thought it could or would. We have numerous customers today that have data measured in petabytes, and we have a lot more applications to deal with.
And last, but not least, we now have more data that's deemed mission critical, and by mission critical, I mean data that has to be recovered in less than an hour. Surveys 10 years ago showed that in a typical IT environment, 10 percent of the data was mission critical. Today, surveys show that it's 50 percent and more.
Gardner: George, did John leave anything out? From your perspective, why is it different now?
Crump: A couple of things. I would dovetail into what he just mentioned about mission criticality. There are definitely more platforms, and that's a challenge, but the expectation of the user is just higher. The term I use for it is IT is getting "Facebooked."
I've had many IT guys say to me, "One of the common responses I get from my users is, 'My Facebook account is never down.'" So there is this really high expectation on availability, returning data, and things of that nature that probably isn't really fair, but it's reality.
One of the reasons that more data is getting classified as mission critical is just that the expectation that everything will be around forever is much higher.
The other thing that we forget sometimes is that the backup process, especially a network backup -- probably unlike any other -- stresses every single component in the infrastructure. You're pulling data off of a local storage device on a server, it's going through that server CPU and memory, it's going down a network card, down a network cable, to a switch, to another card, into some sort of storage device, be it disk or tape.
So there are 15 things that happen in a backup and all 15 things have to go flawlessly. If one thing is broken, the backup fails, and, of course, it's the IT guy's fault. It's just a complex environment, and I don't know of another process that pushes on all aspects of the environment in one fell swoop like backup does.
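[Editor's sketch] Crump's point that every link in the chain has to work can be put into rough numbers. The snippet below is a minimal illustration, not something from the podcast: it assumes the 15 steps fail independently and share the same per-step reliability, and both the function name and the sample reliability figures are hypothetical. Under those assumptions, even 99 percent reliability per step leaves only about an 86 percent chance that a given backup completes cleanly.

```python
def chain_success_probability(per_step_reliability: float, steps: int) -> float:
    """Chance that every step in a serial chain succeeds.

    Assumes (hypothetically) that steps fail independently and that each
    has the same reliability, so the result is reliability ** steps.
    """
    if not 0.0 <= per_step_reliability <= 1.0:
        raise ValueError("per_step_reliability must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_reliability ** steps


if __name__ == "__main__":
    # 15 is the number of steps Crump mentions; the reliabilities are made up.
    for reliability in (0.999, 0.99, 0.95):
        success = chain_success_probability(reliability, 15)
        print(f"{reliability:.3f} per step -> {success:.3f} for the whole backup")
```

Real components are neither equally reliable nor independent, so this is only a way to see why a long serial chain is fragile, not a model of any particular environment.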
Gardner: So the stakes are higher, the expectations are higher, the scale and volume and heterogeneity are all increased. What does this mean, John, for those that are tasked with managing this, or trying to get a handle on it as a process, rather than a technology-by-technology approach?
Maxwell: There are two issues here. One, you expect today's storage administrator, or sysadmin, to be a database administrator (DBA), a VMware administrator, a Unix sysadmin, and a Windows admin. That's a lot of responsibility, but that's the fact.
A lot of people think that they are going to have as deep a level of knowledge on how to recover a Windows server as they would an Oracle database. That's just not the case, and it's the same thing from a product perspective, from a technology perspective.
Is there really such a thing as a backup product, the Swiss Army knife, that does the best of everything? Probably not, because being the best of everything means different things to different accounts. It means one thing for the small to medium-size business (SMB), and it could mean something altogether different for the enterprise.
We've now gotten into a situation where we have the typical IT environment using multiple backup products that, in most cases, have nothing in common. They have a lot of hands in the pot trying to manage data protection and restore data, and it has become a tangled mess.
Gardner: Before we dive a little bit deeper into some of these major areas, I'd like to just visit another issue that's very top of mind for many organizations, and that's security, compliance, and business continuity types of issues, risk mitigation issues. George Crump, how important is that to consider, when you look at taking more of a comprehensive or a holistic view of this backup and data-protection issue?
Crump: It's a really critical issue, and there are two ramifications. Probably the one that strikes fear in the heart of every CEO on the planet is all the disclosure laws that exist now that say that, when you lose a customer's data, you have to let him know. Unfortunately, probably the only effective way to do that is to let everybody know.
I'm sure everybody listening to this podcast has gotten more than one letter already this year saying their Social Security number has been exposed, things like that. I can think of three or four I've already gotten this year.
So there is the downside of legally having to admit you made a mistake, and then there are the legal requirements of retaining information in case of a lawsuit. The traditional thing was that if I got a discovery motion filed against me, I needed to be able to pull this information back, and that was one motivator. But the bigger motivator is having to disclose that we did lose data.
And there's a new one coming in. We're hearing about big data, analytics, and things like that. All of that is based on being able to access old information in some form, pull it back from something, and be able to analyze it.
That is leading many, many organizations to not delete anything. If you don't delete anything, how do you store it? A disk-only type of solution forever, as an example, is a pretty expensive solution. I know disk has gotten a lot cheaper, but forever, that's a really long time to keep the lights on, so to speak.