Rethinking Storage for the Private Cloud
Recent events have lent further credence (if more was needed) to the view that -- despite the boundless hype -- cloud computing is a game-changing phenomenon that is having an impact on how people think about enterprise storage.
VMworld 2010 brought an estimated 17,000 people together -- unheard-of numbers for a trade show in recent years -- for a conference dominated by cloud computing, with storage playing a surprisingly important role in the cloud discussion. The theme of the show -- "Virtual Roads. Actual Clouds" -- served as an extension of EMC World's "Journey to the Private Cloud" last spring.
However, it isn't just EMC and its close relative VMware that are talking cloud -- and particularly, the private cloud. Both HP and Oracle announced bundled cloud solutions focused on jump-starting the implementation of clouds within the enterprise.
On the storage front, the battle between Dell and HP over 3PAR culminated in HP's whopping US$2.4 billion purchase of a storage company touted by pundits as particularly well-suited for deployment in cloud environments.
While there are a number of factors influencing their behavior, these vendors are ultimately being driven by a steady, growing interest in the cloud within IT organizations. For an increasing number of these organizations, the vision of the cloud approach is compelling. This interest has caused many organizations to develop strategies for moving to a private cloud.
From a storage perspective, what does this move to the cloud mean? How does it impact architecture, management and selection of storage technology? What are the key attributes required for storage in a private cloud environment?
To answer these questions, it's helpful to re-examine what makes the cloud, well, a cloud. While there are more definitions of "cloud" -- or wannabe definitions -- than we could possibly mention in this article, there are some relatively common attributes that make up the cloud. For purposes of this discussion, we will focus on key aspects of the private cloud and will therefore sidestep the public vs. private debate.
First, the cloud implies virtualization. I'm not at all suggesting that cloud equals virtualization, but at the core of the cloud concept is the dynamic pooling and sharing of resources. This directly translates to a need for virtualization as a core component.
Another attribute of the cloud is location independence -- the idea that an application running in the cloud has the ability (even if only potentially) to run in different locations. This has implications in terms of how a "cloud" application might be designed and how its data might be stored.
Another important notion of the cloud is automation -- being able to relocate applications and quickly provision or deprovision resources without relying on a set of manual steps performed by an army of administrators.
There is one other key attribute of the cloud that is perhaps the most transformative of all: The cloud requires a services-based approach to IT. This means rethinking delivery and management of IT infrastructure -- including storage -- and transforming from a technology-centric, project-driven mindset to a service-centric methodology that focuses on service levels based on business need, resource planning, and operational process maturity.
Cloud Infrastructure Capabilities
For storage, these attributes imply that the ideal infrastructure for a private cloud must have several capabilities:
- Be able to be quickly provisioned (and reclaimed) -- Agility has become a watchword for the cloud. For storage, this translates into the ability to provide it whenever, or wherever, it's needed. Equally important is the ability to reclaim and repurpose existing storage, a nearly impossible task with traditional storage arrays.
- Readily support a full portfolio of service profiles -- One size fits all does not work in most cases. Instead, the underlying storage will likely need to handle a variety of performance, availability and recovery profiles based on application workload and criticality of data.
- Be highly manageable -- The traditional siloed administrative model that requires long lead times for provisioning and change is contrary to the cloud model. The ability to allocate and manage storage from a common management console, even if on a restricted basis, is key to streamlining process and providing agility.
- Provide options for transparent data relocation -- The cloud ultimately means that data center boundaries are giving way. While it may still be a far reach today for many environments, application relocation across geographies is becoming more feasible. This suggests a large-scale data replication capability that is best accomplished at the storage level with application and host awareness.
- Be highly robust -- As the underlying data repository for numerous cloud-based applications, storage availability becomes a critical factor; an outage could have far-reaching consequences.
- Be flexible and scalable -- The ability to quickly expand capacity to meet demand without a complex set of integration and allocation processes -- and without downtime -- is critical for a production cloud.
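To make the first two capabilities concrete, here is a minimal Python sketch of provisioning and reclaiming volumes from a shared pool, with each volume tagged by a service profile. Everything in it is an illustrative assumption rather than any vendor's API: the class names (ServiceProfile, StoragePool), the "gold" and "bronze" tiers, and the capacity and IOPS figures are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServiceProfile:
    """Hypothetical service tier; names and numbers are illustrative only."""
    name: str
    min_iops: int
    replicated: bool


@dataclass
class StoragePool:
    """A shared capacity pool from which volumes are carved and returned."""
    capacity_gb: int
    volumes: dict = field(default_factory=dict)  # name -> (size_gb, profile)

    @property
    def free_gb(self) -> int:
        # Free space is whatever has not been handed out to a volume.
        return self.capacity_gb - sum(size for size, _ in self.volumes.values())

    def provision(self, name: str, size_gb: int, profile: ServiceProfile) -> None:
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        if size_gb > self.free_gb:
            raise ValueError("insufficient free capacity in pool")
        self.volumes[name] = (size_gb, profile)

    def reclaim(self, name: str) -> int:
        # Removing the volume returns its capacity to the pool immediately.
        size_gb, _ = self.volumes.pop(name)
        return size_gb


GOLD = ServiceProfile("gold", min_iops=5000, replicated=True)
BRONZE = ServiceProfile("bronze", min_iops=500, replicated=False)

pool = StoragePool(capacity_gb=1000)
pool.provision("erp-db", 400, GOLD)
pool.provision("file-archive", 500, BRONZE)
pool.reclaim("file-archive")             # 500 GB goes back to the pool
pool.provision("test-env", 550, BRONZE)  # fits only because of the reclaim
print(pool.free_gb)
```

The point of the sketch is that reclaimed capacity is immediately available for a different workload and a different profile, with no re-carving of fixed partitions.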
Cloud Storage Characteristics
From a technology perspective, these requirements translate into a list of features that can be found, to varying degrees, in the more recent product offerings of a number of storage vendors.
Specifically, storage platforms best able to support a private cloud environment would offer the following:
Pooled disks -- Although the traditional storage model of bound RAID sets with hard-partitioned LUNs has served us well since the early 1990s, that approach is not ideal for the cloud era. To allocate and deallocate storage for the cloud, a pooled model -- either block-based with virtual LUNs or file-based via NFS -- is a quicker and more flexible approach. This virtualized approach to storage also enables important efficiency benefits like thin provisioning and deduplication. In addition, the ability to create large disk pools means that data can be striped across a large number of disks, resulting in a substantial performance boost.
Auto-tiering -- Supporting multiple performance profiles in an agile and flexible manner cries out for something other than the dedicated, static tiered-storage approach that became the norm at the height of the ILM era. For environments with high I/O needs, particularly where solid state technology may be beneficial, the best option may be auto-tiering, which places the hottest blocks on SSDs at the sub-LUN level and promises high value.
Integration into virtualization management frameworks -- Any private cloud is likely to be built around a virtualization platform, and the management console for this platform will likely be the focal point for administering the environment. Storage should not have to be managed as a separate entity from the whole. Therefore, integration of storage administration into the management console for the virtualization platform is important. Realizing this is, of course, a two-way street between virtualization and storage vendors.
Integrated replication and snapshotting -- Replication and snapshot functionality are commonplace in storage systems today. In the cloud, they become required components for both resiliency and location independence. Like other core functions, they should be transparently integrated with the server/virtualization platform.
Modularity, high bandwidth, and scale-out capabilities -- While these are three different attributes, they all contribute significantly to the flexibility and scalability of the cloud. Flexible growth is enabled by expansion through modular devices (as opposed to monolithic frames). The increased bandwidth afforded by technologies like 10Gb Ethernet (and beyond) has a major impact on supporting growth across a range of storage protocols, including iSCSI, NFS and eventually FCoE. Finally, the ability to improve aggregate connectivity through scale-out architectures ensures cloud expandability.
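As a rough illustration of the auto-tiering idea described above, the following Python sketch ranks blocks by how often they were accessed and places the hottest ones on a limited number of SSD slots. It is a simplified assumption about how such a policy could work, not a description of any product: the function name plan_tiers, the ssd_slots parameter and the sample access log are all hypothetical, and real arrays weigh recency, I/O size and migration cost as well.

```python
from collections import Counter


def plan_tiers(access_log, ssd_slots):
    """Return (ssd_blocks, hdd_blocks) for an iterable of accessed block IDs.

    Hypothetical policy: the most frequently accessed blocks go to SSD.
    """
    counts = Counter(access_log)
    # Rank by access count (highest first); break ties by block ID so the
    # result is deterministic.
    ranked = sorted(counts, key=lambda block: (-counts[block], block))
    ssd_blocks = set(ranked[:ssd_slots])
    hdd_blocks = set(ranked[ssd_slots:])
    return ssd_blocks, hdd_blocks


# Sample access log of block IDs (illustrative data only).
access_log = [7, 3, 7, 9, 7, 3, 1, 9, 7, 3]
ssd, hdd = plan_tiers(access_log, ssd_slots=2)
print(sorted(ssd), sorted(hdd))
```

Because placement is decided per block rather than per LUN, a small amount of SSD capacity can absorb most of the I/O from a much larger volume.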
Before the Cloud
The number of platforms that possess many or most of these characteristics is increasing.
Additional factors, such as specific platform support and aggregate connectivity requirements, will likely tilt the selection process in favor of a short list of vendors.
The capabilities discussed here were introduced long before talk of the "cloud" and have been maturing over several years. The onset of the cloud has boosted their importance and provided a context for adoption in the evolving IT paradigm of the next decade.
James Damoulakis is CTO at GlassHouse Technologies.