Bringing Order Out of Chaos in the Digital Universe
The underlying value of data is a central tenet of virtually every IT solution, but the ease, speed, accuracy and efficiency of accessing information are central determinants of its value. Displayed in a high security case at the Smithsonian, the Hope Diamond is invaluable. Lost at the bottom of the Mariana Trench under 6+ miles of Pacific Ocean, it's a worthless, albeit pretty, chunk of highly compressed carbon.
When the first EMC-sponsored study aimed at determining the size, scope and implications of digital information growth appeared five years ago, its IDC and University of California, Berkeley, authors employed restraint with the title, "The Expanding Digital Universe." In the follow-up study two years later, "Expanding" morphed into "Diverse and Exploding." The newest iteration, published last month, stretches the guiding concept even further with the notion of "Extracting Value from Chaos."
Why the escalating headline rhetoric? Quite simply because the forecasts in the original study (which seemed somewhat wild-eyed then) appear quaintly understated today. In 2006, IDC and UC researchers posited that the total amount of digital information created and replicated annually by and for businesses and consumers would follow historic data storage growth trends, doubling in size every 18 months or so to expand from 161 to 988 exabytes (EB; 1 EB = 1 billion gigabytes, or GB) by the end of 2010.
But the new study suggests those initial projections were off considerably, concluding that the total volume of digital information created and replicated in 2011 will be 1.8 zettabytes (1 ZB = 1 trillion GB) or nearly twice as much as the estimates for 2010.
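The arithmetic behind these figures can be checked with a short sketch. It assumes decimal units (1 ZB = 1,000 EB, consistent with the definitions above) and treats the span from the 161 EB baseline to the end of 2010 as exactly 48 months. The function name and that 48-month span are illustrative assumptions, not details taken from the study.

```python
# Back-of-the-envelope check of the growth figures quoted above.
# Assumptions made here (not from the study): decimal units, and exactly
# 48 months between the 161 EB baseline and the end-of-2010 forecast.

EB_PER_ZB = 1_000  # 1 ZB = 1 trillion GB, 1 EB = 1 billion GB


def project(start_eb: float, months: float, doubling_months: float = 18.0) -> float:
    """Project a volume forward, assuming it doubles every `doubling_months`."""
    return start_eb * 2 ** (months / doubling_months)


forecast_2010_eb = 988
estimate_2011_eb = 1.8 * EB_PER_ZB

# What strict 18-month doubling from 161 EB would give after four years.
projected = project(161, 48)

# How the new 2011 estimate compares with the original 2010 forecast.
ratio = estimate_2011_eb / forecast_2010_eb

print(f"18-month doubling over 48 months: {projected:.0f} EB")
print(f"2011 estimate vs. 2010 forecast: {ratio:.2f}x")
```

Under these assumptions, strict 18-month doubling lands a little above 1,000 EB, in line with the original 988 EB forecast, while the 1.8 ZB estimate works out to roughly 1.8 times that forecast.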
In other words, the digital universe is expanding and will likely continue to expand at a significantly faster rate than originally assumed. But that's not all; the new study's researchers also state that in the coming decade, the volume of digital information will grow by 50X while the number of data files or containers that data inhabits will grow even faster -- by 75X -- due to the rapid uptake of embedded systems and digital sensor technologies.
Wrap Your Head Around This
Those are big numbers by any measure, but what do they mean practically? That's a point digital universe researchers are pretty good at conveying. In terms of sheer volume, here are some equivalents to 1.8 zettabytes of data:
- Every person in the United States tweeting 3 tweets per minute nonstop for 26,976 years
- Every person in the world having over 215 million high-resolution MRI scans per day
- More than 200 billion HD movies (each two hours in length), which would take one person 47 million years to watch if viewing them 24x7
- The amount of information needed to fill 57.5 billion 32 GB Apple iPads, a number that could be used to accomplish these feats:
  - Create a wall of iPads, 4,005 miles long and 61 feet high, extending from Anchorage, Alaska to Miami, Florida
  - Build a Great iPad Wall of China -- at the same length but twice the average height of the original
  - Build a 20-foot-high wall around South America
  - Pile up into a mountain 25 times higher than Mt. Fuji
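Two of these equivalents are easy to sanity-check. The sketch below is a rough calculation only: it assumes decimal units (1 ZB = 1 trillion GB), exactly 200 billion movies and a 365.25-day year, none of which the study is known to have used.

```python
# Rough check of two of the equivalents listed above.
# Assumptions made here: decimal units, exactly 200 billion movies,
# and a 365.25-day year.

TOTAL_GB = 1.8e12                  # 1.8 ZB expressed in GB

ipads = TOTAL_GB / 32              # number of 32 GB devices needed
movie_hours = 200e9 * 2            # 200 billion two-hour movies
years_to_watch = movie_hours / (24 * 365.25)

print(f"32 GB iPads needed: {ipads / 1e9:.2f} billion")
print(f"Years of nonstop viewing: {years_to_watch / 1e6:.1f} million")
```

With these inputs the results come out close to, but slightly below, the quoted 57.5 billion iPads and 47 million years, which suggests the researchers worked from somewhat different inputs (the movie figure, for instance, is given as "more than" 200 billion).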
The Great iPad Wall of China is an image that would likely cause Apple sales and marketing executives to wet their pants, but it carries profound implications for the billions of people and millions of organizations creating, storing and managing information.
Why so? Because even though the underlying value of data is a central tenet of virtually every IT solution and service, the ease, speed, accuracy and efficiency of accessing information are central determinants of its value.
Displayed in a high security case at the Smithsonian, the Hope Diamond is, by most measures, invaluable. Lost at the bottom of the Mariana Trench under 6+ miles of Pacific Ocean, it's a worthless, albeit pretty, chunk of highly compressed carbon.
The Cost of Chaos
So how does the study suggest digital universe data creators and owners extract "value from chaos"? That depends on whether one is talking about consumers or businesses. Historically, consumers approach data storage and management in a highly piecemeal fashion. Part of this is due to the ongoing evolution of technology products, with capacity and performance marching ever upward while prices continually fall.
A decade ago, 1 GB of hard disk drive (HDD) storage cost around US$10.00. Today, a thousand times more storage (a 1 TB HDD) costs about $50. The result? Consumers buy HDDs and other media as the need arises, or add storage when they upgrade PCs and laptops. This is not an effective path to information management, but it continues despite countless evocative tales of lost data, stolen laptops, hard drive failures and similar unrecoverable disasters.
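Put in per-gigabyte terms, the price drop is easier to see. This minimal sketch uses only the two prices quoted above and assumes a 1 TB drive holds 1,000 GB (decimal units, as drive makers label them).

```python
# Per-gigabyte cost comparison using the two prices quoted above.
# Assumption made here: 1 TB = 1,000 GB (decimal units).

cost_per_gb_then = 10.00           # US$ per GB, a decade ago
cost_per_gb_now = 50.00 / 1_000    # US$50 for a 1 TB drive

price_drop = cost_per_gb_then / cost_per_gb_now

print(f"Cost per GB today: ${cost_per_gb_now:.2f}")
print(f"Price drop per GB: about {price_drop:.0f}x")
```

In other words, a thousand times the capacity for five times the price amounts to roughly a 200-fold fall in the cost of each gigabyte.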
Technologically literate users seem to be moving toward cloud storage solutions in increasing numbers, taking advantage of offerings ranging from automated services like Box.net and EMC's Mozy to free online storage provided by Google and Amazon. However, the vast majority of consumers continue to deal with data disasters passively, learning unnecessarily hard lessons after the fact.
While painful, such stories would carry little import beyond the individuals and families affected except for two things: 1) while consumers create 75 percent of the information in the digital universe, enterprises are responsible for 80 percent of data at some point in its lifecycle, as the study points out; and 2) the role of "digital shadows" (data created about individuals by private and public sector entities) in the digital universe is rapidly expanding, the researchers noted. Both factors heighten the importance of effectively managing informational chaos for organizations.
How to Extract Value From Chaos
Businesses are hardly immune from storage crises, but they can also impose and enforce strict guidelines and policies regarding data management, backup and recovery. More importantly, the new study recommends they take additional steps to enhance information value:
- Use data capture, search, discovery and analysis tools to garner new or additional insight
- Leverage content management to help decide exactly what to store
- Improve system performance and data center energy efficiency via de-duplication, auto-tiering and virtualization technologies
- Enhance IT staff efficiency via automation and other storage management solutions
- Identify and protect information against specific threats by leveraging security devices/software, fraud management systems and reputation protection services
The study also suggests that cloud computing environments -- both public and private and in hybrid combinations -- provide enterprises with new levels of economies of scale, agility and flexibility, compared to traditional IT environments. As a result, they will play key roles in managing the growing complexity and size of organizations' mushrooming information assets.
Final Thoughts
So how does this new digital universe study measure up against both previous iterations and IT reality as we know it in 2011? Pretty well, overall. As has been the case in exploring the physical universe, mapping the digital equivalent requires an accretive approach and process whose value cannot be overstated.
The study's design seems generally solid. That some of its conclusions have changed dramatically over time is a reflection, at least in part, of the fact that technology today is often a radically different market and world than it was in 2006.
I would quibble about some of the report's conclusions -- particularly its focus on the expanding role of public cloud computing. While I agree with the idea in theory, concerns about information security continue to inhibit many businesses from adopting public cloud services and solutions. However, that is likely to change over time, as cloud vendors and service providers develop more flexible, robust and secure solutions, and clients continue to struggle with crushingly expanding volumes of information.
The study also devotes little time or attention to some critical related issues, such as how inhabitants of the digital universe will cope with long-term information archiving. Consider that during the three decades or so of the personal computing revolution, common storage media have changed radically, leaving behind once-ubiquitous technologies including 5.25" and 3.5" floppy disks, Zip disks, Jaz disks and LaserDiscs.
Today, tape media is being relegated to ever-narrower niche archive applications, and many in the industry believe that even HDDs will eventually be supplanted by solid-state drive (SSD) and flash technologies. How businesses and individuals will cope with these known developments, and with others still unknown, is anyone's guess. Perhaps these questions will be addressed in some future digital universe study.
During the 2008 financial crisis, at a time when sales of many consumer and business IT solutions were in decline, data storage continued to thrive. Much of this success can be reasonably attributed to the continually enhanced price/performance of storage solutions. But a common truth in IT and other service-related markets is that a challenge for one person can become an opportunity for others.
Careful study of current and previous EMC-sponsored digital universe efforts suggests that significant challenges and remarkable opportunities will likely continue, so long as people and organizations seek the benefits of information technology.