Researchers: Digital Data Drives Storage Explosion

University of California, Berkeley researchers report that the amount of new information stored on paper, film, optical and magnetic media has doubled in the last three years to 5 exabytes, or 5 million terabytes.

The researchers, supported by tech giants Microsoft, Intel, HP and EMC, said the amount of new information produced in those forms last year alone was the equivalent of 500,000 libraries, each containing a digital version of the print collections of the Library of Congress.
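As a rough check on that comparison, the two figures together imply a size for a single digitized print collection. The short Python sketch below does the division. It assumes decimal units (1 exabyte = 1 million terabytes, the conversion the article itself uses), and the function name is purely illustrative.

```python
# Decimal units, matching the article's "5 exabytes = 5 million terabytes".
TB_PER_EB = 1_000_000


def implied_tb_per_library(total_exabytes: float, libraries: int) -> float:
    """Terabytes per library implied by a total volume and a library count."""
    return total_exabytes * TB_PER_EB / libraries


# 5 exabytes spread over 500,000 Library of Congress-sized print collections
print(implied_tb_per_library(5, 500_000))  # 10.0
```

In other words, the comparison treats one digitized Library of Congress print collection as roughly 10 terabytes.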

Forrester senior storage analyst Anders Lofgren told TechNewsWorld that the IT industry is dealing with the information onslaught through hardware and software, but is still struggling to keep all of that data manageable.

“In general, the headache is that storage continues to grow — and although hardware prices continue to decline, what doesn’t is the cost of managing that storage,” Lofgren said.

Data Channels

The UC Berkeley researchers, who found that worldwide production of information increased 30 percent per year from 1999 to 2002, told attendees at an information storage industry conference in Orlando, Florida, that most of the new information comes in the form of office documents and e-mail rather than books, newspapers and journals.
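The 30 percent annual figure is consistent with the report that stored information doubled in three years, as a quick compounding calculation shows. The Python sketch below assumes a constant growth rate across the period, which is a simplification; the function names are illustrative and are not taken from the study.

```python
import math


def compound_growth(rate: float, years: int) -> float:
    """Total growth factor after compounding `rate` once per year."""
    return (1 + rate) ** years


def doubling_time(rate: float) -> float:
    """Years needed to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + rate)


factor = compound_growth(0.30, 3)
print(f"Growth over three years: {factor:.2f}x")                # 2.20x
print(f"Doubling time at 30%: {doubling_time(0.30):.1f} years")  # 2.6 years
```

At 30 percent a year, output slightly more than doubles in three years, which matches the researchers' headline figure.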

Researchers discovered that 92 percent of new information is stored on magnetic media, primarily hard drives. They also found that peer-to-peer file sharing helped MP3 music and digital video account for 70 percent of the files on P2P users’ hard disks.

Research team leader Peter Lyman said the dropping cost of using magnetic hard drives and optical storage media such as CD-ROM and DVD has fueled the surge in retrieval and storage of files from the Internet. Lyman called the Web a utility that offers easy, steady access for institutions and individuals.

“The democratization of publishing is something that we thought would happen, and it has happened,” he said.

User Overload

The researchers — who used a sampling of nearly 10,000 Web sites and studied desktop disk drives, reports and other information — said the study illustrates the need for effective, reliable and cost-efficient data-storage strategies for consumers as well as corporations.

Meta Group vice president Steve Kleynhans told TechNewsWorld that users can no longer be forced to rely on their own memory to find data on the right server, file or other storage subdivision.

Kleynhans said data no longer centers on text documents and files, instead encompassing Web pages, digital video, pictures, music, Flash animation and more. He added that data is also unlikely to exist in one place going forward.

“In the future, files will exist in multiple places,” he said. “Locating them will be built into the content of that file. These are really important concepts when you’re dealing not with tens of thousands, but hundreds of thousands of pieces of information.”

Unstructured Territory

Even so, the doubling of information stored on paper, film and hard drives paled in comparison with the amount of new information flowing electronically on radio, television and the Internet in 2002, which was nearly 18 exabytes, or 18 million terabytes.

Researchers also reported that the telephone accounts for the largest percentage of information flow, with e-mail placing second.

Forrester’s Lofgren said that while the IT industry has embraced the storage of structured data (such as documents and reports), the handling of unstructured data (such as phone calls and messages) remains a challenge.

“We’re seeing huge growth in unstructured data formats,” Lofgren said. “There’s a fairly significant opportunity there, and obviously there’s an opportunity to sell more hardware.

“There’s also the opportunity to provide the kind of storage capabilities and functionalities in structured data formats for unstructured formats as well.”
