The rapid growth of data generated by government agencies will drive federal IT investment in the tools needed to manage colossal amounts of bits and bytes. As a result, the government will be spending increasing amounts of its funds on Big Data capabilities.
Federal agencies spent about US$4.9 billion on Big Data resources in fiscal 2012, according to estimates from Deltek, an IT consultancy. The annual amount of such spending will grow to $5.7 billion in 2014 and then to $7.2 billion by 2017 with a compound annual growth rate of 8.2 percent.
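The spending projection above is ordinary compound-growth arithmetic, and it can be checked with a short sketch. The function names below are illustrative, and the sketch assumes fiscal 2012 is the base year, with two compounding periods to 2014 and five to 2017. Deltek's published figures are rounded, so they need not reproduce the stated rate exactly.

```python
def compound_growth(start, rate, years):
    """Project a starting amount forward at a fixed annual growth rate."""
    return start * (1 + rate) ** years


def implied_cagr(start, end, years):
    """Compound annual growth rate implied by a start value and an end value."""
    return (end / start) ** (1 / years) - 1


BASE_YEAR = 2012          # assumption: fiscal 2012 is the base year
BASE_SPENDING = 4.9       # US$ billions, Deltek estimate quoted above
STATED_RATE = 0.082       # 8.2 percent, as quoted above

# Project the base figure forward at the stated rate.
for year in (2014, 2017):
    projected = compound_growth(BASE_SPENDING, STATED_RATE, year - BASE_YEAR)
    print(f"{year}: about ${projected:.1f} billion at the stated rate")

# Work backward from the rounded 2012 and 2017 figures to the rate they imply.
rate = implied_cagr(BASE_SPENDING, 7.2, 2017 - BASE_YEAR)
print(f"Rate implied by the rounded figures: {rate:.1%}")
```

Running the sketch shows that the quoted dollar figures and the quoted growth rate agree to within rounding.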
“This will not necessarily be new money that is added to current IT budgets. Mainly it will be from redirecting normal annual appropriations to Big Data investments,” Alex Rossino, principal research analyst at Deltek, told the E-Commerce Times.
Currently, federal agencies are pursuing 155 identifiable Big Data projects involving procurements, grants or related activities, according to Deltek. The agencies with the most projects include the Department of Health and Human Services with 34 projects; the Defense Department, 31; the Department of Energy, 19; the National Science Foundation, 16; and the Department of Veterans Affairs, 12.
While Big Data is often associated with big federal agencies, even smaller units still have challenges in meeting data management goals. For example, the Federal Communications Commission, with just 2,000 employees, is one of the largest collectors of information through regulatory notices and currently has 400 different registered data collection initiatives.
“The general focus of Big Data is on how much data federal agencies have available, but another side of the issue is managing the collection of data as well, which can be just as complex,” said Greg Elin, chief data officer at the FCC and a panelist at a Big Data seminar hosted by Deltek on March 14.
How Big Is Big Data?
One of the first tasks in managing Big Data is scoping out exactly what is meant by the term. At the seminar, Rossino characterized Big Data as referring to data sets “so massive, varied, and rapidly accumulating that traditional analysis tools and computing resources cannot be used to yield timely analytical insight.” To manage such massive amounts of information, IT resources must be “scaled out” to handle analysis of the data, he said.
The volume of data stored by federal agencies will increase from 1.6 petabytes currently to 2.6 petabytes within two years, Rossino said, noting that the estimates were conservative. Traditional data analytics tools currently function at terabyte-scale processing capacity, one level below petabyte on the storage scale.
Yet even that capability will soon be insufficient. “What we are hearing from federal agencies is how to get to the exascale level from petascale, and that will drive a research demand in IT for entirely new modeling. Things are not going to be done in the same way in the future,” Mark Luker, associate director of the Networking and Information Technology Research and Development Program (NITRD), said at the seminar. NITRD is a federal program directed by the White House Office of Science and Technology Policy.
From a non-geek perspective, a terabyte is equivalent to the printed pages derived from 50,000 trees. Each upward step in the scale, from terabyte to petabyte to exabyte, multiplies capacity by a factor of 1,000.
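The storage scale discussed above can be made concrete with a small sketch. It assumes decimal (SI) prefixes, in which each unit is 1,000 times the previous one; binary conventions that use 1,024 would give slightly different numbers. The names are illustrative only.

```python
# Decimal (SI) storage units in ascending order; each is 1,000x the previous one.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte",
         "terabyte", "petabyte", "exabyte"]


def bytes_in(unit):
    """Number of bytes in one of the named units, using decimal prefixes."""
    return 1000 ** UNITS.index(unit)


def ratio(larger, smaller):
    """How many of the smaller unit fit into one of the larger unit."""
    return bytes_in(larger) // bytes_in(smaller)


def convert(amount, from_unit, to_unit):
    """Convert an amount from one unit to another."""
    return amount * bytes_in(from_unit) / bytes_in(to_unit)


print("Terabytes per petabyte:", ratio("petabyte", "terabyte"))
print("Terabytes per exabyte:", ratio("exabyte", "terabyte"))

# The petabyte estimates quoted above, restated in terabytes.
for petabytes in (1.6, 2.6):
    terabytes = convert(petabytes, "petabyte", "terabyte")
    print(f"{petabytes} petabytes is about {terabytes:,.0f} terabytes")
```

The jump from petascale to exascale that Luker describes is therefore another factor of 1,000 on top of the step from terabytes to petabytes.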
The Elusive Big Data Market
Vendors addressing the Big Data challenge should be aware that there is really no single, identifiable “market” in the traditional sense, Rossino noted. Instead, a variety of solutions will be required to deal with the challenge.
The integration of various tools will embrace storage, hardware, analytical software, and consulting services, as federal agencies have limited competency and personnel to pursue solutions, he added. In fact, agencies will need not only technical IT assistance but also subject matter expertise in managing high volumes of data.
However, agencies will not have an easy time in implementing a Big Data strategy. Impediments include shaky budgets, a shortage of skilled personnel, and concerns about data quality and security. Agencies will also have to deal with planning and governance issues. Those can include moving from smaller scale applications to enterprise-wide strategies that encompass an entire Cabinet-level department or a complete agency environment.
A perennial issue regarding federal IT innovation that comes into play with Big Data is the procurement process. “I really can’t do much on various aspects of Big Data management if I can’t get something that I need, whether it’s hardware or software,” the FCC’s Elin said at the seminar.
“The current waterfall aspect of IT procurement generally inhibits flexibility for innovation. The emphasis should be on the end result of the procurement rather than the process,” Elin told the E-Commerce Times.
He was referring to the traditional waterfall engineering approach, in which design requirements are set up front and implemented in sequential steps through completion. The alternative for Big Data and other IT innovations would be the more flexible agile method, in which projects are developed in modular increments so that requirements can be revised along the way rather than locked to a static, predetermined outcome.
A potential avenue for federal agencies to pursue in developing Big Data capabilities would be to incorporate a data management component into a cloud solution. For example, the savings provided by moving an existing resident data storage function to a cloud configuration could be used to cover the addition of a Big Data component to the cloud transition.
“Technically that would be a workable option for agencies,” Luker told the E-Commerce Times.
A Serious Federal Effort
The seriousness of the federal effort in Big Data was underscored by the Obama administration’s plan for promoting research and development efforts designed to generate data management tools for handling the ever increasing volume of federally generated data.
In March 2012, the administration launched a “Big Data Initiative” in which six major agencies committed themselves to supporting efforts covering a range of topics from health care to geo-spatial research and national security data. The Defense Department alone said it would spend as much as $250 million a year on Big Data projects.
At the Deltek seminar, Luker noted that one of the latest examples of the federal commitment to improve access to the huge store of government information was the recent White House directive to make research and scientific data generated by federal agencies — including the results of scientists receiving federal grants — much more accessible to the public. The February 22, 2013 directive covers peer-reviewed publications and digital data.
The result of the directive, said John Holdren, director of the White House Office of Science and Technology Policy, “will accelerate scientific breakthroughs and innovation, promote entrepreneurship, and enhance economic growth and job creation.”