Tech's Hard-Boiled Progeny: The Data Journalist
The cigar-chomping reporter in the saggy brown suit with holes in his shoes and a nose for news is a stereotype that doesn't have much of a counterpart in today's real world. There's a new breed of investigative reporter in town: the geek who knows how to extract raw data from public sources, crunch the numbers, and spew out compelling analyses -- often with startling visuals to match.
04/16/13 5:00 AM PT
When we think of traditional news gatherers, we might conjure up the image of an obstreperous character brazenly hassling a slimy official for the real story -- or hovering paparazzi harassing a poor celebrity innocently shopping for handbags in Beverly Hills.
However, there have been some technology-driven changes since Hollywood handed us those stock characters, and they could severely alter the picture. Today, your jaded, hard-drinking, courtroom-loitering newsman could just as well be represented on the silver screen by an enthusiastic, bookwormish computer geek. This is thanks to a new form of reporting known as "data journalism."
Data journalism crept into modern media through the back door. You may not even have noticed.
What It Is
A byproduct of all computing is data -- lots of it. Lots of pretty much useless numbers sitting around in computers. This data is often tabulated systematically in database form, however, and it's often publicly available -- paid for in many cases by tax dollars.
Even if it's not neatly stored in databases, it's often regurgitated into paper-based or PDF reports that can be accessed, sometimes through Freedom of Information Act requests. The data can then be scanned, cleaned, and converted into organized databases. Then it can be sorted into usable intelligence about a particular subject, limited only by the analyst's -- or data journalist's -- creativity.
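As a rough illustration of that cleaning-and-converting step, here is a minimal Python sketch. Everything in it is an assumption made for the example: the sample text stands in for whatever might come out of a scanned report, and the column names, the `incidents` table, and the helper functions are hypothetical rather than any newsroom's actual workflow.

```python
import csv
import io
import sqlite3

# Hypothetical raw text, as it might look after extraction from a report.
RAW_REPORT = """date,neighborhood,category,count
2013-01-07, Downtown ,Violent,12
2013-01-07,downtown,property, 30
2013-01-14,Eastside,VIOLENT,n/a
2013-01-14,Eastside,Property,18
"""

def clean_rows(raw_text):
    """Normalize whitespace and case; skip rows whose count is not a number."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        count = row["count"].strip()
        if not count.isdigit():
            continue  # unusable value; a real project would flag it for review
        cleaned.append((
            row["date"].strip(),
            row["neighborhood"].strip().title(),
            row["category"].strip().lower(),
            int(count),
        ))
    return cleaned

def load_database(rows):
    """Store the cleaned rows in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE incidents "
        "(date TEXT, neighborhood TEXT, category TEXT, count INTEGER)"
    )
    conn.executemany("INSERT INTO incidents VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn

conn = load_database(clean_rows(RAW_REPORT))
totals = conn.execute(
    "SELECT neighborhood, SUM(count) FROM incidents "
    "GROUP BY neighborhood ORDER BY neighborhood"
).fetchall()
print(totals)
```

Once the rows sit in a table, sorting them into something usable is a matter of asking different questions of the same data, which is where the analyst's creativity comes in.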
A good example of this kind of lurking data is crime numbers -- public statistics provided by local police departments. The data is collected as part of the police force's daily operations, and it's published for anyone who bothers to look at it. It's pretty drab stuff -- numbers up, numbers down -- until a data journalist gets hold of it.
A data journalist will collect those numbers over time -- often delivered by the agency weekly or monthly, along with location details like addresses. That lets the data journalist generate maps, visualizations, reports and adjectives for the neighborhoods within the agency's jurisdiction, thus letting readers keep current on local crime.
The data journalist can also drill down on behalf of the reader, interpreting whether violent crime or property crime has risen in a particular area, for example. The Los Angeles Times produces mapping like this.
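A minimal Python sketch of that kind of drill-down might look like the following. The records, neighborhood names, and the `percent_change` helper are all invented for illustration; they are not drawn from any police department's data or from the Los Angeles Times' methods.

```python
from collections import defaultdict

# Hypothetical records: (month, neighborhood, category, incident count).
RECORDS = [
    ("2013-02", "Downtown", "violent", 10),
    ("2013-02", "Downtown", "property", 40),
    ("2013-03", "Downtown", "violent", 15),
    ("2013-03", "Downtown", "property", 30),
    ("2013-02", "Eastside", "violent", 4),
    ("2013-03", "Eastside", "violent", 4),
]

def percent_change(records, earlier, later):
    """Return {(neighborhood, category): percent change} between two months."""
    totals = defaultdict(int)
    for month, neighborhood, category, count in records:
        totals[(neighborhood, category, month)] += count

    # Every neighborhood/category pair seen in the data.
    pairs = {(n, c) for (n, c, _month) in totals}

    changes = {}
    for neighborhood, category in sorted(pairs):
        before = totals.get((neighborhood, category, earlier), 0)
        after = totals.get((neighborhood, category, later), 0)
        if before == 0:
            continue  # percent change is undefined without a baseline
        changes[(neighborhood, category)] = (after - before) / before * 100
    return changes

changes = percent_change(RECORDS, "2013-02", "2013-03")
for (neighborhood, category), change in changes.items():
    print(f"{neighborhood} {category}: {change:+.1f}%")
```

With these made-up figures, violent crime in the hypothetical Downtown area comes out up 50 percent while property crime is down 25 percent, the sort of contrast that a single citywide total would hide.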
That's it. That's the basic concept.
"Data journalism is a fast-developing field that has transformed investigative reporting across newsrooms for decades in the U.S. and more recently in the UK," said Minal Patel of City University London's Center for Investigative Journalism.
Patel organizes sellout boot camps and advanced statistics courses for reporters at the Center, which has played a pioneering role in introducing data journalism training to the UK by bringing over specialists like David Donald, data editor at the Center for Public Integrity in Washington, D.C.
"The power of data journalism, or "computer assisted reporting" as it is called in the U.S., is its ability to allow journalists to interrogate vast data sets to find public interest stories that would otherwise go unreported," Patel told TechNewsWorld.
One major proponent of data journalism is UK daily newspaper The Guardian, which frequently creates data visualizations -- often maps -- relating to current news. Recent examples include maps of Chicago's gun murders, created to explore whether there's any relationship between the physical location of gun shops outside the city and killings within the city limits.
Data journalism has become a good source of stories and Web traffic for The Guardian, Simon Rogers, editor of the paper's Datablog, told TechNewsWorld.
"It also precisely fits with our mission to provide open journalism," he said. "By analyzing the data, showing our workings, and publishing it for all to use, we are showing that open journalism in action -- and it works for us on quite a small team based in The Guardian's newsroom."
Other notable recent Guardian-published visualizations show how the world's 7,000 nuclear warheads are deployed globally; compare gun ownership by country; and map the home addresses of suspects in the 2011 UK riots against economic indicators in an effort to understand the role poverty may have played.
The Data Journalism Handbook
Journalists can get a sense of how to get started with their own projects by consulting the Data Journalism Handbook.
"As just one indicator of how fast this area is growing, in just the last few weeks we've seen job ads for data journalists from organizations as diverse as The Times in London, Le Temps in Geneva, and Al Jazeera and PBS in Washington," said the Open Knowledge Foundation's Jonathan Gray, who works as an editor of the handbook.
More and more organizations are interested in how they can use data more effectively to pursue their goals, he told TechNewsWorld, and that includes media organizations and investigative journalism networks.