Tech leaders representing Google, Wikipedia and the Center for Democracy and Technology testified Tuesday before the Senate Committee on Homeland Security and Governmental Affairs, which is mulling reauthorization of the E-Government Act. Signed into law five years ago, it requires government agencies to make information more accessible electronically.
Many of the Act’s goals have not been met, primarily due to the technical difficulty in retrieving government information through a general search. Committee Chairman Joe Lieberman wants to reauthorize E-Gov, pushing federal agencies to upgrade their search capability and privacy policies.
Suggestions from such tech luminaries as Wikipedia founder Jimmy Wales and Google’s James Needham, who manages the search giant’s partnerships with governments, included greater use of wikis and other collaboration tools on federal Web sites, and the adoption of an industry standard that would make indexing government information easier.
Difficult to Find
Both suggestions would address the main barrier to making data available to search engines, which are seemingly locked out of government Web sites that are available to individual users.
“So much of the information that is intended to be used by the public is in a dynamic database that requires the users to drill down looking for information,” Greg Jarboe, spokesperson for Sempo (Search Engine Marketing Professional Organization), told TechNewsWorld.
That makes it difficult for a search engine spider to index the content, because the results are generated in response to a query rather than existing as unique pages. “It requires a human to input what he or she is looking for,” he explained.
Though Sitemap — the standard urged by Needham — was developed by Google over the past year, it has the support of all the major search engines, Jarboe said. “There are no competing protocols.”
Essentially, Sitemap allows an agency to identify which pages can be crawled by a search engine.
“Pages you don’t want crawled you don’t submit to Sitemap,” continued Jarboe.
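To make the idea concrete, here is a minimal sketch of how an agency might generate a Sitemap file that lists only the pages it wants crawled. The URLs, the `public` flag and the `build_sitemap` helper are hypothetical and used purely for illustration; only the `urlset`, `url` and `loc` elements and the namespace come from the published Sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return Sitemap XML listing only the pages marked as public.

    `pages` is a list of dicts with hypothetical keys "loc" and "public".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        if not page["public"]:
            # Pages the agency does not want crawled are simply not listed.
            continue
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical database-backed pages on an agency site.
PAGES = [
    {"loc": "https://agency.example.gov/records?id=1001", "public": True},
    {"loc": "https://agency.example.gov/records?id=1002", "public": True},
    {"loc": "https://agency.example.gov/internal/case-7", "public": False},
]

sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)
```

Listing each database record as its own URL gives a crawler a direct path to pages it could otherwise reach only by submitting a query. Note that leaving a page out of the file only keeps it from being advertised to crawlers; it is not an access control in itself.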
It is relatively easy to protect sensitive information that might affect national security or individuals’ privacy, he noted. “After all, there are a lot of commercial sites that don’t want their credit card information crawled; that type of information is fairly straightforward to protect.”
Even if the feds decide to adopt the suggested tools, few expect a wealth of government data to suddenly show up on a Google search.
For starters, government agencies move at their own speed, Jarboe observed, which typically doesn’t match the deployment pace of their profit-minded counterparts.
Also, all government agencies — even those at the municipal level — would need to be involved, noted Metatomix cofounder and CTO Colin Britton. Metatomix provides search technology to help public sector organizations connect disparate information sources on criminal offenders at the local, county and state level.
More standards need to be developed, he told TechNewsWorld. “The ability to understand the context and content of the information — and then tag it appropriately — is key. Once you have that, then you can start to apply rules and privacy policies.”
The fear of violating privacy and national security laws, though, will likely be the bête noire of any such initiatives, said Andrew B. Serwin, a partner in Foley & Lardner’s intellectual property litigation group and its information technology and outsourcing group.
“The big issue around what will essentially be the creation of a massive electronic database of information is that there is a higher chance of potential misuse,” Serwin told TechNewsWorld.
“What’s missing in this discussion, however, is the need for enhanced security by federal agencies, which are the stewards of the public’s information,” Kevin Richards, U.S. federal government relations manager at Symantec, told TechNewsWorld.
“Today there’s a real challenge for public agencies to determine to what extent information ought to be available,” Richards pointed out.