Google Will Forget You Asked
Mar 15, 2007 1:34 PM PT
Internet search giant Google announced Wednesday it will take steps to improve the way it handles data obtained as millions of consumers search for products and information online.
"We will continue to keep server log data (so that we can improve Google's services and protect them from security and other abuses) -- but will make this data much more anonymous, so that it can no longer be identified with individual users," Peter Fleischer, privacy counsel-Europe, and Nicole Wong, deputy general counsel, posted on Google's company blog.
Google users logged some 3.3 billion search queries during January alone, and on average submit about 91 million search queries each day, according to comScore, a digital tracking firm. Rather than deleting that information automatically, Google collects the details of each search, such as keyword queries, IP addresses, and cookies -- digital trackers that log a user's online habits -- and stores them in enormous server farms located all over the world.
Previously, the company kept that data for "as long as it was useful." However, once the policy change is enacted, that information will be scrubbed after 18 to 24 months to remove or change bits of data in the IP address and cookies so that it can no longer be associated with a specific computer or user.
"Changing the bits of an IP address makes it less likely that the IP address can be associated with a specific computer or user. Cookie anonymization makes it less likely that a cookie can be used to identify a user," the company explained.
The policy change applies to any information the company is not legally required to retain for a longer period of time. Google estimates that it will take its engineers about one year to resolve the policy's technical details and that it will provide additional information as it becomes available.
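Because Google has not yet settled on its technical approach, the following Python sketch is only an illustration of what "changing the bits" of an IP address and anonymizing a cookie could look like. It assumes that the low-order bits of an IPv4 address are zeroed and that cookie IDs are replaced with a keyed hash whose key is later discarded. The function names and the choice of eight bits are hypothetical and are not drawn from Google's announcement.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(address: str, bits_to_clear: int = 8) -> str:
    """Zero the lowest `bits_to_clear` bits of an IPv4 address.

    Hypothetical helper: clearing 8 bits maps up to 256 addresses
    to the same logged value, so one machine is harder to single out.
    """
    if not 0 <= bits_to_clear <= 32:
        raise ValueError("bits_to_clear must be between 0 and 32")
    ip = ipaddress.IPv4Address(address)
    # Build a 32-bit mask with the low-order bits set to zero.
    mask = (0xFFFFFFFF << bits_to_clear) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(int(ip) & mask))


def pseudonymize_cookie(cookie_id: str, key: bytes) -> str:
    """Replace a cookie ID with an HMAC-SHA256 digest.

    Assumption for illustration: if `key` is destroyed after the logs
    are rewritten, the digest cannot be recomputed from a cookie ID.
    """
    return hmac.new(key, cookie_id.encode("utf-8"), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    print(anonymize_ipv4("192.0.2.123"))        # last octet cleared
    print(anonymize_ipv4("192.0.2.123", 16))    # last two octets cleared
```

Clearing more bits makes each logged address cover a larger group of computers, at the cost of less precise data for the service improvements and abuse protection Google says the logs support.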
"We're still developing the precise technical methods and approach to this, but we believe these changes will be a significant addition to protecting user privacy," Google stated. "We'll communicate more as we work out these details, but for now, we wanted you to know that we're working on this additional step to strengthen your privacy."
Ghost of Subpoenas Past
The decision to alter its data log retention policy came in the wake of feedback from "numerous privacy stakeholders," including regulators, privacy advocates, consumer protection groups and users around the world, according to Google.
The policy switch could also be a response to Google's tussle in 2006 with the U.S. Department of Justice, John Barrett, research director at Parks Associates, told TechNewsWorld. Google refused to comply with a government subpoena demanding random data on some Web searches and Web sites it indexes. Google cofounder and president Sergey Brin said that it was the company's "obligation to use the law to the farthest possible means to protect our users' privacy."
Yahoo, Microsoft and AOL had complied with their subpoenas and provided the requested information. Google opposed the request not because it was viewed as a violation of privacy, but because the company said the request was too broad and overreaching. In the end, a federal judge ordered the government to limit its request to only 50,000 URLs.
"Google has quite a bit of information out there, and they know that if users aren't comfortable using the services they provide, they will stop using it, and obviously that is not in their interest," Barrett explained.
"The data information has a certain amount of value to it, but that is counterbalanced with the privacy concerns that could be raised," Barrett added. "Google is trying to keep its users happy here and make sure they still feel comfortable using their service."
The policy change was incremental and something that privacy advocates have long called for, Andrew Frank, a Gartner analyst, told TechNewsWorld. "There's pretty unanimous contention that this is a step in the right direction," he stated.
"There is a significant portion of the [privacy] community that would like to see them hold information for no longer than it is necessary to complete a transaction," he added.
Google's shorter data retention time was a great first step, but more is needed to protect users' privacy, Rebecca Jeschke, a spokesperson for digital rights advocacy group the Electronic Frontier Foundation, told TechNewsWorld.
"That should be six months or even just 30 days," she stated. "There's no reason that it needs those records for two years."
If Google is going to keep personally identifiable information for up to two years, then it needs to give users a better idea of how it uses the information, Jeschke argued.
While the amount of time Google keeps the data is important, securing the information against misuse and providing greater transparency in how it is used are more important, according to Frank.
"I'd like to see a lot of companies, including Google, that are collecting a lot of personally identifiable information, or even gray-area personally identifiable information, to submit to much more independent kinds of auditing activities that would give us better assurances than the self-policing norms we have today," he explained.