Well, it’s been an eventful few weeks here in the Linux blogosphere, what with all the scandals that have erupted recently over Digg and Ubuntu’s off-and-on romance with Dell, to name just two.
Then there was Debian’s birthday on Monday! For some heartwarming Linuxy love, you won’t want to miss the Debian Appreciation Day page, which received almost 3,000 messages marking the event. Happy 17th, Debian!
By far the hottest topic in recent days, however, was news that Canonical has begun tracking Ubuntu installations.
No User-Specific Data
It’s true! The new “canonical-census” package apparently sends an “I am alive” ping to Canonical each day as a way to help the company track the users of OEM Ubuntu installations, according to a report on Phoronix.
No user-specific data is sent, Phoronix notes; rather, the package reportedly transmits only the operating system version, the machine product name and a counter.
“This information will obviously be valuable to see whether customers are keeping around their Ubuntu installations or just wiping them and just how often Ubuntu is being used on these systems,” Phoronix’s Michael Larabel pointed out.
Plus, “for those not wanting to participate in this anonymous data gathering process, they could always sudo apt-get remove canonical-census,” he added.
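For readers wondering what so small a report might look like, here is a minimal sketch built only from the three fields Phoronix lists. The function name, the field names and the JSON format are assumptions made for illustration; nothing here is taken from the actual canonical-census package, and the sketch sends nothing over the network.

```python
import json


def build_census_ping(os_version: str, product_name: str, counter: int) -> str:
    """Hypothetical payload holding only the three fields named in the report."""
    payload = {
        "os_version": os_version,      # operating system version
        "product_name": product_name,  # machine product name
        "counter": counter,            # running count of pings sent so far
    }
    # Sorted keys keep the serialized form stable from one run to the next.
    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    # Example values are invented for illustration.
    print(build_census_ping("Ubuntu 10.04", "Example Laptop 1234", 42))
```

Note that the payload itself carries no username, hostname or IP address, though any network request would still reveal the sender’s IP address to the receiving server.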
‘There Should Be an Outcry’
The story, not surprisingly, made quite a splash on Slashdot, where more than 500 comments followed in short order.
“I would probably let the feature stay enabled,” wrote Zumbs, for instance. “But I do want easy access (no code digging) to see what information is being collected, who gets access to it and an easy way of turning the feature off. And I would consider it a courtesy if Canonical actually asked me.”
Similarly, “it seems like any kind of Linux usage statistics you see these days are just a load of hot air,” noted unixcrab. “Hopefully this will provide some solid data and hopefully Canonical will make it public.”
On the other hand, “why should the details of the article negate the fact that this is a privacy issue, and there should be an outcry about it? Does the fact that it’s only happening against a subset of installs matter?” asked Richard_at_work. “Not really. Does the fact that there is an *opt-out* option? Again, not really, as it’s tracking usage — this should be opt-in for definite.”
Given the importance of usage statistics in Linux’s ongoing desktop battle, Linux Girl vowed to learn more.
‘Should Be Turned Off by Default’
“It’s good if the public has access to the data,” Hyperlogos blogger Martin Espinoza opined. “Otherwise it’s just spyware, even if it’s allegedly not collecting personally identifiable information.”
Similarly, “it’s one thing to ask nicely — people don’t mind returning a favor,” began Barbara Hudson, a blogger on Slashdot who goes by “Tom” on the site. “It’s another thing to track someone without asking.”
Tracking without permission is illegal in several countries, Hudson told Linux Girl. “For example, criminal court decisions in Canada went against the police when it was held that IP addresses are ‘personal identifiable information,’ and you need either permission or a warrant to collect them.”
A user accessing a Web server is consenting, “because the server needs the IP address to know where to send the files to,” she pointed out. “That’s quite a big difference from your computer sending its IP address to someone every day, possibly without you even being aware of it.”
At the very least, “this is something that should be turned off by default,” Hudson concluded.
‘This May Be Part of an Exit Strategy’
Slashdot blogger hairyfeet had a surprising theory for the motivations behind Canonical’s move.
“This may be part of an exit strategy on the part of Canonical,” he suggested.
“They are losing serious money on Ubuntu, and as we saw with Mandriva, it is nearly impossible to make enough money from a Linux desktop to pay for R&D, much less grow, thanks to the DIY nature of Linux users,” he explained. “So there is a good chance that Canonical is collecting this data to see where their money has gone and to decide whether to stay in the desktop game.”
‘We Need Hard Numbers’
Others, however, saw it differently.
“There are hundreds of conflicting estimates of how many Linux machines are out there,” Montreal consultant and Slashdot blogger Gerhard Mack pointed out. “Knowing how many people are using Linux is important because larger numbers provide leverage when we go asking for drivers from hardware manufacturers, apps from software makers or try and convince OEMs to bundle Linux.”
Currently, “Microsoft has the advantage because they just count every machine sold with a Windows license as one for their team, so we need hard numbers to back our side of the story,” Mack added.
Indeed, such data “could be very useful to defeat FUD and to encourage OEMs to put more effort into selling GNU/Linux,” blogger Robert Pogson agreed.
“I would expect that boxes running GNU/Linux should be kept running longer by their owners because they work well and reliably, even on older machines,” Pogson suggested. “That other OS is replaced periodically because it fails and owners keep hoping the next release will be better, even if the old machines have to be scrapped.”
A few months of statistics, then, “should be able to demonstrate that owners keep Ubuntu running,” he opined. “A few years of statistics should be able to demonstrate the longevity of GNU/Linux systems.”
‘An Open Standard for Reporting’
In fact, Pogson would even go so far as to encourage every distro “to join an open standard for reporting the existence of GNU/Linux systems.”
“It would require some kind of authentication certificate cranked out by installers, because just a ping would be easily falsified,” he noted. “It cannot be just an install-time thing, because one can install repeatedly on a single system. Any system like WGdisA would likely be unacceptable.
“Do we need a FLOSS organization that issues authentication certificates?” he asked. “Is the kernel the right place to put code that would define GNU/Linux systems? What if a firewall were put up to block the pings?”
Whichever the case, it “would require consensus to implement,” he concluded.
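Pogson’s point that a bare ping is easy to falsify can be illustrated with a small sketch. It swaps in a simpler technique than the certificates he describes: a shared secret issued at install time and an HMAC-SHA256 signature over each ping. The function names and the idea of a per-install secret are assumptions for illustration, not part of any existing distro or reporting standard.

```python
import hashlib
import hmac


def sign_ping(secret: bytes, message: bytes) -> str:
    """Return a hex HMAC-SHA256 signature of the ping under the install secret."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_ping(secret: bytes, message: bytes, signature: str) -> bool:
    """Check a ping's signature using a constant-time comparison."""
    expected = sign_ping(secret, message)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    # Hypothetical secret handed out by an installer; the value is invented.
    install_secret = b"example-install-secret"
    ping = b"os=GNU/Linux;counter=7"

    signature = sign_ping(install_secret, ping)
    print(verify_ping(install_secret, ping, signature))        # genuine ping
    print(verify_ping(b"wrong-secret", ping, signature))       # forged sender
```

A scheme like this only shows that a ping came from someone holding a valid secret. It does not by itself address the repeated-install problem Pogson raises, and it still leaves open who would issue the secrets.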
The need for hard numbers shouldn’t override disclosure.
The standard GNU with the agree box checked by default would be a usual sort of compromise.