How secure and dependable is the Internet? The Great Twitter Outage of 2009, which shocked the microblogging community and amused many other observers, called into question the reliability of Web-based communications and transaction capabilities that are easy to take for granted.
According to Nielsen NetRatings, the Twitter user base grew by almost 1,400 percent from February 2008 to February 2009. Microblogging, online auctions and email may be a convenience for some users, but others view these features as critical to their connected lifestyles. Is the Internet falling apart, or can we depend on the Web to be there when we need it?
The Twitter outages in August surprised many Internet observers with their persistence — repeated failures over several days — as well as the size and speed of the shock waves sent around the world. Denied their ability to tweet, users turned to blogs, online forums and other channels to vent their frustration. Web and TV outlets were clogged with commentary and speculation by reporters and pundits, many of them Twitter users themselves.
The uproar only increased when word got out that the cause of the outage was a distributed denial of service (DDoS) attack with political motives. When the dust settled and tweets began flowing again, the big question remained: Will it happen again? To answer that question, it is helpful to look back at other Web service outages of similar scale and review their causes.
Professional Courage Required
In July 2006, the MySpace social networking site went off the air for more than 11 hours just two weeks after being named the top Internet property in the U.S. by Internet tracking firm Hitwise. Myspace.com had recorded an astounding 4,300 percent increase in visits over the previous two years, surpassing even search giant Google. In 2005, News Corp. paid US$580 million to acquire MySpace and its personalized home page service.
No doubt, MySpace was a hot Internet property — but in the summer of 2006, the southern California weather was even hotter. Record-breaking temperatures and rolling power blackouts were roasting Los Angeles, the home of MySpace server farms. A power outage hit the MySpace data center and backup power generators failed, leaving over 65 million users in the dark.
How could a half-billion-dollar Internet property fall over so easily? Backup power systems must be exercised routinely to ensure they will be available when they are needed, and it takes a certain amount of professional courage to run backup power tests on a huge server farm.
MySpace CTO Aber Whitcomb detailed the MySpace infrastructure at a Microsoft conference only months before the outage. Myspace.com was running nearly 2,700 Web servers and 650 database servers to handle visitor traffic.
The only way to be sure the current will keep flowing during a real power outage is to periodically cut commercial power and confirm that the uninterruptible power supplies, high-speed transfer switchgear and diesel generators behind them all function properly. It is equally important to host critical Web services in geographically diverse data centers, keeping all the eggs out of a single basket.
Outages, Design Flaws, Sabotage
Power outages are obviously a concern to Web service providers — but a poorly designed Web application can bring a site down just as quickly, and take much longer to troubleshoot and repair. Online auction powerhouse eBay was notorious for application issues in the late 1990s.
Proper multi-tier design of Web applications allows for flexibility in adding capacity as traffic increases and gives the operations staff greater control over load distribution during scheduled maintenance or when other infrastructure issues arise. It is very common for Web applications to start out with monolithic designs that are easy to set up but difficult to grow.
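The flexibility described above comes from putting a dispatching layer in front of interchangeable application servers, so capacity can be added or drained without touching the application itself. The sketch below is a minimal, hypothetical illustration of that idea; the backend names and the round-robin policy are assumptions for demonstration, not any particular site's architecture.

```python
# Hypothetical sketch of the front tier of a multi-tier design:
# a round-robin dispatcher over a pool of interchangeable app servers.
class RoundRobinBalancer:
    def __init__(self, backends):
        self.backends = list(backends)  # names of app-tier servers
        self._next = 0

    def add_backend(self, name):
        # Growing capacity is an operations change, not a code change.
        self.backends.append(name)

    def drain_backend(self, name):
        # Pull a server out of rotation for maintenance.
        self.backends.remove(name)

    def route(self):
        # Hand the next request to the next server in rotation.
        backend = self.backends[self._next % len(self.backends)]
        self._next += 1
        return backend


balancer = RoundRobinBalancer(["app1", "app2"])
balancer.add_backend("app3")  # traffic grew; add a server
targets = [balancer.route() for _ in range(6)]
print(targets)  # requests spread evenly across all three servers
```

A monolithic design has no such seam: the only way to add capacity is a bigger machine, which is exactly the growth trap the paragraph above describes.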
Meg Whitman had been CEO of eBay for only one month when a 22-hour outage hit the auction site in June of 1998. Feeling the direct hit on revenue, Whitman pulled together database and storage vendors, along with network and application engineers, and anyone else who could help figure out the cause of the outage.
Outages continued to dog the popular site over the next year while the company re-architected its application and infrastructure to increase reliability and add flexibility. eBay’s early-mover advantage in the online auction space and its large population of loyal users allowed it to survive those bumps in the road.
Infrastructure and application issues are preventable problems that repeatedly occur at high-growth Internet startups. However, the Twitter outage this year was different: It was sabotage.
When an application or service is purposefully and maliciously made unavailable, it is termed a denial of service (DoS) attack. Harnessing many machines across the Internet to generate that traffic from multiple locations makes it a distributed denial of service (DDoS) attack. DDoS attacks have received a lot of attention recently, but they are nothing new.
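One common building block for absorbing the request floods described above is per-client rate limiting, often implemented as a token bucket: each client accumulates "tokens" at a steady rate and spends one per request, so a flood quickly exhausts its budget while normal traffic passes. The sketch below is a generic illustration with made-up limits, not a description of any real site's defenses, and by itself it would not stop a large distributed attack coming from thousands of sources.

```python
import time

# Hypothetical sketch: a token bucket limiting one client's request rate.
# The rate and capacity values here are illustrative only.
class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens for the time elapsed, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # spend one token on this request
            return True
        return False          # over budget: reject the request


bucket = TokenBucket(rate=5, capacity=10)
results = [bucket.allow() for _ in range(15)]
print(results.count(True))  # only the first 10 of a rapid burst get through
```

Against a distributed attack, each zombie stays under its own per-client budget, which is why DDoS defense also requires upstream filtering and spare capacity rather than rate limiting alone.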
Electronic Pearl Harbor?
In February 2000, the Yahoo Web portal was knocked off the air for hours by a DDoS attack. A 15-year-old high school student in Montreal had carefully planned the attack and arranged to launch it automatically while he was at school.
Michael Calce, also known online as “Mafiaboy,” had worked for many months to collect a posse of compromised servers at universities and corporations that had large amounts of bandwidth available. The combined power of the network of “zombie” machines was enough to bring Yahoo to its knees.
Surprised by the success of his attack on Yahoo and flush with excitement over the power at his command, Calce followed up with subsequent attacks on other well-known Internet sites including Amazon, E*Trade and CNN.
At the time, it seemed that the Internet itself was under attack by a large, shadowy army that could appear and disappear as quickly as it wished. Was this the “electronic Pearl Harbor” that computer experts had been predicting? President Clinton pulled together a meeting of top industry executives to discuss the attacks and plan ways to increase the resiliency of Internet services.
Fast-forward to the Twitter outages this summer, and we find a DDoS attack targeting a service provider with insufficient infrastructure to withstand the strike. Twitter is hosted at a single service provider rather than at multiple providers in multiple data centers. This severely limited Twitter's ability to respond to a coordinated onslaught that turned the power of the Internet against the popular service.
Similar attacks on Google and Facebook were not able to knock the services offline. Internet applications and the technology that supports them have come a long way in the past 15 years. Managed hosting providers, cloud-based service providers like Amazon EC2 and sophisticated data replication features in database management systems all give today’s Web developers a rich set of tools for building high-availability services.
One thing remains constant: New users can flock to a Web destination at a phenomenal rate. Properly managing capacity against such aggressive growth curves requires a proactive investment in people, technology and security infrastructure that few investors have been willing to make since the Internet bubble burst at the turn of the century.
Ray Dickenson is CTO of Authentium, a provider of security software solutions.