Twitter Outage: Blame the Data Center

Twitter blamed a “double whammy” within its data center for an outage that took its entire service down for about 40 minutes on July 26.

“Twitter is currently down for <%= reason %>,” the site’s error message read, beginning about 8:20 AM PST that day. “We expect to be back in <%= deadline %>. For more information, check out Twitter Status. Thanks for your patience!”

The outage lasted from about 8:20 AM until 9:00 AM PST, and the service returned to full functionality around 10:25 AM PST.

The reason for the outage? What Twitter vice president of engineering Mazen Rawashdeh referred to as a “double whammy.”

“The cause of today’s outage came from within our data centers,” he wrote in a corporate blog posting July 26. “Data centers are designed to be redundant: when one system fails (as everything does at one time or another), a parallel system takes over. What was noteworthy about today’s outage was the coincidental failure of two parallel systems at nearly the same time.”

Rawashdeh then apologized for the event. “I wish I could say that today’s outage could be explained by the Olympics or even a cascading bug,” he wrote. “Instead, it was due to this infrastructural double-whammy. We are investing aggressively in our systems to avoid this situation in the future.”
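The redundancy Rawashdeh describes, and the way it breaks down, can be illustrated with a minimal Python sketch. This is not Twitter's actual architecture, and Twitter did not say which systems failed. The `System` class, the `serve` function, and the names "primary" and "standby" are hypothetical, chosen only to show why one failure is survivable while two parallel failures at once are not.

```python
from dataclasses import dataclass


@dataclass
class System:
    """A hypothetical data center system that is either healthy or failed."""
    name: str
    healthy: bool = True


def serve(primary: System, standby: System) -> str:
    """Return the name of the system handling traffic, or "OUTAGE" if none can."""
    if primary.healthy:
        return primary.name
    if standby.healthy:
        # Failover: the parallel system takes over when the primary fails.
        return standby.name
    # Both parallel systems are down at once: nothing is left to take over.
    return "OUTAGE"


primary = System("primary")
standby = System("standby")

normal = serve(primary, standby)          # both systems healthy

primary.healthy = False
single_failure = serve(primary, standby)  # redundancy absorbs one failure

standby.healthy = False
double_failure = serve(primary, standby)  # the "double whammy"

print(normal, single_failure, double_failure)
```

In this toy model, the first failure is invisible to users because the standby takes over. Only the second, coincident failure produces an outage.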

Twitter’s Tarnished History in the Data Center

In 2010, Twitter disclosed that it would build a data center in Salt Lake City, Utah. “Having dedicated data centers will give us more capacity to accommodate this growth in users and activity on Twitter,” Jean-Paul Cozzatti, the vice president of engineering at Twitter, wrote at the time, adding that the company would bring additional Twitter-managed data centers online within the next 24 months.

It’s unclear, however, if Twitter ever did so. In April 2011, Reuters reported that the company had signed a four-year, $27 million lease with C7 Data Centers to construct the Utah facility. That was before problems emerged with the facility, ranging from insufficient power to a leaky roof that threatened to soak vital equipment; Twitter apparently either abandoned the project or tried to break the contract.

Instead, according to Reuters, Twitter moved its equipment to Raging Wire, a facility hundreds of miles away in Sacramento, Calif. But space there was tight, and Twitter had to halt feature development because it was unable to procure the server infrastructure to support it.

Cozzatti, now vice president of engineering at Rally, a fundraising tool, did not respond to requests for comment via Twitter. Employees reached at Raging Wire declined to confirm or deny that Twitter was a customer, either now or in the past.

The total site outage suggests, however, that Twitter’s entire operations are housed in one or two data centers that apparently depend on one another. Rawashdeh, the Twitter vice president, also didn’t respond to requests for comment via Twitter.

