Azure, Google Talk, Twitter Outages Illustrate Data Downtime Issues

“Man, now I’ll have to do actual work instead of Tweeting.”

It was a day of outages for some of the biggest brands in data, with Twitter, Google, and Microsoft all experiencing some degree of downtime.

First, Microsoft’s Windows Azure service experienced an “availability issue” in its West Europe region, according to the Windows Azure Service Dashboard and subsequent news reports.

That situation seemed resolved by mid-morning EST, by which time a Google Talk outage was radically boosting workplace productivity across the nation. A posting on the Google Apps Status Dashboard stated that an unspecified problem with the service was “affecting a majority of users.” A few hours later, an updated posting reported that the problem had been “resolved,” without identifying a cause.

Then Twitter went down. “Users may be experiencing issues accessing Twitter,” read a note on Twitter’s Status Webpage. “Our engineers are currently working to resolve the issue.”

By early afternoon EST, Twitter seemed up and running, workplace productivity plunged again, and everything was seemingly right with the world.

Despite that happy ending, the rolling outages illustrated a particular problem facing IT vendors as they wrestle with the always-on nature of the cloud. Downtime is a fact of life for any cloud service, of course, and often hits companies such as Google and Amazon a few times a year.

But as more and more companies turn to cloud services for data analysis and productivity functions, the potential impact of downtime grows. “For every major enterprise, the ability to continue operations in the event of disruptive external events is critical,” Hiren Desai, a member of Cisco’s Advanced Services Product Management team, wrote in a July 9 posting on the official Cisco blog. “These organizations need to do more than demonstrate the ability to recover quickly from a disaster—they need to show that whatever the threat, they are able to continue doing business seamlessly.”

Desai’s solution, given his position, is for companies to sign onto services that offer IT infrastructure flexibility and resiliency. But for companies relying on public services for their massive data needs, the solution may simply boil down to crossing fingers, knocking on wood, and hoping for the best. Most service-level agreements between vendors and companies do offer guidelines with regard to downtime, and outages such as the ones today can compel prudent companies to double-check those agreements.
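For readers doing that double-checking, downtime guidelines in a service-level agreement are commonly expressed as an availability percentage over a billing period. The sketch below shows the arithmetic for turning such a percentage into permitted minutes of downtime. The function name, the 30-day period, and the sample percentages are illustrative assumptions, not figures from any vendor's actual agreement.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: int = 30) -> float:
    """Return the minutes of downtime permitted over the period.

    availability_pct is a hypothetical SLA figure such as 99.9;
    period_days is an assumed billing period (30 days by default).
    """
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Illustrative availability levels only; check the real agreement for actual terms.
    for pct in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% availability allows about {minutes:.1f} minutes of downtime per 30 days")
```

Comparing the result against the length of an actual outage gives a rough sense of whether an incident falls inside or outside what an agreement tolerates, though real agreements may define measurement windows and exclusions differently.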


Image: Ronald Sumners/