Drenched by the Cloud (Amazon downtime)

A startup company, victimized by Amazon EC2’s failure, tells its tale. It’s not pretty.

By Sandra Gittlen
May 09, 2011 05:24 am | CFOworld

Thursday, April 21, is a day that Michael Downing, the CEO and CFO of social media start-up Tout, won’t soon forget. In the wee hours of the morning, Downing learned a harsh lesson: cloud computing is not bulletproof.

Tout, which had launched its real-time video status update service a week and a half earlier, was among the numerous customers taken down by Amazon’s EC2 outage. Not only was its main database, which houses critical account information, affected, but Downing quickly learned that the company’s application server partner, Heroku, was also an Amazon customer -- and also offline. “The first 90 days is the critical time when you’re trying to establish your brand and you build momentum. That wasn’t possible when our systems were at a complete standstill,” Downing says.

Before this incident, Downing was proud that more than 90% of his applications were being hosted in the cloud so the company could get off the ground without the shackles of high infrastructure costs. “I’ve trusted and used cloud services for years and this technology is transformational for the start-up world,” he says.

Broken Trust

That trust is now irrevocably broken, he says. While Heroku came back online relatively quickly, his database remained down for almost 48 hours. At some point, after little communication from Amazon about a fix, Downing and his team uploaded a three-day-old snapshot of the database to a server at another Amazon location – far from the ailing Virginia data center. “Although we permanently lost some data, we were at least able to get back online,” he says.

As much as a week after the incident began, Downing says, Amazon still hadn’t contacted him, other than through generic mass messages, to explain the outage, which we now know stemmed from a configuration error. “Part of the whole value proposition when you sign on for these services is there will be no one single point of failure and even if a whole node goes down, your systems won’t be tanked. This was a huge eye opener that proved that is definitely not the case,” he says.

As this story was being published, Amazon hadn't responded to a request for comment.



About this Entry

This page contains a single entry by Staff published on May 9, 2011 4:45 PM.
