As you probably know, Hurricane Sandy is causing devastation across the east coast of North America. As a techy person in the UK, the severity of the situation hit home for me when one of my favourite web apps, Trello, went offline.
Initially I wasn’t sure if this was related, so I dug into it a little bit, and found out that they host with Peer1 in their New York data centre. Peer1 are pretty reputable in hosting terms, so hearing that they had manually shut down one of their data centres was pretty big news for me.
Most data centres have backup generators to cope with a loss of power. At 14:40 on 29/10/2012, Peer1 switched to their backup generators because the mains electricity supply had been affected.
By 19:30 that evening, they were still running on their generators without problems. The basement of the building they are in began to flood, but they estimated they had 12-24 hours of fuel available (some 5,000 gallons of diesel). By 05:00 the following morning, they estimated 4 hours of fuel remained. By 06:30 they began a controlled shutdown of the whole data centre. Thousands of websites hosted there began to go offline, including Trello.
It turned out that even though the diesel generators are on the 17th floor of the building, the pumps that send fuel up to them were located on a floor that flooded, and were submerged. With no way to run the pumps and get fuel up to the generators, they only had a finite amount of diesel left.
In situations like that, people club together. Groups of people began carrying 55-gallon drums of diesel up 17 flights of stairs to power the generators, and they managed to get the data centre back online within a few hours.
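To get a feel for the scale of that effort, here is a rough back-of-envelope sketch in Python. It assumes the 5,000 gallons quoted above corresponds to the full 12-24 hour estimate, and that the burn rate is constant; both are my assumptions rather than figures from Peer1, and the names in the code are made up for illustration.

```python
# Back-of-envelope estimate only. Assumes (not confirmed by Peer1) that
# 5,000 gallons lasts the quoted 12-24 hours at a constant burn rate.
TANK_GALLONS = 5000
DRUM_GALLONS = 55


def drums_per_hour(runtime_hours: float) -> float:
    """Drums that must reach the generators each hour to match the burn rate."""
    burn_rate = TANK_GALLONS / runtime_hours  # gallons per hour
    return burn_rate / DRUM_GALLONS


for hours in (12, 24):
    print(f"{hours}h estimate: about {drums_per_hour(hours):.1f} drums per hour")
```

Under those assumptions, keeping the generators fed by hand would mean somewhere between roughly four and eight drums an hour going up the stairs.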
For me, this was one of the first major experiences of the cloud going down. There have been instances in the past, but I was less connected to them and less aware of them. Of course, a single data centre is not the cloud, and the whole idea of the cloud is that you're not dependent on any single data centre, so that events like this have no impact on service or data. However, it's amazing how many people still rely on a single data centre today and don't necessarily consider it a risk to do so.
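As a minimal sketch of that idea, the Python below picks the first healthy data centre from a list, so losing one site does not take the service down. The site names and the health check are hypothetical, and this is not a description of how Trello or Peer1 actually work.

```python
from typing import Callable, Optional, Sequence


def pick_data_centre(
    centres: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first data centre that passes its health check, or None."""
    for centre in centres:
        if is_healthy(centre):
            return centre
    return None


# Hypothetical site names, listed in order of preference.
CENTRES = ["new-york", "london", "toronto"]

# Pretend the New York site has been shut down.
offline = {"new-york"}
active = pick_data_centre(CENTRES, lambda centre: centre not in offline)
print(active)  # london
```

A real setup would also need the data itself replicated to each site, which is the harder part; the sketch only shows the routing decision.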
I believe that Trello have now moved to a true cloud solution where their data is spread across multiple data centres so that they’re not vulnerable to this type of situation again.
From my point of view, it's amazing how something on the other side of the world can have an immediate impact on people across the globe; such is the interconnected world that we live in today. My thoughts are with all those affected, including a good friend currently holed up in a New York hotel room, unable to get a flight out.
Some photos of the aftermath from the hurricane via The Atlantic.
Read updates from the Trello team here.