Learning the Right Lessons from the Amazon Crash

Have you noticed that ZapThink’s crystal ball has been working overtime? We sounded the warnings about Cyberwarfare mere days before the Stuxnet worm hit. Then we predicted the fall of Enterprise Architecture frameworks right before the Zachman organization imploded. Next, we heralded a secondary market for IP addresses as the IPv4 address space ran out, and that secondary market is now here. And last week, we warned against putting all your eggs in any one Cloud provider’s basket. Sure enough, Amazon’s public cloud went belly up immediately afterward. All I can say is that if we make a prediction that will impact your business, you’d better take heed!

In all seriousness, there’s no supernatural clairvoyance at work here. What you’re seeing is the power of the ZapThink 2020 vision for Enterprise IT, which delineates the interrelationships among the numerous trends in the IT marketplace. Just as the best psychics are in reality masters at picking up subtle clues in human behavior, we’re tuning into the complex subtleties that the multiple forces of change in our midst present to us. One of the primary insights of the ZapThink 2020 vision is that individual trends, let alone single events, should never be taken in isolation. This insight is particularly useful when a crisis like the Amazon crash presents itself.

Right now, we’re experiencing a backlash from the crash. People are reconsidering the wisdom of moving to the Cloud, and to public Clouds in particular. Perhaps the large infrastructure vendors who were warning their customers about the security and reliability issues with public Clouds, in order to sell more gear to build private Clouds, were right after all?

Not so fast. If we place the Amazon crash into its proper context, we are in a better position to learn the right lessons from this crisis, rather than reacting out of fear to an event taken out of that context. Here, then, are some essential lessons we should take away from the crash:

  • There is no such thing as 100% reliability. In fact, there’s nothing 100% about any of IT: no code is 100% bug free, no system is 100% crashproof, and no security is 100% impenetrable. Just because Amazon came up snake eyes on this throw of the dice doesn’t mean that public Clouds are any less reliable than they were before the crisis. Whether investing in the stock market or building a high availability IT infrastructure, the best way to lower risk is to diversify. You got eggs? The more baskets the better.
  • This particular crisis is unlikely to happen ever again. We can safely assume that Amazon has some wicked smart Cloud experts, and that they had already built a Cloud architecture that could withstand most challenges. Suffice it to say, therefore, that the latest crisis had an unusual and complex set of causes. It also goes without saying that those experts are working feverishly to root out those causes, so that this particular set of circumstances won’t happen again.
  • Unknown unknowns are by definition unpredictable. Even though the particular sequence of events that led to the current crisis is unlikely to happen again, the chance that other, entirely unpredictable issues will arise in the future is relatively high. But such issues might very well apply to private, hybrid, or community Clouds just as much as they might impact the public Cloud again. In other words, bailing on public Clouds to take refuge in the supposedly safer private Cloud arena is an exercise in futility.
  • The most important lesson for Amazon to learn is more about visibility than reliability. The weakest part of Amazon’s cloud offerings is the lack of visibility they provide their customers. This “never mind the man behind the curtain” attitude is part of how Amazon supports the Cloud abstraction I discussed in the previous ZapFlash. But now it’s working against them and their customers. For Amazon to build on its success, it must open the kimono a bit and provide its customers a level of management visibility into its internal infrastructure that it’s been uncomfortable delivering to this point.
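
To put rough numbers on the eggs-and-baskets point above, here is a minimal sketch of the arithmetic, not a model of any real provider. It assumes each provider fails independently of the others, which is a strong assumption (shared software bugs, shared networks, or correlated load can break it), and the 99.9% availability figure is purely illustrative, not a number published by Amazon or anyone else. The function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # ignoring leap years for simplicity


def combined_availability(availabilities):
    """Probability that at least one provider is up.

    Assumes provider failures are statistically independent, which
    real-world outages do not always respect.
    """
    p_all_down = 1.0
    for a in availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        # All providers are down only if each one is down.
        p_all_down *= 1.0 - a
    return 1.0 - p_all_down


def expected_downtime_hours(availability):
    """Expected hours per year with no provider available."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    illustrative = 0.999  # assumed per-provider availability
    for baskets in (1, 2, 3):
        a = combined_availability([illustrative] * baskets)
        print(baskets, "provider(s):", a,
              "availability,", expected_downtime_hours(a), "hours/year down")
```

Under these assumptions, one provider at 99.9% implies about 8.76 hours of downtime per year, while two independent providers at 99.9% each give 99.9999% combined. Adding baskets pushes the figure closer to 100% without ever reaching it, which is the first lesson in a nutshell.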

The ZapThink Take

Abstractions hide complexity from consumers of technology, but if you do too good a job of hiding the underlying complexity, the abstraction can backfire. That doesn’t mean that abstractions are bad; rather, you need different abstractions for different audiences.

The latest crisis impacted a wide swath of small Cloud-based vendors, from Foursquare to DigitalChalk to EDU 2.0. These firms’ customers simply wanted their tools to work, and were disappointed and inconvenienced when they stopped working. But those end users may not even have been aware that Amazon’s cloud was behind their tool of choice. Clearly, they wouldn’t find better visibility into the Cloud particularly useful.

No, it’s the technology departments at the small vendors that require better visibility. They are the people who require management tools that enable them to gain a greater level of control over the Cloud environments they leverage in their own products. Once Amazon supports such management tools, then Amazon’s customers will be better able to provide the seamless abstraction to the Cloud end user, who simply wants stuff to work properly. And there’s nothing supernatural about that!