AWS Architecture Blog
Tag: Fault Isolation
A Case Study in Global Fault Isolation
In a previous blog post, we talked about using shuffle sharding to get magical fault isolation. Today, we’ll examine a specific use case that Route 53 employs and one of the interesting tradeoffs we decided to make as part of our sharding. Then, we’ll discuss how you can employ some of these concepts in your […]
Organizing Software Deployments to Match Failure Conditions
Deploying new software into production will always carry some amount of risk, and failed deployments (e.g., software bugs, misconfigurations, etc.) will occasionally occur. As a service owner, the goal is to try and reduce the number of these incidents and to limit customer impact when they do occur. One method to reduce potential impact is […]
Shuffle Sharding: Massive and Magical Fault Isolation
A standard deck of cards has 52 different playing cards and 2 jokers. If we shuffle a deck thoroughly and deal out a four card hand, there are over 300,000 different hands. Another way to say the same thing is that if we put the cards back, shuffle and deal again, the odds are worse […]
AWS and Compartmentalization
Practically every experienced driver has suffered a flat tire. It’s a real nuisance, you pull over, empty the trunk to get out your spare wheel, jack up the car and replace the puncture before driving yourself to a nearby repair shop. For a car that’s ok, we can tolerate the occasional nuisance, and as drivers […]