AWS Fault Injection Service (FIS) is a fully managed service for running fault injection experiments. It makes it easier for teams to discover an application’s weaknesses at scale in order to improve performance, observability, and resilience. The list of supported fault injection actions is available in the FIS documentation.
Simple setup
AWS Fault Injection Service makes it easy to get started building and running fault injection experiments, without needing to install any agents. Fully managed actions cover faults such as stopping an instance, throttling an API, and failing over a database. Fault Injection Service integrates with Amazon CloudWatch so that you can use your existing metrics to monitor Fault Injection Service experiments.
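To make the idea of a managed action more concrete, here is a minimal sketch that builds an experiment template as a plain Python dictionary, in the shape the FIS CreateExperimentTemplate API is understood to accept. The helper name, the role ARN, and the tag values are placeholders chosen for illustration. The action ID aws:ec2:stop-instances and its startInstancesAfterDuration parameter come from the FIS actions reference and should be checked against the current documentation before use.

```python
import json


def build_stop_instances_template(role_arn: str, tag_key: str, tag_value: str) -> dict:
    """Return an experiment template that stops one tagged EC2 instance.

    The function name and arguments are illustrative; they are not part of
    any AWS SDK.
    """
    return {
        "description": "Stop one tagged EC2 instance and restart it after 5 minutes",
        "roleArn": role_arn,  # IAM role that FIS assumes to carry out the action
        "targets": {
            "oneInstance": {
                "resourceType": "aws:ec2:instance",
                "resourceTags": {tag_key: tag_value},
                "selectionMode": "COUNT(1)",  # pick a single matching instance
            }
        },
        "actions": {
            "stopInstance": {
                "actionId": "aws:ec2:stop-instances",
                # ISO 8601 duration: restart the instance after five minutes
                "parameters": {"startInstancesAfterDuration": "PT5M"},
                # Map the action's "Instances" target key to the target above
                "targets": {"Instances": "oneInstance"},
            }
        },
        # "none" means no stop condition; a CloudWatch alarm is the safer choice
        # outside of a throwaway test environment.
        "stopConditions": [{"source": "none"}],
    }


if __name__ == "__main__":
    template = build_stop_instances_template(
        "arn:aws:iam::111122223333:role/ExampleFisRole",  # placeholder ARN
        "environment",
        "test",
    )
    print(json.dumps(template, indent=2))
```

With boto3 installed and credentials configured, a dictionary like this could be passed to the fis client as create_experiment_template(**template). That step is left out here so the sketch runs without an AWS account.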
Run real-world scenarios
Scenarios define events or conditions that you can apply to test the resilience of your applications, such as an AZ power interruption or a cross-Region connectivity interruption. Scenarios are created and owned by AWS, and minimize undifferentiated heavy lifting by providing you with pre-defined targets and fault actions (e.g., gradually increase CPU load from 90% to 100% for Amazon EC2 instances) for possible application impairments.
Scenarios are provided through the Scenario Library in the FIS console, and are run using an FIS experiment template. To run an experiment using a scenario, select the scenario from the library, copy it to your experiment template, and specify your application details. Each scenario includes a detailed description and suggested metrics to measure the response of your application during the experiment, helping you improve the resilience posture of your applications over time. The list of supported scenarios is available in the FIS documentation.
Fine-grained safety controls
When running experiments in live environments, there’s a risk of unintended impact. To provide guardrails and keep your fault injection experiments under control, AWS Fault Injection Service allows you to target resources by environment, application, and other dimensions using tags. For example, you could increase CPU utilization on 10% of your instances with the tag “environment”:“prod”. Fault Injection Service also has the option to set rules based on Amazon CloudWatch Alarms or other tools to stop an experiment. For example, an experiment can be set to stop before completion if a web page response time rises above an acceptable level.
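The two guardrails described above, tag-based targeting of a percentage of instances and an alarm-based stop condition, can be sketched as fragments of an experiment template. This is an illustrative sketch only: the helper names and the alarm ARN are hypothetical, while the PERCENT(n) selection mode and the aws:cloudwatch:alarm stop condition source follow the FIS experiment template format as documented.

```python
def percent_target(tag_key: str, tag_value: str, percent: int) -> dict:
    """Build a target that selects a percentage of EC2 instances carrying a tag.

    Hypothetical helper; the returned dictionary mirrors one entry of the
    "targets" section of an FIS experiment template.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    return {
        "resourceType": "aws:ec2:instance",
        "resourceTags": {tag_key: tag_value},
        "selectionMode": f"PERCENT({percent})",
    }


def alarm_stop_condition(alarm_arn: str) -> dict:
    """Build a stop condition that halts the experiment when the alarm fires."""
    return {"source": "aws:cloudwatch:alarm", "value": alarm_arn}


# 10% of the instances tagged environment=prod, as in the example above.
prod_target = percent_target("environment", "prod", 10)

# Placeholder ARN for an alarm on web page response time.
stop_condition = alarm_stop_condition(
    "arn:aws:cloudwatch:us-east-1:111122223333:alarm:ExampleResponseTimeAlarm"
)
```

Keeping the percentage check in the helper means a typo such as 1000 is rejected before a template is ever submitted.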
Integrated security model
AWS Fault Injection Service is integrated with AWS Identity and Access Management (IAM) so that you can control which users and resources have permission to access and run Fault Injection Service experiments, and which resources and services can be affected.
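As a hedged illustration of how IAM can scope both sides of an experiment, the sketch below builds two policy documents as Python dictionaries: a trust policy that lets the FIS service assume an experiment role, and an identity policy that lets a user start experiments from one specific template. The function names, Region, account ID, and template ID are placeholders, and the resource ARN formats are assumptions that should be verified against the IAM documentation for FIS.

```python
import json


def fis_role_trust_policy() -> dict:
    """Trust policy allowing the FIS service principal to assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "fis.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def start_experiment_policy(region: str, account_id: str, template_id: str) -> dict:
    """Identity policy limiting a user to starting one experiment template.

    The ARN layout below is an assumption for illustration.
    """
    prefix = f"arn:aws:fis:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "fis:StartExperiment",
                "Resource": [
                    f"{prefix}:experiment-template/{template_id}",
                    f"{prefix}:experiment/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(fis_role_trust_policy(), indent=2))
    print(json.dumps(start_experiment_policy("us-east-1", "111122223333", "EXT-example"), indent=2))
```

The permissions attached to the experiment role itself, not shown here, are what bound the resources an experiment can affect.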
Visibility throughout an experiment
AWS Fault Injection Service provides visibility throughout every stage of an experiment via the console and APIs. While an experiment is running, you can observe which actions have executed. After an experiment has completed, you can see details on which actions were run, whether stop conditions were triggered, how metrics compared to your expected steady state, and more. To support accurate operational metrics and effective troubleshooting, you can also identify which resources and APIs are affected by a Fault Injection Service experiment.
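For the API side of this visibility, here is a small sketch that condenses a GetExperiment-style response into the overall status and a per-action status map. The helper name is hypothetical, the sample response is invented for illustration, and the field layout is an assumption based on the documented shape of the GetExperiment response.

```python
def summarize_experiment(response: dict) -> dict:
    """Reduce a GetExperiment-style response to the fields most often checked.

    Hypothetical helper; assumes the response nests its data under "experiment".
    """
    experiment = response["experiment"]
    return {
        "id": experiment["id"],
        "status": experiment["state"]["status"],
        "reason": experiment["state"].get("reason", ""),
        "actions": {
            name: action.get("state", {}).get("status", "unknown")
            for name, action in experiment.get("actions", {}).items()
        },
    }


# Invented sample data, shaped like a response for a stopped experiment.
sample_response = {
    "experiment": {
        "id": "EXP-example",
        "state": {"status": "stopped", "reason": "Experiment halted by stop condition."},
        "actions": {
            "stressCpu": {"state": {"status": "stopped"}},
            "stopInstance": {"state": {"status": "pending"}},
        },
    }
}

summary = summarize_experiment(sample_response)
```

In practice the input would come from a call such as boto3's fis client get_experiment(id=...). Working on a plain dictionary keeps the sketch runnable offline.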
Console and programmatic access
You can use AWS Fault Injection Service with the AWS Management Console, AWS CLI, and AWS SDKs. The Fault Injection Service APIs allow you to programmatically access the service so that you can integrate fault injection testing into your continuous integration and continuous delivery (CI/CD) pipelines and custom tooling.
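One way a pipeline step might use the APIs is to start an experiment from an existing template and wait for it to reach a final state. The sketch below assumes a client object that exposes start_experiment and get_experiment with the same keyword arguments as boto3's fis client. The function name, polling interval, timeout, and the set of terminal states are illustrative choices rather than anything prescribed by the service.

```python
import time

# Statuses treated as final; assumed from the documented experiment states.
TERMINAL_STATES = {"completed", "stopped", "failed", "cancelled"}


def run_experiment(client, template_id: str,
                   poll_seconds: float = 15.0,
                   timeout_seconds: float = 1800.0) -> str:
    """Start an experiment and block until it reaches a terminal status.

    `client` is expected to behave like boto3.client("fis"); passing it in
    makes the function easy to exercise with a stub.
    """
    started = client.start_experiment(experimentTemplateId=template_id)
    experiment_id = started["experiment"]["id"]
    deadline = time.monotonic() + timeout_seconds

    while True:
        response = client.get_experiment(id=experiment_id)
        status = response["experiment"]["state"]["status"]
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"experiment {experiment_id} still {status} after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
```

A pipeline step could call run_experiment(boto3.client("fis"), template_id) and fail the build unless the returned status is "completed". boto3 is a third-party package and is intentionally not imported in the sketch itself.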