AWS Fault Injection Service Documentation

AWS Fault Injection Service (FIS) lets teams run fault injection experiments to discover an application’s weaknesses and improve its performance, observability, and resiliency.

Setup

AWS Fault Injection Service makes it easier to build and run fault injection experiments without installing agents. Each experiment is built from fault injection actions, such as stopping an instance, throttling an API, or failing over a database. Fault Injection Service is designed to work with Amazon CloudWatch so that you can use your existing metrics to monitor experiments.
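As a rough illustration of how an action is described, the sketch below builds the request body for an experiment template as a plain Python dictionary. The field names are intended to follow the FIS CreateExperimentTemplate API and should be checked against the API reference. The helper name `build_stop_instances_template`, the role ARN, and the tag values are hypothetical placeholders, and the stop condition is set to `none` only to keep the example short.

```python
import json


def build_stop_instances_template(role_arn, tag_key, tag_value, restart_after="PT5M"):
    """Return a request body for a template that stops one tagged EC2 instance.

    Hypothetical helper: it only builds a dictionary and does not call AWS.
    """
    return {
        "description": "Stop a tagged EC2 instance, then restart it",
        "roleArn": role_arn,  # IAM role that FIS assumes to run the actions
        "targets": {
            "taggedInstances": {
                "resourceType": "aws:ec2:instance",
                "resourceTags": {tag_key: tag_value},
                "selectionMode": "COUNT(1)",  # affect a single matching instance
            }
        },
        "actions": {
            "stopInstances": {
                "actionId": "aws:ec2:stop-instances",
                # ISO 8601 duration after which the instances are started again
                "parameters": {"startInstancesAfterDuration": restart_after},
                # Maps the action's target slot to the target defined above
                "targets": {"Instances": "taggedInstances"},
            }
        },
        # No automatic stop condition in this sketch
        "stopConditions": [{"source": "none"}],
    }


if __name__ == "__main__":
    # Placeholder account ID, role name, and tag; replace with your own values.
    template = build_stop_instances_template(
        "arn:aws:iam::111122223333:role/ExampleFisRole", "Environment", "test"
    )
    print(json.dumps(template, indent=2))
```

A dictionary of this shape could be passed as keyword arguments to an SDK call that creates the template, or saved as JSON for use with the AWS CLI.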

Run real-world scenarios

Scenarios define events or conditions that you can apply to test the resilience of your applications, such as an AZ power interruption or cross-region connectivity interruption. Scenarios are created and owned by AWS, and minimize undifferentiated heavy lifting by providing you with pre-defined targets and fault actions (e.g., gradually increase CPU load from 90% to 100% for Amazon EC2 instances) for possible application impairments.

Scenarios are provided through the Scenario Library in the FIS console and are run using an FIS experiment template. To run an experiment using a scenario, select the scenario from the library, copy it to your experiment template, and specify your application details. Each scenario includes a detailed description and suggested metrics to measure the response of your application during the experiment, helping you improve the resilience posture of your applications over time.

Safety controls

AWS Fault Injection Service is designed to help you target experiments by environment, application, and other dimensions using tags, which act as guardrails and help keep your fault injection experiments under control. You can also set rules based on Amazon CloudWatch alarms or other tools to stop an experiment.
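One way a team might enforce these guardrails before submitting a template is a small local check like the sketch below. It is not part of FIS: `alarm_stop_conditions` and `find_guardrail_gaps` are hypothetical helpers, and the rule that every target must be scoped by tags is an assumed team policy. The `aws:cloudwatch:alarm` source value reflects the stop condition format used in experiment templates.

```python
def alarm_stop_conditions(alarm_arns):
    """Build one stop condition per CloudWatch alarm ARN."""
    return [{"source": "aws:cloudwatch:alarm", "value": arn} for arn in alarm_arns]


def find_guardrail_gaps(template):
    """Return a list of human-readable problems found in a template dictionary.

    Assumed policy (not an FIS requirement): every target is scoped by
    tags, and at least one CloudWatch alarm can stop the experiment.
    """
    gaps = []
    for name, target in template.get("targets", {}).items():
        if not target.get("resourceTags"):
            gaps.append(f"target '{name}' is not scoped by tags")
    has_alarm = any(
        condition.get("source") == "aws:cloudwatch:alarm"
        for condition in template.get("stopConditions", [])
    )
    if not has_alarm:
        gaps.append("no CloudWatch alarm stop condition")
    return gaps


if __name__ == "__main__":
    # Placeholder template with an untagged target and no alarm.
    risky = {
        "targets": {"allInstances": {"resourceType": "aws:ec2:instance"}},
        "stopConditions": [{"source": "none"}],
    }
    for gap in find_guardrail_gaps(risky):
        print(gap)
```

Running a check like this in code review or a pipeline step keeps broadly scoped experiments from being created by accident.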

Security model

AWS Fault Injection Service is integrated with AWS Identity and Access Management (IAM), which helps you control which users and resources have permission to access and run Fault Injection Service experiments, and which resources and services can be affected.
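To make the permission model more concrete, the following sketch assembles an identity-based policy document that lets a user start experiments from a single template and then observe or stop them. Treat it as a starting point under stated assumptions: the helper name, Region, account ID, and template ID are hypothetical, and the action names and ARN formats should be verified against the IAM reference for FIS before use.

```python
import json


def build_experiment_runner_policy(region, account_id, template_id):
    """Return an IAM policy document limited to one experiment template.

    Hypothetical helper: it only builds a dictionary and does not call AWS.
    """
    base = f"arn:aws:fis:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "StartFromOneTemplate",
                "Effect": "Allow",
                "Action": "fis:StartExperiment",
                # Starting an experiment involves both the template
                # and the experiment resource that gets created.
                "Resource": [
                    f"{base}:experiment-template/{template_id}",
                    f"{base}:experiment/*",
                ],
            },
            {
                "Sid": "ObserveAndStopExperiments",
                "Effect": "Allow",
                "Action": ["fis:GetExperiment", "fis:StopExperiment"],
                "Resource": f"{base}:experiment/*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder Region, account ID, and template ID.
    policy = build_experiment_runner_policy("us-east-1", "111122223333", "EXT-EXAMPLE")
    print(json.dumps(policy, indent=2))
```

This policy covers only who may run experiments. Which resources an experiment can affect is governed separately by the IAM role that the experiment itself uses.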

Visibility throughout an experiment

AWS Fault Injection Service is designed to provide visibility throughout the stages of an experiment via the console and APIs. While an experiment is running, AWS Fault Injection Service helps you observe which actions have executed. After an experiment has completed, you can see details on which actions were run, whether stop conditions were triggered, how metrics compared to your expected steady state, and more. To support accurate operational metrics and effective troubleshooting, you can also identify which resources and APIs are affected by a Fault Injection Service experiment.
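The per-action detail described above can be condensed programmatically. The sketch below takes a dictionary shaped like a GetExperiment API response and reduces it to the overall status plus one status per action. The helper name `summarize_experiment` is hypothetical, the response layout is an assumption to confirm against the API reference, and the sample input is hand-written for illustration rather than captured from a real experiment.

```python
def summarize_experiment(response):
    """Condense a GetExperiment-style response into a small summary dict.

    Assumes the layout response["experiment"]["state"] for the experiment
    and response["experiment"]["actions"][name]["state"] for each action.
    """
    experiment = response["experiment"]
    state = experiment.get("state", {})
    action_statuses = {
        name: action.get("state", {}).get("status", "unknown")
        for name, action in experiment.get("actions", {}).items()
    }
    return {
        "status": state.get("status"),
        "reason": state.get("reason"),
        "actions": action_statuses,
    }


if __name__ == "__main__":
    # Hand-written sample input, not real service output.
    sample = {
        "experiment": {
            "id": "EXP-EXAMPLE",
            "state": {"status": "stopped", "reason": "Example reason text"},
            "actions": {
                "stopInstances": {"state": {"status": "completed"}},
                "injectLatency": {"state": {"status": "cancelled"}},
            },
        }
    }
    print(summarize_experiment(sample))
```

A summary like this is convenient to attach to a test report or incident note alongside the metrics observed during the experiment.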

Console and programmatic access

You can use AWS Fault Injection Service with the AWS Management Console, AWS CLI, and AWS SDKs. The Fault Injection Service APIs let you integrate fault injection testing into your continuous integration and continuous delivery (CI/CD) pipelines and custom tooling.
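A pipeline step built on the SDK might look like the sketch below, which starts an experiment from an existing template and polls until it reaches a final state. The client is passed in as a parameter so the function can be exercised without AWS access. With the AWS SDK for Python it would be `boto3.client("fis")`, whose `start_experiment`, `get_experiment`, and `stop_experiment` methods are used here. The function name, polling interval, timeout, and set of final statuses are assumptions to adjust.

```python
import time
import uuid

# Statuses treated as final in this sketch (an assumption; check the API reference).
FINAL_STATUSES = {"completed", "stopped", "failed", "cancelled"}


def run_experiment(fis_client, template_id, poll_seconds=15, timeout_seconds=1800):
    """Start an experiment and return its final status.

    Stops the experiment and raises TimeoutError if it is still
    running once timeout_seconds has elapsed.
    """
    started = fis_client.start_experiment(
        clientToken=str(uuid.uuid4()),  # idempotency token for the request
        experimentTemplateId=template_id,
    )
    experiment_id = started["experiment"]["id"]
    deadline = time.monotonic() + timeout_seconds

    while True:
        response = fis_client.get_experiment(id=experiment_id)
        status = response["experiment"]["state"]["status"]
        if status in FINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            # Ask FIS to stop the experiment rather than leave it running.
            fis_client.stop_experiment(id=experiment_id)
            raise TimeoutError(f"experiment {experiment_id} did not finish in time")
        time.sleep(poll_seconds)
```

In a pipeline, the step could call `run_experiment(boto3.client("fis"), template_id)` and fail the build when the returned status is anything other than `completed`.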

Additional Information

For additional information about service controls, security features and functionalities, including, as applicable, information about storing, retrieving, modifying, restricting, and deleting data, please see https://docs.aws.amazon.com/index.html. This additional information does not form part of the Documentation for purposes of the AWS Customer Agreement available at http://aws.amazon.com/agreement, or other agreement between you and AWS governing your use of AWS’s services.