My EC2 Linux instance failed its system status check. How do I troubleshoot this?

Last updated: 2020-06-02

My Amazon Elastic Compute Cloud (Amazon EC2) instance failed its system status check and is no longer accessible. How do I troubleshoot system status check failures?

Short Description

A system status check failure indicates a problem with the AWS infrastructure that hosts your EC2 instance, such as the hardware, power, or network connectivity of the underlying physical host.
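To confirm that the system status check (rather than the instance status check) is the one failing, you can query the status programmatically. The following is a minimal sketch, assuming the AWS SDK for Python (boto3). The function names are illustrative, and the EC2 client is passed in as a parameter, so nothing here runs against your account until you supply one.

```python
def get_system_status(ec2_client, instance_id):
    """Return the system status check result for one instance.

    Typical values are 'ok', 'impaired', 'initializing', and
    'insufficient-data'. Returns None if the instance is not reported.
    """
    response = ec2_client.describe_instance_status(
        InstanceIds=[instance_id],
        IncludeAllInstances=True,  # also report instances that are not running
    )
    statuses = response.get("InstanceStatuses", [])
    if not statuses:
        return None
    return statuses[0]["SystemStatus"]["Status"]


def system_check_failed(ec2_client, instance_id):
    """True when the system status check is reported as 'impaired'."""
    return get_system_status(ec2_client, instance_id) == "impaired"
```

With boto3 installed and credentials configured, this could be called as system_check_failed(boto3.client("ec2"), "i-0123456789abcdef0"), where the instance ID is a placeholder.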

Resolution

To resolve the failure, the instance must be migrated to a new, healthy host, which happens when the instance is stopped and then started. You can either wait for Amazon EC2 to stop and start the instance, or stop and start the instance manually.

Note: A stop and start isn't equivalent to a reboot. A rebooted instance stays on the same host. A stop followed by a start is required to migrate the instance to healthy hardware.
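A manual stop and start can also be scripted. The sketch below assumes boto3 and an Amazon EBS-backed instance. The function name is illustrative, and the EC2 client is passed in by the caller. Review the warning that follows before running anything like this against a real instance.

```python
def stop_and_start(ec2_client, instance_id):
    """Stop the instance, wait until it is stopped, then start it again.

    A reboot is deliberately not used here, because a reboot does not
    move the instance to a different host.
    """
    ec2_client.stop_instances(InstanceIds=[instance_id])
    # Block until EC2 reports the instance as stopped.
    ec2_client.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2_client.start_instances(InstanceIds=[instance_id])
    # Block until EC2 reports the instance as running.
    ec2_client.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```

The waiters poll on a fixed schedule and raise an error if the instance doesn't reach the target state in time, so a caller might want to catch that error and investigate a stuck instance.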

Warning: Before stopping and starting your instance, be sure you understand the following:

  • Data on instance store volumes is lost when you stop and start an instance. This affects instances that are instance store-backed or that have instance store volumes containing data. For more information, see Determining the root device type of your instance.
  • If your instance is part of an Amazon EC2 Auto Scaling group, stopping the instance might terminate the instance. If you launched the instance with Amazon EMR, AWS CloudFormation, or AWS Elastic Beanstalk, your instance might be part of an Auto Scaling group. Instance termination in this scenario depends on the instance scale-in protection settings for your Auto Scaling group. If your instance is part of an Auto Scaling group, then temporarily remove the instance from the Auto Scaling group before starting the resolution steps.
  • Stopping and starting the instance changes its auto-assigned public IP address. It's a best practice to use an Elastic IP address, which stays associated with the instance across a stop and start, instead of an auto-assigned public IP address when routing external traffic to your instance. If you use Amazon Route 53, you might have to update the Route 53 DNS records when the public IP address changes.
  • If the shutdown behavior of the instance is set to Terminate, the instance is terminated if you shut down the instance from the operating system using the shutdown or poweroff command. To avoid this, change the instance shutdown behavior.
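The checks in the list above can be gathered before you stop the instance. The following is a rough sketch, assuming boto3 EC2 and Auto Scaling clients supplied by the caller. The function names and the report keys are illustrative. The sketch inspects only the root device type, so it doesn't detect additional instance store volumes attached to an EBS-backed instance. Using standby is one assumed way to temporarily remove an instance from its Auto Scaling group.

```python
def preflight_report(ec2_client, autoscaling_client, instance_id):
    """Collect the settings that the warnings above ask you to review."""
    instance = ec2_client.describe_instances(InstanceIds=[instance_id])[
        "Reservations"][0]["Instances"][0]
    asg_entries = autoscaling_client.describe_auto_scaling_instances(
        InstanceIds=[instance_id])["AutoScalingInstances"]
    shutdown = ec2_client.describe_instance_attribute(
        InstanceId=instance_id,
        Attribute="instanceInitiatedShutdownBehavior",
    )["InstanceInitiatedShutdownBehavior"]["Value"]
    elastic_ips = ec2_client.describe_addresses(
        Filters=[{"Name": "instance-id", "Values": [instance_id]}])["Addresses"]
    group = asg_entries[0]["AutoScalingGroupName"] if asg_entries else None
    return {
        "root_device_type": instance["RootDeviceType"],  # 'ebs' or 'instance-store'
        "auto_scaling_group": group,
        "shutdown_behavior": shutdown,                   # 'stop' or 'terminate'
        "public_ip": instance.get("PublicIpAddress"),
        "has_elastic_ip": bool(elastic_ips),
    }


def stop_start_warnings(report):
    """Turn a preflight report into a list of human-readable warnings."""
    warnings = []
    if report["root_device_type"] == "instance-store":
        warnings.append("Root device is instance store: its data will be lost.")
    if report["auto_scaling_group"]:
        warnings.append("Instance belongs to Auto Scaling group "
                        + report["auto_scaling_group"] + ".")
    if report["public_ip"] and not report["has_elastic_ip"]:
        warnings.append("Public IP address is not an Elastic IP and will change.")
    if report["shutdown_behavior"] == "terminate":
        warnings.append("Shutdown behavior is 'terminate'.")
    return warnings


def move_to_standby(autoscaling_client, instance_id, group_name):
    """Put the instance in standby so the group leaves it alone for now."""
    autoscaling_client.enter_standby(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        ShouldDecrementDesiredCapacity=True,  # avoid launching a replacement
    )
```

An empty list from stop_start_warnings means none of these four conditions were detected. It doesn't guarantee that the stop and start is safe for your workload.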

In rare circumstances, an infrastructure-layer issue can prevent the underlying host from responding to the stop and start API calls. This causes the instance to be stuck in the stopping state. For instructions on how to force the instance to stop, see Troubleshooting stopping your instance.
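As an illustration of a forced stop, the sketch below checks whether the instance is still in the stopping state and, only then, repeats the stop request with the Force option. It assumes boto3 with an EC2 client passed in, and the function name is illustrative. A forced stop doesn't give the instance a chance to flush file system caches or metadata, so treat it as a last resort.

```python
def force_stop_if_stuck(ec2_client, instance_id):
    """Send a forced stop only when the instance is in the 'stopping' state.

    Returns True if a forced stop was requested, False otherwise.
    """
    reservations = ec2_client.describe_instances(
        InstanceIds=[instance_id])["Reservations"]
    state = reservations[0]["Instances"][0]["State"]["Name"]
    if state != "stopping":
        return False
    ec2_client.stop_instances(InstanceIds=[instance_id], Force=True)
    return True
```

How long to wait before deciding that an instance is stuck is a judgment call that this sketch leaves to the caller.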

You can create an Amazon CloudWatch alarm that monitors and automatically recovers the EC2 instance from issues that involve underlying hardware failure.
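One possible shape for such an alarm is sketched below, assuming boto3 with a CloudWatch client passed in. The alarm name, the two-minute evaluation window, and the function name are assumptions chosen for illustration. The alarm watches the StatusCheckFailed_System metric and triggers the EC2 recover action. Not every instance configuration supports the recover action, so confirm that yours does before relying on it.

```python
def create_recovery_alarm(cloudwatch_client, instance_id, region):
    """Create an alarm that recovers the instance when the system check fails."""
    alarm_name = "auto-recover-" + instance_id  # hypothetical naming scheme
    cloudwatch_client.put_metric_alarm(
        AlarmName=alarm_name,
        AlarmDescription="Recover the instance when the system status check fails",
        Namespace="AWS/EC2",
        MetricName="StatusCheckFailed_System",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        Statistic="Maximum",
        Period=60,             # evaluate the metric once per minute
        EvaluationPeriods=2,   # assumed: two consecutive failing minutes
        Threshold=1.0,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        # Built-in EC2 action that recovers the instance.
        AlarmActions=["arn:aws:automate:" + region + ":ec2:recover"],
    )
    return alarm_name
```

The region argument must match the Region of the instance, for example create_recovery_alarm(boto3.client("cloudwatch", region_name="us-east-1"), "i-0123456789abcdef0", "us-east-1"), where both values are placeholders.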