How do I stop AWS OpsWorks Stacks from unexpectedly restarting healthy instances?
Last updated: 2021-08-17
AWS OpsWorks Stacks restarts my Amazon Elastic Compute Cloud (Amazon EC2) instances, even if the instances pass Amazon EC2 health checks. Why is this happening and how do I stop it?
If the OpsWorks Stacks auto healing feature is activated and the service determines that an instance it manages has failed, then one of the following occurs:
- If the instance is backed by Amazon Elastic Block Store (Amazon EBS), then the OpsWorks Stacks API stops and starts the failed instance.
- If the instance is backed by an Amazon EC2 instance store, then OpsWorks Stacks terminates the instance. The instance is recreated when OpsWorks Stacks starts it again.
- If the instance is registered with an OpsWorks stack and is on-premises, then OpsWorks Stacks changes the instance's status to connection lost but doesn't restart the instance.
To prevent OpsWorks Stacks from auto healing the instances that it manages, first follow the troubleshooting steps in this article. If the problem persists, you can also turn off auto healing in your OpsWorks Stacks layer's General Settings.
For more information, see Instances unexpectedly restart in the AWS OpsWorks debugging and troubleshooting guide.
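If you decide to turn off auto healing programmatically instead of in the console, the following is a minimal sketch. It assumes that the opsworks argument behaves like a boto3 OpsWorks client and uses that client's describe_layers and update_layer calls. The function name disable_auto_healing and the stack ID that you pass in are illustrative, not part of OpsWorks Stacks.

```python
def disable_auto_healing(opsworks, stack_id):
    """Turn off auto healing on every layer of a stack that has it turned on.

    `opsworks` is assumed to behave like boto3.client("opsworks").
    Returns the IDs of the layers that were changed.
    """
    changed = []
    for layer in opsworks.describe_layers(StackId=stack_id)["Layers"]:
        # Only touch layers where auto healing is currently enabled.
        if layer.get("EnableAutoHealing"):
            opsworks.update_layer(
                LayerId=layer["LayerId"],
                EnableAutoHealing=False,
            )
            changed.append(layer["LayerId"])
    return changed
```

With boto3 installed and credentials configured, a call might look like disable_auto_healing(boto3.client("opsworks"), "your-stack-id"), where the stack ID is a placeholder for your own value.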
Verify that the Amazon EC2 instances that are managed by OpsWorks Stacks have internet access
If an Amazon EC2 instance loses its connection to the OpsWorks Stacks service, then OpsWorks Stacks treats the instance as failed.
To verify that your Amazon EC2 instances have internet access, do the following:
- Make sure that your instances have access to the internet through either an internet gateway or Network Address Translation (NAT) gateway.
- Verify that inbound HTTPS access is allowed through port 443 at the instance, security group, and network access control list (network ACL) level.
To troubleshoot NAT gateway connectivity issues, see Why can't my EC2 instances access the internet using a NAT gateway?
To troubleshoot internet gateway connectivity issues, see Why can't my Amazon EC2 instance connect to the internet using an internet gateway?
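As a quick first check from the instance itself, you can test whether an HTTPS endpoint is reachable on port 443. The following sketch uses only the Python standard library. The helper name can_reach is hypothetical, and the check covers only outbound TCP reachability from the host where it runs. It doesn't inspect security group or network ACL rules.

```python
import socket


def can_reach(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens the TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, can_reach("opsworks.us-east-1.amazonaws.com") tests the OpsWorks Stacks endpoint in one Region. The Region in that host name is an assumption, so substitute the endpoint for the Region that your stack uses.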
Verify that your application has enough memory and CPU capacity at the instance level to function when the instance is under extra load
To review your instances' metrics, follow the instructions in Monitoring stacks using Amazon CloudWatch.
To set alarms to warn you if your instance has a high load of CPU, memory, or network traffic, see Creating Amazon CloudWatch alarms.
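The following sketch shows one way to describe such an alarm in code. It assumes the AWS/OpsWorks CloudWatch namespace with its cpu_user metric and InstanceId dimension. The function name, alarm name, thresholds, and Amazon SNS topic ARN are illustrative values, so adjust them to your workload.

```python
def high_cpu_alarm_params(instance_id, sns_topic_arn, threshold=90.0):
    """Build arguments for CloudWatch put_metric_alarm for one instance.

    instance_id is the value of the InstanceId dimension that CloudWatch
    shows for the instance. The metric and namespace are assumptions.
    """
    return {
        "AlarmName": f"opsworks-high-cpu-{instance_id}",
        "Namespace": "AWS/OpsWorks",
        "MetricName": "cpu_user",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,             # seconds per data point
        "EvaluationPeriods": 3,    # 3 x 5 minutes above the threshold
        "Threshold": threshold,    # percent of CPU time
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }
```

With boto3 installed, you could pass the result to a CloudWatch client, for example boto3.client("cloudwatch").put_metric_alarm(**params).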
Verify that the Amazon EC2 instance wasn't stopped outside of the OpsWorks Stacks console or the OpsWorks Stacks API
Note: If you receive errors when running AWS Command Line Interface (AWS CLI) commands, make sure that you’re using the most recent AWS CLI version.
If an OpsWorks Stacks-managed instance is stopped in the Amazon EC2 console, then OpsWorks Stacks stops receiving the keepalive signal from the OpsWorks agent. OpsWorks Stacks then treats the instance as failed.
To verify whether your instance was stopped in the Amazon EC2 console, try stopping the instance in the OpsWorks Stacks console. If the instance enters the stop_failed state and you receive an Internal Error message, then the instance was stopped in the Amazon EC2 console.
To stop an instance in OpsWorks Stacks after it has been stopped in the Amazon EC2 console, run the AWS CLI stop-instance command.
Important: The stop-instance command must include the --force parameter for this use case.
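In the AWS CLI, the forced stop takes the form aws opsworks stop-instance --instance-id followed by the OpsWorks instance ID and --force. The following sketch does the same through an SDK. It assumes that the opsworks argument behaves like a boto3 OpsWorks client, and the function name force_stop is hypothetical. Note that the ID is the OpsWorks Stacks instance ID, not the Amazon EC2 instance ID.

```python
def force_stop(opsworks, opsworks_instance_id):
    """Force OpsWorks Stacks to stop an instance and return its status.

    `opsworks` is assumed to behave like boto3.client("opsworks").
    """
    # Force=True corresponds to the --force parameter of the CLI command.
    opsworks.stop_instance(InstanceId=opsworks_instance_id, Force=True)
    response = opsworks.describe_instances(InstanceIds=[opsworks_instance_id])
    return response["Instances"][0]["Status"]
```

The returned status lets you confirm that the instance is no longer reported as stop_failed.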
Verify that the Amazon EC2 instance uses Instance Metadata Service Version 1 (IMDSv1)
OpsWorks Stacks supports IMDSv1 only, not IMDSv2. If an OpsWorks Stacks-managed instance uses IMDSv2, then OpsWorks Stacks treats the instance as failed.
To check which metadata service version your instance uses and to reconfigure the instance if needed, see Configure the instance metadata options.
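The following sketch shows how the check and the reconfiguration might look in code. It assumes that the ec2 argument behaves like a boto3 Amazon EC2 client and relies on the HttpTokens metadata option, where required means that only IMDSv2 is accepted and optional means that IMDSv1 also works. The function name allow_imdsv1 is hypothetical. Here the ID is the Amazon EC2 instance ID.

```python
def allow_imdsv1(ec2, ec2_instance_id):
    """Make session tokens optional if the instance currently requires IMDSv2.

    `ec2` is assumed to behave like boto3.client("ec2").
    Returns True if the instance was reconfigured, False if no change was needed.
    """
    reservations = ec2.describe_instances(InstanceIds=[ec2_instance_id])["Reservations"]
    options = reservations[0]["Instances"][0].get("MetadataOptions", {})
    if options.get("HttpTokens") != "required":
        return False  # IMDSv1 requests are already accepted
    ec2.modify_instance_metadata_options(
        InstanceId=ec2_instance_id,
        HttpTokens="optional",
    )
    return True
```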