How can I troubleshoot a Systems Manager managed instance in Connection Lost status?
Last updated: 2021-03-19
My Amazon Elastic Compute Cloud (Amazon EC2) managed instance is in Connection Lost status under Managed Instances in the AWS Systems Manager console.
A managed instance is an Amazon EC2 instance that is configured for use with Systems Manager. Managed instances can use Systems Manager services such as Run Command, Patch Manager, and Session Manager.
To appear as a managed instance in Online status, an instance must meet the following prerequisites:
- Have the AWS Systems Manager Agent (SSM Agent) installed and running.
- Have connectivity with Systems Manager endpoints using the SSM Agent.
- Have the correct AWS Identity and Access Management (IAM) role attached.
- Have connectivity to the instance metadata service.
Note: For hybrid instances, see Setting up AWS Systems Manager for hybrid environments.
Note: Be sure to select the Region that your instance is in before you begin this resolution.
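The status shown in the console can also be read from the command line, which makes it easy to re-check after each fix. The following is a minimal sketch, assuming that the AWS CLI is installed and that your credentials allow ssm:DescribeInstanceInformation. The function name ssm_ping_status and the example instance ID and Region are placeholders, not part of the original article.

```shell
# Print the Systems Manager ping status (for example, Online or ConnectionLost)
# for a single instance in a given Region.
ssm_ping_status() {
    instance_id=$1   # placeholder, e.g. i-0123456789abcdef0
    region=$2        # placeholder, e.g. us-east-1
    aws ssm describe-instance-information \
        --region "$region" \
        --filters "Key=InstanceIds,Values=$instance_id" \
        --query "InstanceInformationList[0].PingStatus" \
        --output text
}

# Example: ssm_ping_status i-0123456789abcdef0 us-east-1
```

If the call returns no status for the instance, the instance is probably not registered with Systems Manager in that Region, so confirm the Region first.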
To troubleshoot a managed instance in Connection Lost status, verify that the following prerequisites are met on the instance:
SSM Agent is installed and running on the instance
To check the status of SSM Agent, use the following commands:
Amazon Linux, RHEL 6 (or similar distributions):
$ sudo status amazon-ssm-agent
Amazon Linux 2, Ubuntu, RHEL 7 (or similar distributions):
$ sudo systemctl status amazon-ssm-agent
Ubuntu 18.04, or later systems that use snap:
$ sudo snap services amazon-ssm-agent
Windows (PowerShell):
Get-Service AmazonSSMAgent
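If you aren't sure which of the Linux commands above applies to a host, a small helper can pick one. This is a sketch under the assumption that a snap installation takes precedence, then systemd, then Upstart. The function name ssm_agent_status_cmd is hypothetical.

```shell
# Print the SSM Agent status command that matches this Linux host.
ssm_agent_status_cmd() {
    if command -v snap >/dev/null 2>&1 && snap list amazon-ssm-agent >/dev/null 2>&1; then
        # Agent was installed as a snap package
        echo "snap services amazon-ssm-agent"
    elif command -v systemctl >/dev/null 2>&1; then
        # systemd-based distribution
        echo "systemctl status amazon-ssm-agent"
    else
        # Fall back to the Upstart syntax
        echo "status amazon-ssm-agent"
    fi
}

# Example: sudo $(ssm_agent_status_cmd)
```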
Verify connectivity to Systems Manager endpoints on port 443
The best method to verify connectivity depends on your operating system.
Important: In the following command examples, replace RegionID with your AWS Region ID.
For a list of Systems Manager endpoints by Region, see AWS Systems Manager endpoints and quotas.
Note: In the following examples, the ssmmessages endpoint is required only for AWS Systems Manager Session Manager.
For EC2 Linux instances: You can use either telnet or netcat commands to verify connectivity to endpoints on port 443.
telnet ssm.RegionID.amazonaws.com 443
telnet ec2messages.RegionID.amazonaws.com 443
telnet ssmmessages.RegionID.amazonaws.com 443
nc -vz ssm.RegionID.amazonaws.com 443
nc -vz ec2messages.RegionID.amazonaws.com 443
nc -vz ssmmessages.RegionID.amazonaws.com 443
Note: Netcat doesn't come preinstalled on Amazon EC2 instances. To manually install Netcat, see Ncat on the Nmap website.
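When neither telnet nor Netcat is available, bash itself can open a TCP connection through its /dev/tcp pseudo-device. The sketch below loops over the three endpoints for a Region. It assumes that bash and the coreutils timeout command are present and that the Region uses the amazonaws.com domain suffix shown in the examples above. The function names are hypothetical.

```shell
# Print the three Systems Manager endpoint host names for a Region.
ssm_endpoints() {
    region=$1
    for svc in ssm ec2messages ssmmessages; do
        echo "$svc.$region.amazonaws.com"
    done
}

# Return 0 if a TCP connection to host:443 succeeds within 5 seconds.
can_reach_443() {
    timeout 5 bash -c "exec 3<>/dev/tcp/$1/443" 2>/dev/null
}

# Report OK or FAILED for each endpoint in the Region.
check_ssm_endpoints() {
    for host in $(ssm_endpoints "$1"); do
        if can_reach_443 "$host"; then
            echo "OK      $host"
        else
            echo "FAILED  $host"
        fi
    done
}

# Example: check_ssm_endpoints us-east-1
```

A FAILED result only shows that the TCP connection could not be opened. The cause might be DNS, routing, a security group, or a network ACL.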
For EC2 Windows instances: You can use the following Windows PowerShell commands to verify connectivity to endpoints on port 443:
Test-NetConnection ssm.RegionID.amazonaws.com -port 443
Test-NetConnection ec2messages.RegionID.amazonaws.com -port 443
Test-NetConnection ssmmessages.RegionID.amazonaws.com -port 443
For public subnets: Systems Manager endpoints are public endpoints. This means that your instance must be able to reach the internet using an internet gateway. For issues connecting to the endpoints from instances in a public subnet, confirm the following:
- The route table that your instance uses must contain a route to the internet.
- Your virtual private cloud (VPC) security groups and network access control lists (network ACLs) must be configured to allow outbound connections on port 443.
For private subnets: Your instance must be able to reach the internet using a NAT gateway. Alternatively, you can configure VPC endpoints so that instances in a private subnet can reach the Systems Manager endpoints. VPC endpoints let you access the Amazon EC2 and Systems Manager APIs privately, using private IP addresses. For more information, see How do I create VPC endpoints so that I can use Systems Manager to manage private EC2 instances without internet access?
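If you rely on VPC endpoints, it can help to list which of the three Systems Manager services still lack an endpoint in the VPC. The following is a rough sketch, assuming that the AWS CLI is installed with ec2:DescribeVpcEndpoints permission and that the service names follow the com.amazonaws.RegionID.service pattern. The function name and the example VPC ID are hypothetical.

```shell
# Print the Systems Manager service names that have no VPC endpoint in the VPC.
missing_ssm_vpc_endpoints() {
    vpc_id=$1   # placeholder, e.g. vpc-0123456789abcdef0
    region=$2   # placeholder, e.g. us-east-1
    # Text output is tab-separated, so convert it to one service name per line.
    existing=$(aws ec2 describe-vpc-endpoints \
        --region "$region" \
        --filters "Name=vpc-id,Values=$vpc_id" \
        --query "VpcEndpoints[].ServiceName" \
        --output text | tr '\t' '\n')
    for svc in ssm ec2messages ssmmessages; do
        name="com.amazonaws.$region.$svc"
        echo "$existing" | grep -qxF "$name" || echo "$name"
    done
}

# Example: missing_ssm_vpc_endpoints vpc-0123456789abcdef0 us-east-1
```

Empty output means that all three endpoints exist. This sketch doesn't check the endpoint state or the security groups attached to the endpoints, which must also allow inbound traffic on port 443 from the instance.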
Verify that the correct IAM role is attached to the instance
To make API calls to a Systems Manager endpoint, you must attach the AmazonSSMManagedInstanceCore permissions policy to the IAM role that's attached to your instance. If you're using a custom IAM policy, confirm that your custom policy includes the permissions found in AmazonSSMManagedInstanceCore. Also, make sure that the trust policy of the IAM role allows ec2.amazonaws.com to assume the role.
For more information, see Add permissions to a Systems Manager instance profile (console).
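Both checks can be made with the AWS CLI from any machine that has IAM read access. The following is a minimal sketch, assuming iam:ListAttachedRolePolicies and iam:GetRole permissions. The function names and the role name in the examples are hypothetical.

```shell
# Return 0 if the managed policy AmazonSSMManagedInstanceCore is attached to the role.
role_has_ssm_core_policy() {
    role_name=$1
    aws iam list-attached-role-policies \
        --role-name "$role_name" \
        --query "AttachedPolicies[].PolicyName" \
        --output text | tr '\t' '\n' | grep -qxF "AmazonSSMManagedInstanceCore"
}

# Print the service principals in the role's trust policy.
# The output should include ec2.amazonaws.com.
role_trust_principals() {
    aws iam get-role \
        --role-name "$1" \
        --query "Role.AssumeRolePolicyDocument.Statement[].Principal.Service" \
        --output text
}

# Example:
#   role_has_ssm_core_policy MyInstanceRole && echo "policy attached"
#   role_trust_principals MyInstanceRole
```

The first function detects only the AWS managed policy. If you grant the permissions through a custom or inline policy, review that policy manually.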
Verify connectivity to the instance metadata service
SSM Agent must communicate with the instance metadata service in order to get necessary information about the instance. Use the Netcat command to test the connection:
nc -vz 169.254.169.254 80
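The Netcat test confirms only that the port is open. To confirm that the metadata service actually answers, you can request a value with curl. The sketch below uses the session-token (IMDSv2) request flow and assumes that curl is installed on the instance. The function name imds_get is hypothetical.

```shell
# Fetch one value from the instance metadata service using a session token.
imds_get() {
    path=$1   # metadata path, e.g. instance-id
    # Request a short-lived session token first.
    token=$(curl -sf --max-time 2 -X PUT \
        -H "X-aws-ec2-metadata-token-ttl-seconds: 60" \
        "http://169.254.169.254/latest/api/token") || return 1
    # Use the token to read the requested metadata value.
    curl -sf --max-time 2 \
        -H "X-aws-ec2-metadata-token: $token" \
        "http://169.254.169.254/latest/meta-data/$path"
}

# Example: imds_get instance-id
```

A non-zero exit status or empty output suggests that something on the instance, such as a proxy, a host firewall rule, or a changed route, is blocking access to 169.254.169.254.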
If you're using a proxy on the instance, the proxy might block connectivity to the metadata URL. Confirm that you configured SSM Agent to work with the proxy. For instructions, see the SSM Agent proxy configuration topics for Linux and for Windows Server in the AWS Systems Manager documentation.
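On Linux, one common reason for blocked metadata access is a proxy setting that doesn't exclude 169.254.169.254. The sketch below assumes a systemd-based host where the proxy variables are set through the amazon-ssm-agent service environment. Other setups store them elsewhere. Both function names are hypothetical.

```shell
# Print the no_proxy value from the SSM Agent service environment (systemd only).
agent_no_proxy() {
    systemctl show amazon-ssm-agent --property=Environment \
        | sed 's/^Environment=//' \
        | tr ' ' '\n' \
        | sed -n 's/^no_proxy=//p'
}

# Return 0 if a comma-separated no_proxy list contains the metadata address.
no_proxy_covers_imds() {
    list=$(printf '%s' "$1" | tr -d ' ')
    case ",$list," in
        *,169.254.169.254,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   no_proxy_covers_imds "$(agent_no_proxy)" || echo "add 169.254.169.254 to no_proxy"
```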
If the instance status doesn't change to Online and still indicates Connection Lost, then refer to the SSM Agent logs to troubleshoot further:
Windows: The SSM Agent logs for Windows are found under %PROGRAMDATA%\Amazon\SSM\Logs.
Linux: The SSM Agent logs for Linux are found under /var/log/amazon/ssm.
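On Linux, a quick way to start is to pull the most recent warnings and errors from the log directory. The following is a minimal sketch, assuming the usual log file names amazon-ssm-agent.log and errors.log. The function name recent_ssm_errors and its default of 20 lines are hypothetical choices.

```shell
# Print the latest ERROR and WARN lines from the SSM Agent logs.
recent_ssm_errors() {
    log_dir=${1:-/var/log/amazon/ssm}
    count=${2:-20}
    for f in "$log_dir/errors.log" "$log_dir/amazon-ssm-agent.log"; do
        # Skip files that don't exist or can't be read
        [ -r "$f" ] || continue
        echo "== $f"
        grep -E 'ERROR|WARN' "$f" | tail -n "$count"
    done
}

# Example: sudo sh -c '. ./ssm-logs.sh; recent_ssm_errors'
```

The example line assumes that you saved the function in a file named ssm-logs.sh. Reading the logs typically requires root privileges.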