How can I troubleshoot a Systems Manager managed instance in Connection Lost status?

Last updated: 2021-06-21

My Amazon Elastic Compute Cloud (Amazon EC2) managed instance is in Connection Lost status under Managed Instances in the AWS Systems Manager console.

Short description

A managed instance is an Amazon EC2 instance that is configured for use with Systems Manager. Managed instances can use Systems Manager services such as Run Command, Patch Manager, and Session Manager.

To appear as a managed instance in Online status, an instance must meet the following prerequisites:

  • Have the AWS Systems Manager Agent (SSM Agent) installed and running.
  • Have connectivity with Systems Manager endpoints using the SSM Agent.
  • Have the correct AWS Identity and Access Management (IAM) role attached.
  • Have connectivity to the instance metadata service.

You can also run the AWSSupport-TroubleshootManagedInstance Systems Manager Automation document to confirm whether the instance meets the prerequisites to be listed as a managed instance. For more information, see AWSSupport-TroubleshootManagedInstance.

Note: For hybrid instances, see Setting up AWS Systems Manager for hybrid environments.

Resolution

Note: Be sure to select the Region that your instance is in before you begin this resolution.

To troubleshoot a managed instance in Connection Lost status, verify that the following prerequisites are met on the instance:

SSM Agent is installed and running on the instance

To check the status of SSM Agent, use the following commands:

Amazon Linux, RHEL 6 (or similar distributions):

$ sudo status amazon-ssm-agent

Amazon Linux 2, Ubuntu, RHEL 7 (or similar distributions):

$ sudo systemctl status amazon-ssm-agent

Ubuntu 18.04, or later systems that use snap:

$ sudo snap services amazon-ssm-agent

Windows:

Get-Service AmazonSSMAgent
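On Linux, the per-distribution commands above can be selected by a small helper. The following is a minimal sketch, not part of the original guidance: the function names ssm_detect_init and ssm_status_cmd and the labels upstart, systemd, and snap are hypothetical, and the fallback to upstart when neither snap nor systemctl is found is an assumption.

```shell
# Hypothetical helper: guess how SSM Agent is managed on this Linux host.
# Assumption: if neither a snap install nor systemctl is found, the host uses upstart.
ssm_detect_init() {
  if command -v snap >/dev/null 2>&1 && snap list amazon-ssm-agent >/dev/null 2>&1; then
    echo snap
  elif command -v systemctl >/dev/null 2>&1; then
    echo systemd
  else
    echo upstart
  fi
}

# Hypothetical helper: print the status command that matches a label.
ssm_status_cmd() {
  case "$1" in
    upstart) echo "sudo status amazon-ssm-agent" ;;            # Amazon Linux, RHEL 6
    systemd) echo "sudo systemctl status amazon-ssm-agent" ;;  # Amazon Linux 2, Ubuntu, RHEL 7
    snap)    echo "sudo snap services amazon-ssm-agent" ;;     # Ubuntu 18.04 or later (snap)
    *)       echo "unknown init system: $1" >&2; return 1 ;;
  esac
}
```

To run the selected command, you might use: eval "$(ssm_status_cmd "$(ssm_detect_init)")"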

Verify connectivity to Systems Manager endpoints on port 443

The best method to verify connectivity depends on your operating system.

Important: In the following command examples, replace RegionID with your AWS Region ID.

For a list of Systems Manager endpoints by Region, see AWS Systems Manager endpoints and quotas.

Note: In the following examples, the ssmmessages endpoint is required only for AWS Systems Manager Session Manager.

For EC2 Linux instances: You can use either telnet or netcat commands to verify connectivity to endpoints on port 443.

Telnet

telnet ssm.RegionID.amazonaws.com 443
telnet ec2messages.RegionID.amazonaws.com 443
telnet ssmmessages.RegionID.amazonaws.com 443

Example successful connection:

root@111800186:~# telnet ssm.us-east-1.amazonaws.com 443
Trying 52.46.141.158...
Connected to ssm.us-east-1.amazonaws.com.
Escape character is '^]'.

To exit from telnet, hold down the Ctrl key and press the ] key. Enter quit, and then press Enter.

Netcat

nc -vz ssm.RegionID.amazonaws.com 443
nc -vz ec2messages.RegionID.amazonaws.com 443
nc -vz ssmmessages.RegionID.amazonaws.com 443

Note: Netcat doesn't come preinstalled on Amazon EC2 instances. To manually install Netcat, see Ncat on the Nmap website.
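If neither telnet nor Netcat is installed, the same port 443 check can be approximated with bash alone. The following is a sketch under stated assumptions: it relies on bash's /dev/tcp redirection and the coreutils timeout command being available on the instance, and the function names ssm_endpoints, check_port_443, and check_ssm_endpoints are hypothetical.

```shell
# Print the three Systems Manager endpoint hostnames for a Region ID.
ssm_endpoints() {
  for service in ssm ec2messages ssmmessages; do
    echo "${service}.$1.amazonaws.com"
  done
}

# Return 0 if a TCP connection to <host>:443 opens within 5 seconds.
# Assumes bash (for /dev/tcp) and the coreutils "timeout" command.
check_port_443() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/443"' _ "$1" 2>/dev/null
}

# Check every endpoint for a Region and report one result line per host.
check_ssm_endpoints() {
  for host in $(ssm_endpoints "$1"); do
    if check_port_443 "$host"; then
      echo "OK   $host"
    else
      echo "FAIL $host"
    fi
  done
}
```

For example, check_ssm_endpoints us-east-1 prints one OK or FAIL line for each of the three endpoints. As noted earlier, a FAIL for the ssmmessages endpoint matters only if you use Session Manager.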

For EC2 Windows instances: You can use the following Windows PowerShell commands to verify connectivity to endpoints on port 443:

Test-NetConnection ssm.RegionID.amazonaws.com -port 443
Test-NetConnection ec2messages.RegionID.amazonaws.com -port 443
Test-NetConnection ssmmessages.RegionID.amazonaws.com -port 443

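For public subnets: Systems Manager endpoints are public endpoints. This means that your instance must be able to reach the internet using an internet gateway. For issues connecting to the endpoints from instances in a public subnet, confirm that the subnet's route table routes internet traffic to an internet gateway, and that your security groups and network access control lists (network ACLs) allow outbound connections on port 443.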

For private subnets: Your instance must be able to reach the internet using a NAT gateway. Or, you can configure VPC endpoints so that instances in a private subnet can reach the Systems Manager endpoints. VPC endpoints allow you to privately access Amazon EC2 and Systems Manager APIs using private IP addresses. For more information, see How do I create VPC endpoints so that I can use Systems Manager to manage private EC2 instances without internet access?

Verify that the correct IAM role is attached to the instance

To make API calls to a Systems Manager endpoint, you must attach the AmazonSSMManagedInstanceCore permissions policy to the IAM role that's attached to your instance. If you're using a custom IAM policy, confirm that your custom policy includes the permissions found in AmazonSSMManagedInstanceCore. Also, make sure that the trust policy of the IAM role allows ec2.amazonaws.com to assume the role.

For more information, see Add permissions to a Systems Manager instance profile (console).
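The role checks above can also be scripted from a workstation that has the AWS CLI configured. The following is a minimal sketch, not a full policy evaluation: the function names has_ssm_core_policy and trusts_ec2 and the ROLE_NAME variable are hypothetical, and the first check only detects the managed policy by name, so it doesn't cover a custom policy that grants equivalent permissions.

```shell
# Read policy names on stdin (one per line); succeed if the managed policy is listed.
has_ssm_core_policy() {
  grep -qx 'AmazonSSMManagedInstanceCore'
}

# Read a trust policy JSON document on stdin; succeed if it names ec2.amazonaws.com.
# This is a plain text match, not an evaluation of the policy logic.
trusts_ec2() {
  grep -q '"ec2\.amazonaws\.com"'
}

# Example usage (ROLE_NAME is a placeholder for your instance's role name):
#   aws iam list-attached-role-policies --role-name "$ROLE_NAME" \
#     --query 'AttachedPolicies[].PolicyName' --output text \
#     | tr '\t' '\n' | has_ssm_core_policy && echo "policy attached"
#   aws iam get-role --role-name "$ROLE_NAME" \
#     --query 'Role.AssumeRolePolicyDocument' --output json \
#     | trusts_ec2 && echo "trust policy allows EC2"
```

If either check prints nothing, review the role in the IAM console before changing it, because the text match can miss valid configurations.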

Verify connectivity to the instance metadata service

SSM Agent must communicate with the instance metadata service in order to get necessary information about the instance. Use the Netcat command to test the connection:

nc -vz 169.254.169.254 80

If you're using a proxy on the instance, the proxy might block connectivity to the metadata URL. Confirm that you configured your SSM Agent to work with a proxy. To configure SSM Agent to use a proxy, see:

Windows: Configure SSM Agent to use a proxy for Windows Server instances

Linux: Configure SSM Agent to use a proxy (Linux)
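On Linux, a request to the metadata service itself confirms more than an open port. The following is a sketch under stated assumptions: it uses curl with an IMDSv2 session token, and it assumes that any proxy is configured through the conventional no_proxy environment variable. The function names no_proxy_covers_imds and imds_instance_id are hypothetical.

```shell
# Succeed if a comma-separated no_proxy value lists the metadata address.
# Spaces around entries are ignored.
no_proxy_covers_imds() {
  value=$(printf '%s' "$1" | tr -d ' ')
  case ",$value," in
    *,169.254.169.254,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Request an IMDSv2 session token, then read the instance ID with it.
# --noproxy makes curl bypass any configured proxy for the metadata address.
imds_instance_id() {
  token=$(curl -sf --noproxy 169.254.169.254 --max-time 3 -X PUT \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60" \
    "http://169.254.169.254/latest/api/token") || return 1
  curl -sf --noproxy 169.254.169.254 --max-time 3 \
    -H "X-aws-ec2-metadata-token: $token" \
    "http://169.254.169.254/latest/meta-data/instance-id"
}
```

If imds_instance_id prints an instance ID but no_proxy_covers_imds "$no_proxy" fails, the metadata service is reachable directly and the proxy settings are the likely cause.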

Troubleshooting

If the instance status doesn't change to Online and still indicates Connection Lost, then refer to the SSM Agent logs to troubleshoot further:

Windows: The SSM Agent logs for Windows are found under %PROGRAMDATA%\Amazon\SSM\Logs.

Linux: The SSM Agent logs for Linux are found under /var/log/amazon/ssm.
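To narrow a long log down to recent failures on Linux, a small filter can help. The following is a minimal sketch: the function name ssm_recent_errors is hypothetical, and it assumes that the agent writes to amazon-ssm-agent.log in the directory above and marks failures with the word ERROR.

```shell
# Read a log on stdin and print the last N lines (default 20) that contain ERROR.
ssm_recent_errors() {
  grep 'ERROR' | tail -n "${1:-20}"
}

# Example usage (log file name is an assumption about your agent version):
#   sudo cat /var/log/amazon/ssm/amazon-ssm-agent.log | ssm_recent_errors 50
```

Error messages about credentials usually point to the IAM role, and connection timeouts usually point to the endpoint checks described earlier.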

