How do I troubleshoot an Amazon EC2 instance that stops or terminates when I try to start it?


When I try to start my Amazon Elastic Compute Cloud (Amazon EC2) instance, it terminates or doesn't start.

Short description

The following reasons are the most common causes of an Amazon EC2 instance InternalError message:

Your Amazon Elastic Block Store (Amazon EBS) volume isn't attached to the instance correctly.
An EBS volume that's attached to the instance is in an error state.
An encrypted EBS volume is attached to the instance, and the permissions or key policies that are required to use it are incorrect.

If your instance doesn't start and no error code appears, then run the describe-instances command in the AWS Command Line Interface (AWS CLI) and specify the instance ID. Then, check the StateReason message in the JSON response that the command returns.

Note: Enter all commands in the AWS CLI. If you receive errors when running AWS CLI commands, make sure that you're using the most recent version of the AWS CLI.
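
The StateReason check can also be scripted. The following is a minimal Python sketch, assuming that you saved the full JSON response from describe-instances (without a --query filter) and parsed it. The function name state_reasons and the sample instance ID are hypothetical and used only for illustration.

```python
import json


def state_reasons(response):
    """Map each instance ID to its (Code, Message) StateReason pair, or None."""
    reasons = {}
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            reason = instance.get("StateReason")
            reasons[instance["InstanceId"]] = (
                (reason.get("Code"), reason.get("Message")) if reason else None
            )
    return reasons


# Hypothetical sample that is shaped like a describe-instances response.
sample = json.loads("""
{"Reservations": [{"Instances": [{
    "InstanceId": "i-0123456789abcdef0",
    "StateReason": {
        "Code": "Server.InternalError",
        "Message": "Server.InternalError: Internal error on launch"
    }
}]}]}
""")

print(state_reasons(sample))
```

An instance that has no StateReason field maps to None, which makes it easy to filter for only the instances that report a reason.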

Resolution

EBS volumes aren't attached to the instance correctly

You must attach the EBS root volume to the instance as /dev/sda1 or /dev/xvda, depending on the root device name that the API returns for the instance (RootDeviceName). A second EBS volume can't use a duplicate or conflicting device name. If it does, then you can't stop or start the instance. Block device name conflicts affect only Xen-based instance types (c4, m4, t2, and so on). They don't affect Nitro-based instance types (c5, m5, t3, and so on).

1. Run the describe-instances command to verify the StateReason error message and error code:

$ aws ec2 describe-instances --instance-ids i-xxxxxxxxxxxxxxx --region us-east-1 --query "Reservations[].Instances[].{StateReason:StateReason}" --output json

Note: Replace us-east-1 with your AWS Region. Replace i-xxxxxxxxxxxxxxx with your instance ID.

If there's a device name conflict, then you see an output that's similar to the following message:

[
    [{
        "StateReason": {
            "Code": "Server.InternalError",
            "Message": "Server.InternalError: Internal error on launch"
        }
    }]
]

2. Open the Amazon EC2 console, and then select the instance that you can't start.

3. On the Description tab, verify the device name listed in Block devices. The Block devices field displays all device names of the attached volumes.

4. Verify that the root device is correctly attached and that there isn't a device listed with the same name or with a conflicting name.

5. If there's a device with a duplicate or conflicting device name, then detach the conflicting volume. Then, reattach the volume with a different device name that isn't already in use.
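
The device name check in the steps above can be approximated in code. The following Python sketch works on the BlockDeviceMappings list from a describe-instances response. The article doesn't define exactly which names count as conflicting, so the rule here is an assumption: /dev/sdX and /dev/xvdX are treated as the same slot, and a trailing partition number is ignored. The function names are hypothetical.

```python
import re
from collections import defaultdict

# ASSUMPTION (not from the article): "sd" and "xvd" prefixes name the same
# slot, and a trailing partition number is ignored when comparing names.
_DEVICE_RE = re.compile(r"^(?:/dev/)?(?:sd|xvd)([a-z]+)\d*$")


def canonical_device(name):
    """Reduce a device name to its drive letters, or return it unchanged."""
    match = _DEVICE_RE.match(name)
    return match.group(1) if match else name


def find_conflicts(block_device_mappings):
    """Group device names that collapse to the same canonical key."""
    groups = defaultdict(list)
    for mapping in block_device_mappings:
        name = mapping["DeviceName"]
        groups[canonical_device(name)].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


# Hypothetical mappings: a root volume plus two secondary volumes.
mappings = [
    {"DeviceName": "/dev/sda1"},
    {"DeviceName": "/dev/xvda"},
    {"DeviceName": "/dev/sdf"},
]

print(find_conflicts(mappings))
```

Treat any reported group as a prompt to review the names in the console, not as proof of a conflict, because the matching rule is only an approximation.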

An attached EBS volume is in an error state

1. Run the describe-instances command to verify the StateReason error message and error code:

$ aws ec2 describe-instances --instance-ids i-xxxxxxxxxxxxxxx --region us-east-1 --query "Reservations[].Instances[].{StateReason:StateReason}" --output json

Note: Replace us-east-1 with your AWS Region. Replace i-xxxxxxxxxxxxxxx with your instance ID.

If there's an attached EBS volume that's in an error state, then you see an output that's similar to the following message:

[
    [{
        "StateReason": {
            "Code": "Server.InternalError",
            "Message": "Server.InternalError: Internal error on launch"
        }
    }]
]

2. Open the Amazon EC2 console, choose Volumes, and then check whether the status of the volume is error. Your options depend on whether the volume that's in an error state is a root volume or a secondary volume.

If the volume that's in an error state is a secondary volume, then detach the volume. Then, start the instance.

If the volume that's in an error state is a root volume and you have a snapshot of the volume, then complete the following steps:

Detach the volume.

Create a new volume from the snapshot in the same Availability Zone as the instance.

Attach the new volume to the instance with the same device name that the original root volume used.

Start the instance.

Note: If you don’t have an existing snapshot of the root volume that’s in an error state, then you can’t restart the instance. You must launch a new instance, install the relevant applications, and then configure it to replace the old instance.
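
The decision between the secondary volume and root volume paths can be sketched in Python. This example assumes that you pass in the Volumes list from a describe-volumes response, the instance's root device name, and whether a snapshot of the root volume exists. The function name triage_error_volumes, the action strings, and the sample IDs are hypothetical.

```python
def triage_error_volumes(volumes, instance_id, root_device_name, root_snapshot_available):
    """Return (volume_id, device, action) for each error-state volume on the instance."""
    findings = []
    for volume in volumes:
        if volume.get("State") != "error":
            continue
        for attachment in volume.get("Attachments", []):
            if attachment.get("InstanceId") != instance_id:
                continue
            device = attachment.get("Device")
            if device != root_device_name:
                # Secondary volume: detaching it is enough.
                action = "detach the volume, then start the instance"
            elif root_snapshot_available:
                # Root volume with a snapshot: rebuild the volume.
                action = "replace the root volume from the snapshot, then start the instance"
            else:
                # Root volume without a snapshot: the instance can't be recovered.
                action = "launch a replacement instance"
            findings.append((volume["VolumeId"], device, action))
    return findings


# Hypothetical sample that is shaped like the Volumes list from describe-volumes.
sample_volumes = [
    {"VolumeId": "vol-root", "State": "in-use",
     "Attachments": [{"InstanceId": "i-example", "Device": "/dev/xvda"}]},
    {"VolumeId": "vol-data", "State": "error",
     "Attachments": [{"InstanceId": "i-example", "Device": "/dev/sdf"}]},
]

print(triage_error_volumes(sample_volumes, "i-example", "/dev/xvda", False))
```

The sketch only reports a suggested action for each volume. It doesn't call any AWS API or change any resource.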

Attached volumes are encrypted and there are incorrect AWS Identity and Access Management (IAM) permissions or policies

1. Run the describe-instances command to verify the StateReason error message and error code:

$ aws ec2 describe-instances --instance-ids i-xxxxxxxxxxxxxxx --region us-east-1 --query "Reservations[].Instances[].{StateReason:StateReason}" --output json

Note: Replace us-east-1 with your AWS Region. Replace i-xxxxxxxxxxxxxxx with your instance ID.

If there's an encrypted volume that's attached to the instance and there are permissions or policy issues, then you receive a client error. You see an output that's similar to the following message:

[
    [{
        "StateReason": {
            "Code": "Client.InternalError",
            "Message": "Client.InternalError: Client error on launch"
        }
    }]
]

2. Verify that the user who's trying to start the instance has the IAM permissions that are required to use the encrypted volume. If you launched the instance indirectly through another service, such as EC2 Auto Scaling, then also verify the following configurations:

The AWS Key Management Service (AWS KMS) key that's used to encrypt the volume is enabled.
The key has the correct key policies.

Note: To verify if a volume is encrypted, open the Amazon EC2 console, and then select Volumes. Encrypted volumes have Encrypted listed in the Encryption column.
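
The encryption and key state checks can also be done from saved CLI output. The following Python sketch assumes that you have the Volumes list from a describe-volumes response and the JSON response from the AWS KMS describe-key command for the relevant key. The function names and the sample key ARN are hypothetical. The sketch doesn't evaluate IAM permissions or key policies, so you must still review those separately.

```python
def encrypted_volume_keys(volumes):
    """Map each encrypted volume ID to the KMS key ID recorded on the volume."""
    return {
        volume["VolumeId"]: volume.get("KmsKeyId")
        for volume in volumes
        if volume.get("Encrypted")
    }


def key_is_enabled(describe_key_response):
    """Return True when the key metadata reports the Enabled state."""
    metadata = describe_key_response.get("KeyMetadata", {})
    return metadata.get("KeyState") == "Enabled"


# Hypothetical samples that are shaped like describe-volumes and describe-key output.
sample_volumes = [
    {"VolumeId": "vol-plain", "Encrypted": False},
    {"VolumeId": "vol-secret", "Encrypted": True,
     "KmsKeyId": "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE"},
]
sample_key = {"KeyMetadata": {"KeyState": "Disabled"}}

print(encrypted_volume_keys(sample_volumes))
print(key_is_enabled(sample_key))
```

A key that isn't in the Enabled state is one possible explanation for a Client.InternalError at launch. An enabled key can still fail if its key policy doesn't allow the caller to use it.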