Run code before terminating an EC2 Auto Scaling instance

Often, you need to run code or perform other actions before terminating an Amazon Elastic Compute Cloud (Amazon EC2) instance that is part of an Amazon EC2 Auto Scaling group. For example, you might remove the instance from its domain and create a snapshot or Amazon Machine Image (AMI) of it before dynamic scaling terminates it.

One way to do this is to create a lifecycle hook that puts the instance in the Terminating:Wait state. The instance stays in that state until the lifecycle action is completed or times out, which gives you time to perform any desired actions before the Auto Scaling group terminates the instance. An Amazon CloudWatch Events rule can detect the Terminating:Wait state and trigger an AWS Systems Manager automation document that performs the actions you want. In this blog post, I remove the Amazon EC2 Windows instance from its domain and create an AMI of it before the instance is terminated.

In this post, I describe how to manually create this mechanism. If you want to skip directly to the solution, download a copy of the AWS CloudFormation template.

Prerequisites

  • An Amazon EC2 Auto Scaling group.
  • A String parameter in AWS Systems Manager Parameter Store for the domain user. The user must have permission to remove the computer from the domain. I call this parameter DomainUserName.
  • A SecureString parameter in Parameter Store that contains the password of the domain user. I call this parameter DomainPassword.

Walkthrough

The following steps manually create the resources that the CloudFormation template deploys.

  1. Add a lifecycle hook.
  2. Create a Systems Manager automation document.
  3. Create AWS Identity and Access Management (IAM) policies and a role to delegate permissions to the Systems Manager automation document.
  4. Create IAM policies and a role to delegate permissions to CloudWatch Events, which invokes the Systems Manager automation document.
  5. Create a CloudWatch Events rule.
  6. Add a Systems Manager automation document as a CloudWatch Events target.

Note: The commands in this post are written for the Windows command prompt. On Linux, replace each line-continuation character ^ with \.
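
If you copy several commands from this post to Linux, the substitution described in the note can be scripted. The following is a minimal Python sketch. The function name to_posix_continuations is my own, and it assumes that a caret only needs converting when it is the last non-space character on a line.

```python
def to_posix_continuations(command: str) -> str:
    """Convert Windows cmd line continuations (^) to POSIX shell ones (\\)."""
    converted = []
    for line in command.splitlines():
        stripped = line.rstrip()
        if stripped.endswith("^"):
            # Replace only the trailing caret; carets elsewhere are left alone.
            stripped = stripped[:-1] + "\\"
        converted.append(stripped)
    return "\n".join(converted)


windows_command = (
    "aws events put-targets ^\n"
    "--rule My_CloudWatchEvent ^\n"
    "--targets file://InputTransformer.json ^\n"
    "--region us-east-2"
)
print(to_posix_continuations(windows_command))
```

Review the converted command before you run it, because the sketch does not handle other differences between the two shells, such as quoting rules.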

Step 1: Add a lifecycle hook

I use the put-lifecycle-hook command to add a lifecycle hook named my-lifecycle-hook to my Auto Scaling group, My_AutoScalingGroup.

aws autoscaling put-lifecycle-hook ^
--lifecycle-hook-name my-lifecycle-hook ^
--auto-scaling-group-name My_AutoScalingGroup ^
--lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING ^
--default-result CONTINUE ^
--region us-east-2
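
If you prefer an SDK to the CLI, the same request maps onto the PutLifecycleHook API. The following Python sketch only builds the request parameters, so it runs without AWS credentials. The helper name lifecycle_hook_request is hypothetical, and the optional heartbeat_timeout argument is an assumption I added for illustration; the CLI command above does not set it.

```python
from typing import Optional


def lifecycle_hook_request(
    hook_name: str,
    group_name: str,
    default_result: str = "CONTINUE",
    heartbeat_timeout: Optional[int] = None,
) -> dict:
    """Build keyword arguments for the Auto Scaling PutLifecycleHook API."""
    if default_result not in ("CONTINUE", "ABANDON"):
        raise ValueError("default_result must be CONTINUE or ABANDON")
    request = {
        "LifecycleHookName": hook_name,
        "AutoScalingGroupName": group_name,
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_TERMINATING",
        "DefaultResult": default_result,
    }
    if heartbeat_timeout is not None:
        # Assumption: seconds the instance may stay in Terminating:Wait.
        request["HeartbeatTimeout"] = heartbeat_timeout
    return request


params = lifecycle_hook_request("my-lifecycle-hook", "My_AutoScalingGroup")
# With boto3 installed, the request could be sent as:
#   boto3.client("autoscaling", region_name="us-east-2").put_lifecycle_hook(**params)
print(params)
```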

Step 2: Create a Systems Manager automation document

In this step, I create an automation document named LifeCycleHookDoc. The automation document goes through the following steps.

  1. Run a Windows PowerShell script to remove the computer from the domain.
  2. Create an AMI of the EC2 instance.
  3. Terminate the instance.

To perform these steps, the automation document includes the following parameters.

  • automationAssumeRole: The Amazon Resource Name (ARN) of the role that allows the automation to execute (CloudWatch Events passes this parameter automatically).
  • ASGName: The name of the Auto Scaling group (CloudWatch Events passes this parameter automatically).
  • InstanceId: The ID of the instance that is being terminated (CloudWatch Events passes this parameter automatically).
  • LCHName: The name of the lifecycle hook (CloudWatch Events passes this parameter automatically).
  • DomainUserName: The name of the String parameter for the domain user. The domain user must have permission to remove the computer from the domain. You created this parameter in the prerequisites section.
  • DomainPassword: The name of the SecureString parameter that has the DomainUserName password. You created this parameter in the prerequisites section.

{
    "description": "This document removes the instance from an Active Directory domain, creates an AMI of the instance, and signals the lifecycle hook to terminate the instance",
    "schemaVersion": "0.3",
    "assumeRole": "{{automationAssumeRole}}",
    "parameters": {
        "automationAssumeRole": {
            "default": "arn:aws:iam::012345678901:role/automationAssumeRole",
            "description": "(Required) The ARN of the role that allows automation to perform the actions on your behalf.",
            "type": "String"
        },
        "ASGName": {
            "default": "My_AutoScalingGroup",
            "type": "String"
        },
        "InstanceId": {
            "type": "String"
        },
        "LCHName": {
            "default": "my-lifecycle-hook",
            "type": "String"
        },
        "DomainUserName": {
            "default": "UserName",
            "description": "The name of the String parameter for the domain user. The user must have permission to remove the computer from the domain.",
            "type": "String"
        },
        "DomainPassword": {
            "default": "Password",
            "description": "The name of the SecureString parameter that has the password of DomainUserName",
            "type": "String"
        }
    },
    "mainSteps": [
        {
            "inputs": {
                "Parameters": {
                    "executionTimeout": "7200",
                    "commands": [
                        "$name = $env:computerName",
                        "$PartOfDomain = (Get-WmiObject -Class Win32_ComputerSystem).PartOfDomain",
                        "if($PartOfDomain -eq $true){",
                        "$username = (Get-SSMParameterValue -Name {{DomainUserName}}).Parameters[0].Value",
                        "$password = (Get-SSMParameterValue -Name {{DomainPassword}} -WithDecryption $True).Parameters[0].Value | ConvertTo-SecureString -asPlainText -Force",
                        "$credential = New-Object System.Management.Automation.PSCredential($username,$password)",
                        "Write-Output \"Removing computer $name from the domain\"",
                        "Remove-Computer -ComputerName $name -Credential $credential -PassThru -Restart -Force}",
                        "else{",
                        "Write-Output \"Cannot remove computer $name because it is not in a domain\"}"
                    ]
                },
                "InstanceIds": [
                    "{{ InstanceId }}"
                ],
                "DocumentName": "AWS-RunPowerShellScript"
            },
            "name": "RunCommand",
            "action": "aws:runCommand"
        },
        {
            "inputs": {
                "ImageName": "{{ InstanceId }}_{{automation:EXECUTION_ID}}",
                "InstanceId": "{{ InstanceId }}",
                "ImageDescription": "My newly created AMI - ASGName: {{ ASGName }}",
                "NoReboot": true
            },
            "name": "createMyImage",
            "action": "aws:createImage"
        },
        {
            "inputs": {
                "LifecycleHookName": "{{ LCHName }}",
                "InstanceId": "{{ InstanceId }}",
                "AutoScalingGroupName": "{{ ASGName }}",
                "Service": "autoscaling",
                "Api": "CompleteLifecycleAction",
                "LifecycleActionResult": "CONTINUE"
            },
            "name": "TerminateTheInstance",
            "action": "aws:executeAwsApi"
        }
    ],
    "outputs": [
        "createMyImage.ImageId"
    ]
}
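
Because the automation document refers to its parameters through {{ ... }} placeholders, a misspelled placeholder name is easy to miss until the automation runs. The following Python sketch checks a document for placeholders that are not declared under "parameters". It is not part of the original solution: the function names are my own, the sample document is a trimmed, hypothetical stand-in, and the sketch assumes that names containing a colon (system variables) or a dot (step outputs) should be skipped.

```python
import re

# Matches {{ name }} with optional spaces. System variables contain a colon
# (for example automation:EXECUTION_ID) and step outputs contain a dot.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_:.]+)\s*\}\}")


def find_strings(node):
    """Yield every string value in a nested JSON-like structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from find_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_strings(item)


def undeclared_placeholders(document: dict) -> set:
    """Return placeholder names that are not declared under "parameters"."""
    declared = set(document.get("parameters", {}))
    used = set()
    for text in find_strings(document):
        for name in PLACEHOLDER.findall(text):
            if ":" not in name and "." not in name:
                used.add(name)
    return used - declared


# Hypothetical, trimmed document with one deliberately undeclared name.
sample = {
    "parameters": {"InstanceId": {"type": "String"}},
    "mainSteps": [
        {
            "inputs": {
                "InstanceIds": ["{{ InstanceId }}"],
                "ImageName": "{{ InstanceId }}_{{automation:EXECUTION_ID}}",
                "Note": "{{ MissingParam }}",
            }
        }
    ],
}
print(undeclared_placeholders(sample))
```

To check the full document, you could save it to a file and pass the result of json.load to undeclared_placeholders; an empty set means every placeholder has a matching parameter.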

Step 3: Create IAM policies and a role to delegate permissions to the Systems Manager automation document

I create two policies for this role. The first policy, SSM-automation-Permission-to-CompleteLifecycle-Policy, allows the CompleteLifecycleAction call for any instance within My_AutoScalingGroup. The second policy, SSM-automation-Policy, provides permissions for the remaining actions that the automation document performs.

Lastly, I create an IAM role called SSM-automation-Role and attach both policies to it. The role must have the following trust relationship with AWS Systems Manager. For more information, see Editing the Trust Relationship for an Existing Role.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ssm.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

In SSM-automation-Permission-to-CompleteLifecycle-Policy, update the ARN to match your Auto Scaling group.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "autoscaling:CompleteLifecycleAction"
            ],
            "Resource": "arn:aws:autoscaling:us-east-2:012345678901:autoScalingGroup:*:autoScalingGroupName/My_AutoScalingGroup",
            "Effect": "Allow"
        }
    ]
}
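
If you manage several Auto Scaling groups, you may prefer to generate this policy rather than edit the ARN by hand. The following Python sketch assembles the same policy from a Region, account ID, and group name. The function name complete_lifecycle_policy is hypothetical, and the values passed in are the placeholders used in this post.

```python
import json


def complete_lifecycle_policy(region: str, account_id: str, group_name: str) -> dict:
    """Build a policy allowing CompleteLifecycleAction on one Auto Scaling group."""
    resource = (
        f"arn:aws:autoscaling:{region}:{account_id}:autoScalingGroup:*"
        f":autoScalingGroupName/{group_name}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["autoscaling:CompleteLifecycleAction"],
                "Resource": resource,
            }
        ],
    }


policy = complete_lifecycle_policy("us-east-2", "012345678901", "My_AutoScalingGroup")
print(json.dumps(policy, indent=4))
```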

In SSM-automation-Policy, update the AWS Region in the ARN of the AWS-RunPowerShellScript document.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ec2:CreateImage",
                "ec2:DescribeImages",
                "ssm:DescribeInstanceInformation",
                "ssm:ListCommands",
                "ssm:ListCommandInvocations"
            ],
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "ssm:SendCommand"
            ],
            "Resource": "arn:aws:ssm:us-east-2::document/AWS-RunPowerShellScript",
            "Effect": "Allow"
        },
        {
            "Action": [
                "ssm:SendCommand"
            ],
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Effect": "Allow"
        }
    ]
}

Step 4: Create IAM policies and a role to delegate permissions to CloudWatch Events

I call this role Invoke-SSM-automation-from-CloudWatch-Event, and it contains two policies:

  1. A policy called Start-SSM-automation-Policy that provides permission to execute the automation document LifeCycleHookDoc (from step 2).
  2. A policy called Pass-Role-SSM-Automation-Policy that allows Invoke-SSM-automation-from-CloudWatch-Event to use SSM-automation-Role (from step 3). This is used to invoke LifeCycleHookDoc. For more information, see Pass Role Permissions.

Note: The trust relationship policy must include the Amazon CloudWatch Events service.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "events.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

In Start-SSM-automation-Policy, update the ARN to reflect the correct Region, account ID, and name of the automation document (LifeCycleHookDoc, from Step 2) that contains the PowerShell script.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ssm:StartAutomationExecution"
            ],
            "Resource": "arn:aws:ssm:us-east-2:012345678901:automation-definition/LifeCycleHookDoc:$DEFAULT",
            "Effect": "Allow"
        }
    ]
}

In Pass-Role-SSM-Automation-Policy, update the ARN to reflect your account ID.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "iam:PassRole"
            ],
            "Resource": "arn:aws:iam::012345678901:role/SSM-automation-Role",
            "Effect": "Allow"
        }
    ]
}

Step 5: Create a CloudWatch Events rule

To start the automation document (from step 2) when an instance changes to the Terminating:Wait state, I create a CloudWatch Events rule called My_CloudWatchEvent.

aws events put-rule ^
--name "My_CloudWatchEvent" ^
--event-pattern "{\"source\":[\"aws.autoscaling\"],\"detail-type\":[\"EC2 Instance-terminate Lifecycle Action\"],\"detail\":{\"AutoScalingGroupName\":[\"My_AutoScalingGroup\"]}}" ^
--region us-east-2
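
To see which events this pattern selects, it can help to test it locally. The following Python sketch builds the same pattern and checks it against a sample event. The matcher is a deliberate simplification that handles only exact-value lists and nested objects, not the full pattern syntax of the service, and the sample event uses placeholder IDs rather than real resources.

```python
import json

pattern = {
    "source": ["aws.autoscaling"],
    "detail-type": ["EC2 Instance-terminate Lifecycle Action"],
    "detail": {"AutoScalingGroupName": ["My_AutoScalingGroup"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: lists mean "any of", dicts recurse into the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Hypothetical event; the instance ID is a placeholder.
sample_event = {
    "source": "aws.autoscaling",
    "detail-type": "EC2 Instance-terminate Lifecycle Action",
    "detail": {
        "AutoScalingGroupName": "My_AutoScalingGroup",
        "LifecycleHookName": "my-lifecycle-hook",
        "EC2InstanceId": "i-0123456789abcdef0",
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_TERMINATING",
    },
}

print(json.dumps(pattern))
print(matches(pattern, sample_event))
```

An event from a different Auto Scaling group does not match, which is why the rule only starts the automation for My_AutoScalingGroup.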

Step 6: Add a Systems Manager automation document as a CloudWatch Events target

I create a CloudWatch Events target using InputTransformer. The InputTransformer feature of CloudWatch Events customizes the text from an event before it is passed to the target of a rule. You can define variables that use the JSON path to reference values in the original event source. You can define multiple variables and assign each a value from the input. Then you can use those variables in the input template as <variable-name>. For more information, see Transforming Target Input.

The following code block is used to configure the CloudWatch Events target. Copy and paste the code into your text editor, and save it as InputTransformer.json. Ensure that you update the ARN for both the automation document and the IAM role to reflect your account information.

[
    {
        "Id": "Id93844851311792",
        "Arn": "arn:aws:ssm:us-east-2:012345678901:automation-definition/LifeCycleHookDoc:$DEFAULT",
        "RoleArn": "arn:aws:iam::012345678901:role/Invoke-SSM-automation-from-CloudWatch-Event",
        "InputTransformer": {
            "InputPathsMap": {
                "asgname": "$.detail.AutoScalingGroupName",
                "instanceid": "$.detail.EC2InstanceId",
                "lchname": "$.detail.LifecycleHookName"
            },
            "InputTemplate": "{\"InstanceId\":[<instanceid>],\"ASGName\":[<asgname>],\"LCHName\":[<lchname>],\"automationAssumeRole\":[\"arn:aws:iam::012345678901:role/SSM-automation-Role\"]}"
        }
    }
]
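
To make the transformation concrete, the following Python sketch approximates what the InputTransformer does with the paths map and template above. It is an illustration under simplifying assumptions, not the service's implementation: it resolves only simple dotted paths, and it inserts each value as a JSON-encoded string. The function names and the sample event are hypothetical.

```python
import json
import re


def resolve(path: str, event: dict):
    """Resolve a simple dotted JSON path such as $.detail.EC2InstanceId."""
    value = event
    for key in path.lstrip("$").strip(".").split("."):
        value = value[key]
    return value


def transform(paths: dict, template: str, event: dict) -> dict:
    """Approximate the transformer: fill <name> slots with JSON-encoded values."""
    values = {name: resolve(path, event) for name, path in paths.items()}
    filled = re.sub(r"<(\w+)>", lambda m: json.dumps(values[m.group(1)]), template)
    return json.loads(filled)


input_paths = {
    "asgname": "$.detail.AutoScalingGroupName",
    "instanceid": "$.detail.EC2InstanceId",
    "lchname": "$.detail.LifecycleHookName",
}
input_template = (
    '{"InstanceId":[<instanceid>],"ASGName":[<asgname>],"LCHName":[<lchname>],'
    '"automationAssumeRole":["arn:aws:iam::012345678901:role/SSM-automation-Role"]}'
)
# Hypothetical event; the instance ID is a placeholder.
event = {
    "detail": {
        "AutoScalingGroupName": "My_AutoScalingGroup",
        "EC2InstanceId": "i-0123456789abcdef0",
        "LifecycleHookName": "my-lifecycle-hook",
    }
}

result = transform(input_paths, input_template, event)
print(json.dumps(result, indent=4))
```

Each key in the result corresponds to a parameter of the automation document, and each value is a list because the automation target expects parameter values as arrays.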

Run the following command to add the target defined in InputTransformer.json to the My_CloudWatchEvent rule created in step 5.

aws events put-targets ^
--rule My_CloudWatchEvent ^
--targets file://InputTransformer.json ^
--region us-east-2

With this configuration, CloudWatch Events triggers the AWS Systems Manager automation, which removes the computer from the domain and creates an AMI of the instance before the instance leaves the Auto Scaling group.

Conclusion

In this post, I provided a CloudFormation template and explained how it is used for executing code before an Amazon EC2 Auto Scaling instance terminates. The goal of describing the manual process is to help users better understand the solution so they can modify the code to suit specific needs.

The included AWS CloudFormation template can perform the following tasks.

  • Find Amazon EC2 instances that are being terminated within an Auto Scaling group.
  • Execute an automation document to create an AMI.
  • Remove a computer from a domain.

The template can be used as is, but I encourage you to develop more complex workflows that further reduce the administrative effort of decommissioning resources.

For more information about how to configure an EC2 instance to automatically join a Microsoft Active Directory domain when it starts, see How to Configure Your EC2 Instances to Automatically Join a Microsoft Active Directory Domain. For a way to manage the domain membership of dynamic EC2 instances, see Managing domain membership of dynamic fleet of EC2 instances.