AWS DevOps & Developer Productivity Blog

Faster Auto Scaling in AWS CloudFormation Stacks with Lambda-backed Custom Resources

Many organizations use AWS CloudFormation (CloudFormation) stacks for blue/green deployments, routinely launching replacement AWS resources with updated packages for code releases, security patching, and change management. In this model, you typically pass code version identifiers (for example, a commit hash) to new application stacks as template parameters. Application servers in an Auto Scaling group reference the parameters to fetch and install the correct versions of code.
 
Fetching code every time your application scales can delay bringing new application servers online. Organizations often compensate for reduced scaling agility by setting lower server utilization targets, which has a knock-on effect on cost, or by creating pre-built custom Amazon Machine Images (AMIs) for use in the deployment pipeline. Custom AMIs with pre-installed code can be referenced by new instance launches as part of an Auto Scaling group launch configuration. These application servers are ready faster than if code had to be fetched at launch. However, hosting this type of application deployment pipeline often requires additional servers and adds the overhead of managing the AMIs.
 
In this post, we’ll look at how you can use CloudFormation custom resources with AWS Lambda (Lambda) to create and manage AMIs during stack creation and termination.
 
The following diagram shows how you can use a Lambda function that creates an AMI and returns a success code and the resulting AMI ID.
 
Visualization of AMIManager Custom Resource creation process
 
To orchestrate this process, you bootstrap a reference instance with a user data script, use wait conditions to trigger an AMI capture, and finally create an Auto Scaling group launch configuration that references the newly created AMI. The reference instance that is used to capture the AMI can then be terminated, or it can be repurposed for administrative access or for performing scheduled tasks. Here’s how this looks in a CloudFormation template:
 
"Resources": {
  "WaitHandlePendingAMI" : {
    "Type" : "AWS::CloudFormation::WaitConditionHandle"
  },
  "WaitConditionPendingAMI" : {
    "Type" : "AWS::CloudFormation::WaitCondition",
    "Properties" : {
      "Handle"  : { "Ref" : "WaitHandlePendingAMI" },
      "Timeout" : "7200"
    }
  },

  "WaitHandleAMIComplete" : {
    "Type" : "AWS::CloudFormation::WaitConditionHandle"
  },
  "WaitConditionAMIComplete" : {
    "Type" : "AWS::CloudFormation::WaitCondition",
    "Properties" : {
      "Handle"  : { "Ref" : "WaitHandleAMIComplete" },
      "Timeout" : "7200"
    }
  },

  "AdminServer" : {
    "Type" : "AWS::EC2::Instance",
    "Properties" : {
      ...
      "UserData": { "Fn::Base64": { "Fn::Join": [ "", [
        "#!/bin/bash\n",
        "yum update -y\n",
        "",
        "echo -e \"\\n### Fetching and Installing Code...\"\n",
        "export CODE_VERSION=\"", {"Ref": "CodeVersionIdentifier"}, "\"\n",
        "# Insert application deployment code here!\n",
        "",
        "echo -e \"\\n### Signal for AMI capture\"\n",
        "history -c\n",
        "/opt/aws/bin/cfn-signal -e 0 -i waitingforami '", { "Ref" : "WaitHandlePendingAMI" }, "' \n",
        "",
        "echo -e \"\\n### Waiting for AMI to be available\"\n",
        "aws ec2 wait image-available",
        "    --filters Name=tag:cloudformation:amimanager:stack-name,Values=", { "Ref" : "AWS::StackName" },
        "    --region ", {"Ref": "AWS::Region"}, "\n",
        "",
        "/opt/aws/bin/cfn-signal -e $? -i waitedforami '", { "Ref" : "WaitHandleAMIComplete" }, "' \n",
        "",
        "# Continue with re-purposing or shutting down instance...\n"
      ] ] } }
    }
  },

  "AMI": {
    "Type": "Custom::AMI",
    "DependsOn" : "WaitConditionPendingAMI",
    "Properties": {
      "ServiceToken": "arn:aws:lambda:REGION:ACCOUNTID:function:AMIManager",
      "StackName": { "Ref" : "AWS::StackName" },
      "Region" : { "Ref" : "AWS::Region" },
      "InstanceId" : { "Ref" : "AdminServer" }
    }
  },

  "AutoScalingGroup" : {
    "Type" : "AWS::AutoScaling::AutoScalingGroup",
    "Properties" : {
      ...
      "LaunchConfigurationName" : { "Ref" : "LaunchConfiguration" }
    }
  },

  "LaunchConfiguration": {
    "Type": "AWS::AutoScaling::LaunchConfiguration",
    "DependsOn" : "WaitConditionAMIComplete",
    "Properties": {
      ...
      "ImageId": { "Fn::GetAtt" : [ "AMI", "ImageId" ] }
    }
  }
}
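
For orientation, the properties declared on the Custom::AMI resource reach the Lambda function under event.ResourceProperties, next to the standard custom resource request fields such as RequestType, ResponseURL, StackId, RequestId, and LogicalResourceId. The following is a minimal sketch of that mapping, not part of the original template or function. The sample values are placeholders, and missingProperties is a hypothetical helper that mirrors the kind of check the function makes before it calls Amazon EC2.

```javascript
// Hypothetical sample of the request CloudFormation sends for the "AMI" resource.
// Every identifier and value below is a placeholder, not a real resource.
var sampleEvent = {
    RequestType: "Create",
    ResponseURL: "https://example.com/presigned-url-placeholder",
    StackId: "arn:aws:cloudformation:us-east-1:123456789012:stack/example-stack/placeholder",
    RequestId: "placeholder-request-id",
    ResourceType: "Custom::AMI",
    LogicalResourceId: "AMI",
    ResourceProperties: {
        ServiceToken: "arn:aws:lambda:REGION:ACCOUNTID:function:AMIManager",
        StackName: "example-stack",
        Region: "us-east-1",
        InstanceId: "i-0123456789abcdef0"
    }
};

// Returns the names of the required properties that are absent or empty.
function missingProperties(event, required) {
    var props = event.ResourceProperties || {};
    return required.filter(function (name) {
        return !props[name];
    });
}

// All three properties are present in the sample, so the result is an empty array.
console.log(missingProperties(sampleEvent, ["StackName", "InstanceId", "Region"]));
```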

With this approach, you don’t have to run and maintain additional servers for creating custom AMIs, and the AMIs can be deleted when the stack terminates. The following figure shows that as CloudFormation deletes the stacks, it also deletes the AMIs when the Delete signal is sent to the Lambda-backed custom resource.

Visualization of AMIManager Custom Resource deletion process

Let’s look at the Lambda function that facilitates AMI creation and deletion:

/**
* A Lambda function that takes an AWS CloudFormation stack name and instance id
* and returns the AMI ID.
**/

exports.handler = function (event, context) {

    console.log("REQUEST RECEIVED:\n", JSON.stringify(event));

    var stackName = event.ResourceProperties.StackName;
    var instanceId = event.ResourceProperties.InstanceId;
    var instanceRegion = event.ResourceProperties.Region;

    var responseStatus = "FAILED";
    var responseData = {};


    var AWS = require("aws-sdk");
    var ec2 = new AWS.EC2({region: instanceRegion});

    if (event.RequestType == "Delete") {
        console.log("REQUEST TYPE:", "delete");
        if (stackName && instanceRegion) {
            var params = {
                Filters: [
                    {
                        Name: 'tag:cloudformation:amimanager:stack-name',
                        Values: [ stackName ]
                    },
                    {
                        Name: 'tag:cloudformation:amimanager:stack-id',
                        Values: [ event.StackId ]
                    },
                    {
                        Name: 'tag:cloudformation:amimanager:logical-id',
                        Values: [ event.LogicalResourceId ]
                    }
                ]
            };
            ec2.describeImages(params, function (err, data) {
                if (err) {
                    responseData = {Error: "DescribeImages call failed"};
                    console.log(responseData.Error + ":\n", err);
                    sendResponse(event, context, responseStatus, responseData);
                } else if (data.Images.length === 0) {
                    sendResponse(event, context, "SUCCESS", {Info: "Nothing to delete"});
                } else {
                    var imageId = data.Images[0].ImageId;
                    console.log("DELETING:", data.Images[0]);
                    ec2.deregisterImage({ImageId: imageId}, function (err, data) {
                        if (err) {
                            responseData = {Error: "DeregisterImage call failed"};
                            console.log(responseData.Error + ":\n", err);
                        } else {
                            responseStatus = "SUCCESS";
                            responseData.ImageId = imageId;
                        }
                        // Report the actual outcome of DeregisterImage rather than always SUCCESS
                        sendResponse(event, context, responseStatus, responseData);
                    });
                }
            });
        } else {
            responseData = {Error: "StackName or InstanceRegion not specified"};
            console.log(responseData.Error);
            sendResponse(event, context, responseStatus, responseData);
        }
        return;
    }

    console.log("REQUEST TYPE:", "create");
    if (stackName && instanceId && instanceRegion) {
        ec2.createImage(
            {
                InstanceId: instanceId,
                Name: stackName + '-' + instanceId,
                NoReboot: true
            }, function (err, data) {
                if (err) {
                    responseData = {Error: "CreateImage call failed"};
                    console.log(responseData.Error + ":\n", err);
                    sendResponse(event, context, responseStatus, responseData);
                } else {
                    var imageId = data.ImageId;
                    console.log('SUCCESS: ', "ImageId - " + imageId);

                    var params = {
                        Resources: [imageId],
                        Tags: [
                            {
                                Key: 'cloudformation:amimanager:stack-name',
                                Value: stackName
                            },
                            {
                                Key: 'cloudformation:amimanager:stack-id',
                                Value: event.StackId
                            },
                            {
                                Key: 'cloudformation:amimanager:logical-id',
                                Value: event.LogicalResourceId
                            }
                        ]
                    };
                    ec2.createTags(params, function (err, data) {
                        if (err) {
                            responseData = {Error: "Create tags call failed"};
                            console.log(responseData.Error + ":\n", err);
                        } else {
                            responseStatus = "SUCCESS";
                            responseData.ImageId = imageId;
                        }
                        sendResponse(event, context, responseStatus, responseData);
                    });
                }
            }
        );
    } else {
        responseData = {Error: "StackName, InstanceId or InstanceRegion not specified"};
        console.log(responseData.Error);
        sendResponse(event, context, responseStatus, responseData);
    }
};

//Sends response to the Amazon S3 pre-signed URL
function sendResponse(event, context, responseStatus, responseData) {
   var responseBody = JSON.stringify({
        Status: responseStatus,
        Reason: "See the details in CloudWatch Log Stream: " + context.logStreamName,
        PhysicalResourceId: context.logStreamName,
        StackId: event.StackId,
        RequestId: event.RequestId,
        LogicalResourceId: event.LogicalResourceId,
        Data: responseData
    });

    console.log("RESPONSE BODY:\n", responseBody);

    var https = require("https");
    var url = require("url");

    var parsedUrl = url.parse(event.ResponseURL);
    var options = {
        hostname: parsedUrl.hostname,
        port: 443,
        path: parsedUrl.path,
        method: "PUT",
        headers: {
            "content-type": "",
            // Byte length, not character count, so non-ASCII data is sized correctly
            "content-length": Buffer.byteLength(responseBody)
        }
    };

    var request = https.request(options, function (response) {
        console.log("STATUS: " + response.statusCode);
        console.log("HEADERS: " + JSON.stringify(response.headers));
        // Tell AWS Lambda that the function execution is done
        context.done();
    });

    request.on("error", function (error) {
        console.log("sendResponse Error:\n", error);
        // Tell AWS Lambda that the function execution is done
        context.done();
    });

    // Write data to request body
    request.write(responseBody);
    request.end();
}
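
The create and delete paths in this function rely on the same three tag keys: once as tags passed to CreateTags, and once as tag: filters passed to DescribeImages. One way to keep the two in sync is to derive both from a single list. The sketch below is an illustration under that assumption and is not part of the original function; amiTagPairs, toTags, and toFilters are hypothetical helper names.

```javascript
// Shared prefix for the tag keys used by the function above.
var TAG_PREFIX = "cloudformation:amimanager:";

// Builds the [suffix, value] pairs that identify an AMI for a given request.
function amiTagPairs(event) {
    return [
        ["stack-name", event.ResourceProperties.StackName],
        ["stack-id", event.StackId],
        ["logical-id", event.LogicalResourceId]
    ];
}

// Shape expected by the Tags parameter of ec2.createTags.
function toTags(pairs) {
    return pairs.map(function (pair) {
        return {Key: TAG_PREFIX + pair[0], Value: pair[1]};
    });
}

// Shape expected by the Filters parameter of ec2.describeImages.
function toFilters(pairs) {
    return pairs.map(function (pair) {
        return {Name: "tag:" + TAG_PREFIX + pair[0], Values: [pair[1]]};
    });
}

// Usage with placeholder values:
var pairs = amiTagPairs({
    StackId: "example-stack-id",
    LogicalResourceId: "AMI",
    ResourceProperties: {StackName: "example-stack"}
});
console.log(JSON.stringify(toTags(pairs)));
console.log(JSON.stringify(toFilters(pairs)));
```

With helpers like these, the create path would pass toTags(pairs) as Tags and the delete path would pass toFilters(pairs) as Filters, so a renamed key changes in one place only.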

This Lambda function calls the Amazon EC2 DescribeImages, DeregisterImage, CreateImage, and CreateTags APIs, and logs data to Amazon CloudWatch Logs (CloudWatch Logs) for monitoring and debugging. To support this, we recommend that you create the following AWS Identity and Access Management (IAM) policy for the function’s IAM execution role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ec2:CreateImage",
        "ec2:DeregisterImage",
        "ec2:DescribeImages",
        "ec2:CreateTags"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": [
        "logs:*"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}

During testing, the Lambda function didn’t exceed the minimum Lambda memory allocation of 128 MB. Typically, create operations took 4.5 seconds, and delete operations took 25 seconds. At Lambda’s current pricing of $0.00001667 per GB-second, each stack’s launch and terminate cycle incurs custom AMI creation costs of just $0.000988. This is much less expensive than managing an independent code release application. Within the AWS Free Tier, using Lambda as described allows you to perform more than 9,000 custom AMI create and delete operations each month for free!
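
The duration charge for one create and delete cycle scales linearly with the configured memory size and the billed duration, so it can help to make the arithmetic explicit. The following sketch applies the per-GB-second rate quoted above to those two inputs. It ignores the per-request charge and any rounding to billing increments, so it gives a lower bound rather than a reproduction of the figure above. The function names computeCost and lifecycleCost are hypothetical.

```javascript
// Per-GB-second rate quoted in this post; check current pricing before relying on it.
var PRICE_PER_GB_SECOND = 0.00001667;

// Duration charge for a single invocation.
function computeCost(memoryMb, durationSeconds) {
    var gbSeconds = (memoryMb / 1024) * durationSeconds;
    return gbSeconds * PRICE_PER_GB_SECOND;
}

// One stack lifecycle: a create invocation plus a delete invocation.
function lifecycleCost(memoryMb, createSeconds, deleteSeconds) {
    return computeCost(memoryMb, createSeconds) + computeCost(memoryMb, deleteSeconds);
}

// Usage with the durations reported above and a 128 MB allocation.
console.log(lifecycleCost(128, 4.5, 25).toFixed(8));
```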