I am bootstrapping an Amazon EC2 Windows instance using the cfn-init.exe script in AWS CloudFormation and signaling back with cfn-signal.exe. One of the steps executed by cfn-init.exe reboots the instance. Sometimes cfn-signal.exe signals back too early, and CloudFormation marks the instance as CREATE_COMPLETE before it has finished bootstrapping. Both cfn-init.exe and cfn-signal.exe are executed from UserData.

On an EC2 Windows instance, UserData scripts are executed by the Ec2ConfigService process. When cfn-init.exe is invoked from UserData, it runs as a child process of Ec2ConfigService. If one of the steps executed by cfn-init.exe requires a system reboot, the system occasionally starts to shut down and hands execution back to the Ec2ConfigService process before the shutdown completes. Ec2ConfigService then continues processing the UserData script and executes cfn-signal.exe, which signals back to CloudFormation prematurely. Whether this race condition occurs depends on how quickly the system shuts down during the reboot operation.

Here is an example of a typical call of cfn-init.exe and cfn-signal.exe from UserData:

"UserData": { "Fn::Base64": { "Fn::Join": ["", [
    "<script>\n",
    " cfn-init.exe -v -s ", { "Ref": "AWS::StackName" },
    "  -r Instance",
    "  --region ", { "Ref" : "AWS::Region" }, "\n",
    " cfn-signal.exe -e %ERRORLEVEL% ", {"Fn::Base64": { "Ref": "WaitHandle" }}, "\n",
    "</script>"
]]}}

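Move cfn-signal.exe out of UserData and execute it as the last command run by cfn-init.exe. cfn-init processes commands in alphabetical order by name, and setting waitAfterCompletion to "forever" directs cfn-init to exit and resume only after the reboot is complete, so the signal command runs after the restart; for example: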

"WindowsServer": {
    ...
    "Type": "AWS::EC2::Instance",
    "Metadata": {
        "AWS::CloudFormation::Init": {
            "config": {
                "commands": {
                    "0-restart": {
                        "command": "powershell.exe -Command Restart-Computer",
                        "waitAfterCompletion": "forever"
                    },
                    "1-signal-success": {
                        "command": {
                            "Fn::Join": ["", [
                                "cfn-signal.exe -e 0 --stack ", { "Ref": "AWS::StackId" },
                                " --resource WindowsServer --region ", { "Ref": "AWS::Region" } ] ]
                        }
                    }
                }
            }
        }
    }
}

This approach loses the ability to capture the overall %ERRORLEVEL% of the cfn-init.exe process. However, if the process fails, no success signal is sent back to the CloudFormation stack within the specified timeout, so CloudFormation still treats the instance creation as failed once the timeout expires.
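
For a signal sent with --stack and --resource to be received, the WindowsServer resource needs a CreationPolicy, and UserData only has to start cfn-init.exe. The fragment below is a minimal sketch of that arrangement, not a complete resource definition. The Count of 1 and the 30-minute Timeout are assumed values chosen for illustration, the <script> wrapper assumes a batch-style UserData script, and other instance properties (for example ImageId and InstanceType) as well as the Metadata section are omitted:

"WindowsServer": {
    "Type": "AWS::EC2::Instance",
    "CreationPolicy": {
        "ResourceSignal": {
            "Count": 1,
            "Timeout": "PT30M"
        }
    },
    "Properties": {
        "UserData": { "Fn::Base64": { "Fn::Join": ["", [
            "<script>\n",
            "cfn-init.exe -v -s ", { "Ref": "AWS::StackName" },
            " -r WindowsServer",
            " --region ", { "Ref": "AWS::Region" }, "\n",
            "</script>"
        ]]}}
    }
}

With this layout, the timeout is the Timeout value in the CreationPolicy: if cfn-signal.exe never runs, CloudFormation stops waiting after that period and marks the resource as failed.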

CloudFormation, Windows, EC2, signal, intermittent, cfn-signal.exe



Published: 2016-09-30