What are some best practices for implementing AWS Lambda-backed custom resources with AWS CloudFormation?
Last updated: 2019-06-28
Consider the following best practices:
Build your custom resources to report, log, and handle failure gracefully
Exceptions can cause your function code to exit without sending a response. AWS CloudFormation requires an HTTPS response to confirm whether the operation succeeded or failed. An unreported exception causes AWS CloudFormation to wait until the operation times out before it starts a stack rollback. If the exception occurs again during the rollback, AWS CloudFormation waits for another timeout before ending in a rollback failure. During this time, your stack is unusable.
To avoid timeout issues that can be time-consuming to troubleshoot, include the following in the code you create for your Lambda function:
- Logic to handle exceptions
- The ability to log the failure for troubleshooting scenarios
- The ability to respond to AWS CloudFormation with an HTTPS response confirming that an operation failed
- A dead-letter queue that captures failed invocations so that you can investigate and handle them
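The first three items in the list above can be sketched as follows. This is a minimal Python illustration, not a definitive implementation: do_work is a hypothetical placeholder for your provisioning logic, and the empty Content-Type header is an assumption that mirrors what the AWS-provided cfn-response module sends. The event fields (ResponseURL, StackId, RequestId, LogicalResourceId) are the ones AWS CloudFormation includes in custom resource requests.

```python
import json
import logging
import urllib.request

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def build_response(event, context, status, reason=None, data=None):
    """Assemble the response body that AWS CloudFormation expects."""
    return {
        "Status": status,  # "SUCCESS" or "FAILED"
        "Reason": reason or f"See CloudWatch log stream: {context.log_stream_name}",
        "PhysicalResourceId": event.get("PhysicalResourceId", context.log_stream_name),
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data or {},
    }


def send_response(event, body):
    """PUT the JSON body to the presigned URL supplied in the event."""
    payload = json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        event["ResponseURL"],
        data=payload,
        method="PUT",
        # Assumption: an empty Content-Type, as the cfn-response module uses.
        headers={"Content-Type": "", "Content-Length": str(len(payload))},
    )
    with urllib.request.urlopen(request) as response:
        logger.info("Response delivered, HTTP status %s", response.status)


def do_work(event):
    """Hypothetical placeholder for the real provisioning logic."""
    return {}


def handler(event, context):
    try:
        data = do_work(event)
        body = build_response(event, context, "SUCCESS", data=data)
    except Exception as error:  # report every failure instead of exiting silently
        logger.exception("Custom resource operation failed")
        body = build_response(event, context, "FAILED", reason=str(error))
    send_response(event, body)
```

Because the failure path logs the exception and still sends a FAILED response, AWS CloudFormation can begin the rollback immediately rather than waiting for a timeout.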
Set reasonable timeout periods, and report when they're about to be exceeded
If an operation doesn't complete within the function's configured timeout period, Lambda stops the function and no response is sent to AWS CloudFormation.
To avoid this issue, consider the following:
- Set the timeout value for your Lambda functions high enough to handle variations in processing time and network conditions.
- Set a timer in your function to respond to AWS CloudFormation with an error when a function is about to time out. A timer can help prevent delays for custom resources.
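The timer approach could look like the following sketch. It relies on the Lambda context method get_remaining_time_in_millis; the names start_timeout_guard, report_timeout, and the two-second margin are illustrative assumptions, and the callback only prints where a real function would send a FAILED response.

```python
import threading


def start_timeout_guard(context, on_timeout, margin_ms=2000):
    """Schedule on_timeout to run shortly before Lambda stops the function."""
    remaining_ms = context.get_remaining_time_in_millis()
    delay_s = max(remaining_ms - margin_ms, 0) / 1000.0
    timer = threading.Timer(delay_s, on_timeout)
    timer.daemon = True  # the timer must never keep the process alive
    timer.start()
    return timer


def handler(event, context):
    def report_timeout():
        # Hypothetical: send a FAILED response to event["ResponseURL"] here.
        print("About to time out; reporting FAILED to AWS CloudFormation")

    timer = start_timeout_guard(context, report_timeout)
    try:
        pass  # hypothetical provisioning work goes here
    finally:
        timer.cancel()  # the work ended in time, so the guard is not needed
```

Canceling the timer in the finally block matters: a timer left running could otherwise fire after the handler has already reported its own result.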
Understand and build around Create, Update, and Delete events
Depending on the stack action, AWS CloudFormation sends your function a Create, Update, or Delete event. Each event is handled differently, so be sure that there are no unintended behaviors when any of the three event types is received.
For more information, see Custom Resource Request Types.
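One way to keep the three event types separate is to route each request on the RequestType field of the event, as in this sketch. The on_create, on_update, and on_delete functions and the example-resource-1 identifier are hypothetical placeholders; only the RequestType and PhysicalResourceId field names come from the custom resource request format.

```python
def on_create(event):
    # Hypothetical: provision the resource and return its identifier.
    return {"PhysicalResourceId": "example-resource-1"}


def on_update(event):
    # Hypothetical: modify the resource in place and keep the same identifier.
    return {"PhysicalResourceId": event["PhysicalResourceId"]}


def on_delete(event):
    # Hypothetical: remove the resource named by the identifier.
    return {"PhysicalResourceId": event["PhysicalResourceId"]}


HANDLERS = {"Create": on_create, "Update": on_update, "Delete": on_delete}


def dispatch(event):
    """Route the request to the function for its RequestType."""
    request_type = event["RequestType"]
    try:
        operation = HANDLERS[request_type]
    except KeyError:
        raise ValueError(f"Unsupported RequestType: {request_type!r}") from None
    return operation(event)
```

Rejecting unknown request types explicitly, rather than falling through to a default branch, makes an unexpected event fail visibly instead of being handled as the wrong operation.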
Understand how AWS CloudFormation identifies and replaces resources
When your function responds to an Update event, AWS CloudFormation compares the PhysicalResourceId that the function returns to the previous PhysicalResourceId. If the IDs differ, AWS CloudFormation assumes the resource has been replaced with a new physical resource.
However, the old resource isn't removed right away, so that a rollback remains possible. When the stack update completes successfully, AWS CloudFormation sends a Delete event with the old physical ID as the identifier. If the stack update fails and a rollback occurs, the Delete event carries the new physical ID instead.
Carefully consider when you return a new PhysicalResourceId. Use PhysicalResourceId to uniquely identify resources so that only the correct resources are deleted during a replacement update when a Delete event is received.
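As an illustration, the sketch below returns a new PhysicalResourceId only when a property that can't be changed in place has changed. The IMMUTABLE_KEYS tuple, the Name property, and the ID format are assumptions made for this example; ResourceProperties and OldResourceProperties are the fields that Update requests carry.

```python
import hashlib
import json

# Hypothetical: properties that cannot be changed in place for this resource.
IMMUTABLE_KEYS = ("Name",)


def _fingerprint(properties):
    """Hash only the immutable properties, in a stable key order."""
    subset = {key: properties.get(key) for key in IMMUTABLE_KEYS}
    encoded = json.dumps(subset, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


def physical_id_for(event):
    """Choose the PhysicalResourceId to return for this request."""
    new_props = event.get("ResourceProperties", {})
    new_id = f"{event['LogicalResourceId']}-{_fingerprint(new_props)}"
    if event["RequestType"] == "Create":
        return new_id
    if event["RequestType"] == "Update":
        old_props = event.get("OldResourceProperties", {})
        if _fingerprint(new_props) != _fingerprint(old_props):
            # A different ID signals a replacement; a Delete event for the
            # old ID follows once the stack update succeeds.
            return new_id
    # In-place updates and deletes keep the ID that was sent in the event.
    return event["PhysicalResourceId"]
```

With this approach, a Delete handler can act on the PhysicalResourceId in the event and be confident that it names exactly one resource.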
Design your functions with idempotency in mind
An idempotent function can be repeated any number of times with the same inputs, and the result will be the same as if it had been done only once. Idempotency is valuable when working with AWS CloudFormation to ensure that retries, updates, and rollbacks don't create duplicate resources or introduce errors.
For example, assume AWS CloudFormation invokes your function to create a resource, but doesn't receive a response that the resource was created successfully. AWS CloudFormation might invoke the function again and create a second resource. The first resource may become orphaned.
Implement your handlers to correctly handle rollbacks
If a stack operation fails, AWS CloudFormation attempts to roll back and revert all resources to their prior state. This results in different behaviors depending on whether the update caused a resource replacement.
To help ensure that rollbacks are completed smoothly, consider the following: