AWS Compute Blog

Using the circuit-breaker pattern with AWS Lambda extensions and Amazon DynamoDB

This post is written by Alan Oberto Jimenez, Senior Cloud Application Architect, and Tobias Drees, Cloud Application Architect.

Modern software systems frequently rely on remote calls to other systems across networks. When failures occur, they can cascade across multiple services causing service disruptions. One technique for mitigating this risk is the circuit breaker pattern, which can detect and isolate failures in a distributed system. The circuit breaker pattern can help prevent cascading failures and improve overall system stability.

The pattern isolates the failing service and thus prevents cascading failures. It improves the overall responsiveness by preventing long waiting times for timeout periods. Furthermore, it also increases the fault tolerance of the system since it lets the system interact with the affected service again once it is available again.

This blog post presents an example application, showing how AWS Lambda extensions integrate with Amazon DynamoDB to implement the circuit breaker pattern.

Using Lambda extensions to implement the circuit breaker pattern

AWS Lambda extensions provide a way to integrate monitoring, observability, security, and governance tools into the Lambda execution environment without complex installation or configuration management. You can run extensions both as part of the runtime process with an internal extension or as a separate process in the execution environment with an external extension.

Lambda extensions enable the circuit breaker pattern without modifying the core function code. An external extension checks in a separate runtime whether a certain service is reachable or not. This approach decouples the business logic in the Lambda function from failure detection, allowing for the reuse of this Lambda extension across different Lambda functions. Both decoupling of code with different purposes and code reuse is in line with the best practices for building Lambda functions.

Pinging a microservice at each Lambda invocation increases network traffic and latency. Circuit breaker implementations benefit from a caching layer to store the state of the microservices. The Lambda extension fetches the status of a microservice from a database and stores the result in memory for a specified time avoiding a disk write. The Lambda function checks the extension cache before pinging the microservice reducing network traffic. Lambda extensions are an ideal tool to build a caching layer for Lambda functions since its in-memory cache makes it more secure, easier to manage, and more performant due to higher availability compared to calling a network resource instead.

Overview

Architecture Overview

  1. The main function process handles the event after every AWS Lambda invocation. Before performing any external call against the external components, it listens for HTTP POST events from the Lambda extension process to fetch the last status of the circuits.
  2. The extension process provides the circuit state to the main process via HTTP POST.
    1. The extension checks its internal cache and returns a valid value if available, otherwise reads the state of the circuits from the DynamoDB table and updates the cache.
    2. Finally, the extension process returns the state of the circuits to the main function via an API call response.
    3. Because of the Lambda extensions lifecycle, this process occurs periodically to keep the local cache updated until the execution environment is terminated.
  3. If the circuit is in the OPEN state, the main function process executes calls against the external microservices, otherwise the process returns a local response.
  4. An Amazon EventBridge event periodically invokes a Lambda responsible for updating the circuit states.
  5. This Lambda function performs the validations needed to determine the status of the different remote microservices (circuits) with an Amazon API Gateway entrypoint.
  6. The Lambda function writes the result of the verification process to the DynamoDB table.

Walkthrough

The following prerequisites are required to complete the walkthrough:

  • An active AWS account
  • AWS CLI 2.15.17 or later
  • AWS SAM CLI 1.116.0 or later
  • Git 2.39.3 or later
  • Python 3.12

Initial setup

  1. Clone the code from GitHub onto a local machine:
    git clone https://github.com/aws-samples/implementing-the-circuit-breaker-pattern-with-lambda-extensions-and-dynamodb.git
  2. To install the packages, utilize a virtual environment:
    python -m venv circuit_breaker_venv && source circuit_breaker_venv/bin/activate
  3. To prepare the services for deployment, execute the following AWS Serverless Application Model (SAM) command:
    sam build
  4. To deploy the services, use this command specifying the AWS CLI profile (in the config file in the .aws folder) for the AWS account to deploy the services in:
    sam deploy --guided --profile <AWSProfile>

    Answer the question prompts as appropriate.

  5. You can deploy subsequent local changes in the code with:
    sam build 
    sam deploy

Testing and adjusting the solution

The Lambda function updating the state in DynamoDB runs every minute as specified by the template. After the function has run for the first time after 1 minute, the DynamoDB entry containing the status (“OPEN” or “CLOSED”) is ready. Since the mock API is part of the stack, the status is “OPEN”.

You can invoke the My Microservice Lambda function manually to see:

Response

The Lambda function updating the state in DynamoDB is invoked with an EventBridge rule that specifies the URL and the ID of the service to be monitored. By creating a new EventBridge rule with the correct URL and a new ID, you can use the AWS SAM template for monitoring multiple services.

To add a new EventBridge rule, add this to the template:

  NewEventRule:
    Type: AWS::Events::Rule
    Properties:
      Description: Event rule to trigger the Lambda function with a JSON payload
      ScheduleExpression: rate(1 minute) 
      State: ENABLED
      Targets:
        - Arn: !GetAtt UpdatingStateLambda.Arn
          Id: TargetFunction
          Input: '{ "URL": "https://aws.amazon.com/", "ID": "NewMicroservice"}'  # Add the JSON payload here

  MyPermissionForNewEventRule:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref UpdatingStateLambda
      Action: lambda:InvokeFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt NewEventRule.Arn    

In the Lambda function that contains the business logic, add the following environment variables. However, for more complex cases with multiple microservices to be monitored, it’s recommended to use AWS Config. Using AWS Config, configurations for Lambda functions can be stored to enable more granular control than with environment variables.

Environment:
        Variables:
          service_name: "NewMicroservice"

You can adjust the logic of this Lambda function by changing the code in my-microservice/lambda-handler.py or directly in the Lambda section of the AWS Management Console.

If you end up using your own Lambda function to use the circuit breaker Lambda extension, include the circuit breaker extension as a layer:

BusinessLogicMicroservice:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: business-logic-microservice/
      Handler: lambda_function.lambda_handler
      MemorySize: 128
      Policies:
      - DynamoDBCrudPolicy:
          TableName: !Ref CircuitBreakerStateTable
      Timeout: 100
      Runtime: python3.8
      Layers:
      - !Ref CircuitBreakerExtensionLayer

Circuit breaker in closed state

So far, the sample application only features an open circuit breaker state signaling a functioning microservice. This section simulates an unresponsive microservice to test the behavior of the system with a closed-circuit breaker state.

  1. Edit the environment variables of the MyMicroservice Lambda function in line 47 of the template.yaml file and the URL of the input to the Lambda updating the state in the event rule in line 107 to a domain that times out such as ”https://aws.amazon.com:81/“.
    API_URL: "https://aws.amazon.com:81/"
    Input: '{ "URL": "https://aws.amazon.com:81/", "ID": "MyMicroservice"}'
    
  2. Deploy these changes:
    sam build
    sam deploy

The event rule invokes the Lambda function, updating the state every minute. To see the output of this Lambda function, invoke it manually:

Execution result

This Lambda function changes the DynamoDB entry for this URL to:

DynamoDB entry

The MyMicroservice Lambda function receives the DynamoDB entries for the status over HTTP from the Circuit Breaker Lambda extension and proceeds with the logic following a closed state. The output of invoking the Lambda manually is:

Manual output

This shows the circuit breaker pattern working as intended. In the Lambda updating state, the time it takes for the Lambda function to throw a timeout exception is defined as 4 seconds and can be adjusted to the use case.

requests.get(API_URL, headers=headers, timeout=4)

Clean-up

To delete all resources from this stack, run:

sam delete --stack-name new-circuit-breaker-sam-stack

Security

The provided AWS SAM template does not provide an Amazon Virtual Private Cloud (VPC) in which to host the resources. Integrate the resources into an appropriate networking configuration if you are using it in production applications.

The solution has auditability characteristics, as calls to the circuit breaker and to the microservices are logged to the Amazon CloudWatch log group. The audit log is encrypted using AWS Key Management Service.

To monitor the security of your account with the solution, use Amazon GuardDuty, AWS CloudTrail, AWS Config, and AWS WAF for API Gateway.

Conclusion

The circuit breaker pattern is a powerful tool for helping to ensure the resiliency and stability of serverless applications. Lambda extensions are a good fit for its implementation, as demonstrated in this example. With the provided Lambda extension and code, you can incorporate the circuit breaker pattern into your applications and customize it to suit your specific requirements, helping to ensure a robust and reliable system.

For more serverless learning resources, visit Serverless Land.