Developing portable AWS Lambda functions

This blog post is written by Uri Segev, Principal Serverless Specialist Solutions Architect

When developing new applications or modernizing existing ones, you might face a dilemma: which compute technology to use? A serverless compute service such as AWS Lambda or maybe containers? Often, serverless can be the better approach thanks to automatic scaling, built-in high availability, and a pay-for-use billing model. However, you may hesitate to choose serverless for reasons such as:

Perceived higher cost or difficulty in estimating cost
It is a paradigm shift, which requires learning to bridge the knowledge gap
Misconceptions about Lambda capabilities and use cases
Concern that using Lambda will result in lock-in
Existing investments in non-serverless platforms and tooling

This blog post suggests best practices for developing portable Lambda functions that allow you to easily port your code to containers if you later choose to. By doing so, you can avoid lock-in and try out the serverless approach in a risk-free way.

Each section of this blog post describes what you need to consider when writing portable code and the steps needed to migrate this code from Lambda to containers, if you later choose to do so.

Best practices for portable Lambda functions

Separate business logic and Lambda handler

Lambda functions are event-driven in nature. When a specific event happens, it invokes the Lambda function by calling its handler method. The handler method receives an event object which contains information regarding the reason for the function invocation. Once the function execution completes, it returns from the handler method. Whatever is returned from the handler is the function’s return value.

To write portable code, we recommend using the handler method only as an interface between the Lambda runtime (event object) and the business logic. Using Hexagonal architecture terminology, the handler should be a driving adapter making calls into the port, which is the interface exposed by the business logic The handler should extract all required information from the event object and then call a separate method that implements the business logic.

When that method returns, the handler constructs the result in the format expected by the function invoker and returns it. We also recommend splitting the handler code and the business logic code into separate files. Should you choose to migrate to containers later, you simply migrate your business logic code files with no additional changes.

The following pseudocode shows a Lambda handler that extracts information from the event object and calls the business logic. Once the business logic is done, the handler places the response in the function’s return value:

import business_logic

# The Lambda handler extracts needed information from the event
# object and invokes the business logic
handler(event, context) {
  # Extract needed information from event object payload = event[‘payload’]

  # Invoke business logic
  result = do_some_logic(payload)
  
  # Construct result for API Gateway
  return {
    statusCode: 200,
	body: result
  }
}

The following pseudocode shows the business logic. It’s located in a separate file and is unaware that it is being invoked from a Lambda function. It is pure logic.

# This is the business logic. It knows nothing about who invokes it.
do_some_logic(data) {
result = "This is my result."
  return result
}

This approach also makes it easier to run unit tests on the business logic without the need to construct event objects and to invoke the Lambda handler.

If you migrate to containers later, you include the business logic files in your container with new interface code as described in the following section.

Event source integration

One benefit of Lambda functions is the event source integration. For instance, if you integrate Lambda with Amazon Simple Queue Service (Amazon SQS), the Lambda service will take care of polling the queue, invoking the Lambda function and deleting the messages from the queue when done. By using this integration, you need to write less boilerplate code. You can focus only on implementing business logic and not the integration with the event source.

The following pseudocode shows how the Lambda handler looks like for an SQS event source:

import business_logic

handler(event, context) {
  entries = []
  # Iterate over all the messages in the event object
  for message in event[‘Records’] {
    # Call the business logic to process a single message
    success = handle_message(message)

    # Start building the response
    if Not success {
      entries.append({
      'itemIdentifier': message['messageId']
      })
    }
  }

  # Notify Lambda about failed items.
  if (let(entries) > 0) {
    return {
      'batchItemFailures': entries
    }
  }
}

As you can see in the previous code, the Lambda function has almost no knowledge that it is being invoked from SQS. There are no SQS API calls. It only knows the structure of the event object, which is specific to SQS.

When moving to a container, the integration responsibility moves from the Lambda service to you, the developer. There are different event sources in AWS, and each of them will require a different approach for consuming events and invoking business logic. For example, if the event source is Amazon API Gateway, your application will need to create an HTTP server that listens on an HTTP port and waits for incoming requests in order to invoke the business logic.

If the event source is Amazon Kinesis Data Streams, your application will need to run a poller that reads records from the shards, keep track of processed records, handle the case of a change in the number of shards in the stream, retry on errors, and more. Regardless of the event source, if you follow the previous recommendations, you will not need to change anything in the business logic code.

The following pseudocode shows how the integration with SQS will look like in a container. Note that you will lose some features such as batching, filtering, and, of course, automatic scaling.

import aws_sdk
import business_logic

QUEUE_URL = os.environ['QUEUE_URL']
BATCH_SIZE = os.environ.get('BATCH_SIZE', 1)
sqs_client = aws_sdk.client('sqs')

main() {
  # Infinite loop to poll for messages from SQS
  while True {

    # Receive a batch of messages from the queue
    response = sqs_client.receive_message(
      QueueUrl = QUEUE_URL,
      MaxNumberOfMessages = BATCH_SIZE,
      WaitTimeSeconds = 20 )

    # Loop over the messages in the batch
    entries = []
    i = 1
    for message in response.get('Messages',[]) {
      # Process a single message
      success = handle_message(message)

      # Append the message handle to an array that is later
      # used to delete processed messages
      if success {
        entries.append(
          {
            'Id': f'index{i}',
            'ReceiptHandle': message['receiptHandle']
          }
        )
        i += 1
      }
    }

    # Delete all the processed messages
    if (len(entries) > 0) {
      sqs_client.delete_message_batch(
        QueueUrl = QUEUE_URL,
        Entries = entries
      )
    }
  }
}

Another point to consider here is Lambda destinations. If your function is invoked asynchronously and you configured a destination for your function, you will need to include that in the interface code. It will need to catch any business logic error and, based on that, invoke the right destination.

Package functions as containers

Lambda supports packaging functions as .zip files and container images. To develop portable code, we recommend using container images as your default packaging method. Even though you package the function as a container image, you can’t run it on other container platforms such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (EKS). However, by packaging it this way, the migration to containers later will be easier as you are already using the same tools and you already created a Dockerfile that will require minimal changes.

An example Dockerfile for Lambda looks like this:

FROM public.ecr.aws/lambda/python:3.9
COPY *.py requirements.txt ./
RUN python3.9 -m pip install -r requirements.txt -t .
CMD ["app.lambda_handler"]

If you move to containers later, you will need to change the Dockerfile to use a different base image and adapt the CMD line that defines how to start the application. This is in addition to the code changes described in the previous section.

The corresponding Dockerfile for the container will look like this:

FROM python:3.9
COPY *.py requirements.txt ./
RUN python3.9 -m pip install -r requirements.txt -t .
CMD ["python", "./app.py"]

The deployment pipeline also needs to change as we deploy to a different target. However, building the artifacts remains the same.

Single invocation per instance

Lambda functions run in their own isolated runtime environment. Each environment handles a single request at a time which works great for Lambda. However, if you migrate your application to containers, you will likely invoke the business logic from multiple threads in a single process at the same time.

This section discusses aspects of moving from a single invocation to multiple concurrent invocations within the same process.

Static variables

Static variables are those that are instantiated once and then reused across multiple invocations. Examples of such variables are database connections or configuration information.

For function optimization, and specifically for reducing cold starts and the duration of warm function invocations, we recommend initializing all static variables outside the function handler and storing them in global variables so that further invocations will reuse them.

We recommend using an initialization function that you write as part of the business logic module and that you invoke from outside the handler. This function saves information in global variables that the business logic code reuses across invocations.

The following pseudocode shows the Lambda function:

import business_logic

# Call the initialization code
initialize()

handler(event, context) {
  ...
  # Call the business logic
  ...
}

And the business logic code will look like this:

# Global variables used to store static data
var config

initialize() {
  config = read_Config()
}

do_some_logic(data) {
  # Do something with config object
  ...
}

The same also applies to containers. You will usually initialize static variables when the process starts and not for every single request. When moving to containers, all you need to do is call the initialization function before starting the main application loop.

import business_logic

# Call the initialization code
initialize()

main() {
  while True {
    ...
    # Call the business logic
    ...
  }
}

As you can see, there are no changes in the business logic code.

Database connections

As Lambda functions share nothing between the runtime environments, unlike containers they can’t rely on connection pools when connecting to a relational database. For this reason, we created Amazon RDS Proxy, which acts as a centralized connection pool used by many functions.

To write portable Lambda functions, we recommend using a connection pool object with a single connection. Your business logic code will always ask for a connection from the pool when making a database request. You will still need to use RDS Proxy.

If you later move to containers, you can increase the number of connections in the pool to a larger number with no further changes and the application will scale without overwhelming the database.

File system

Lambda functions come with a writable /tmp folder in the size of 512 MB to 10 GB. As each function instance runs in an isolated runtime environment, developers usually use fixed file names for files stored in that folder. If you run the same business logic code in a container in multiple threads, the different threads will overwrite the files created by others.

We recommended using unique file names in each invocation. Append a UUID or another random number to the file name. Delete the files once you are done with them to avoid running out of space.

If you move your code to containers later, there is nothing to do.

Portable web applications

If you develop a web application, there is another way to achieve portability. You can use the AWS Lambda Web Adapter project to host a web app inside a Lambda function. This way you can develop a web application with familiar frameworks (e.g., Express.js, Next.js, Flask, Spring Boot, Laravel, or anything that uses HTTP 1.1/1.0), and run it on Lambda. If you package your web application as a container, the same Docker image can run on Lambda (using the web adapter) and containers.

Porting from containers to Lambda

This blog post demonstrates how to develop portable Lambda functions you can easily port to containers. Taking these recommendations into consideration can also help develop portable code in general, which allows you to port containers to Lambda functions.

Some things to consider:

Separate the business logic from the interface code in the container. The interface code should interact with the event sources and invoke the business logic.
As Lambda functions only have a /tmp writable folder, replicate this in your containers (even though you could write to different locations).

Conclusion

This blog post suggests best practices for developing Lambda functions that allow you to gain the benefits of a serverless approach without risking lock-in.

By following these best practices for separating business logic from Lambda handlers, packaging functions as containers, handling Lambda’s single invocation per instance, and more, you can develop portable Lambda functions. As a consequence, you will be able to port your code from Lambda to containers with minimal effort if you choose to move to containers later.

Refer to these best practices and code samples to ease the adoption of a serverless approach when developing your next application.

For more serverless learning resources, visit Serverless Land.

AWS Compute Blog