AWS Compute Blog

Triggering AWS Lambda functions from a cross-account Amazon Managed Streaming for Apache Kafka cluster

This post is written by Subham Rakshit, Senior Specialist Solutions Architect, and Ismail Makhlouf, Senior Specialist Solutions Architect.

Many organizations use a multi-account strategy for stream processing applications. This involves decomposing the overall architecture into a single producer account and many consumer accounts. On AWS, the producer account can host an Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster, and the consumer accounts can run AWS Lambda functions that consume the events. This blog post explains how you can trigger Lambda functions from a cross-account Amazon MSK cluster.

The Lambda event source mapping (ESM) for Amazon MSK continuously polls for new events from the Amazon MSK cluster, aggregates them into batches, and then triggers the target Lambda function. The ESM for Amazon MSK functions as a serverless set of Kafka consumers that ensures that each event is processed at least once. Within each Kafka partition, events are processed in the order in which they are received. The ESM can also filter events based on configured logic before invoking the function.
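As an illustration of what the function receives from the ESM, the following is a minimal sketch of a Python handler that walks one batch. It assumes the Kafka event shape that Lambda delivers (records grouped under "<topic>-<partition>" keys, with base64-encoded values). The helper name iter_kafka_records and the print-based processing step are hypothetical and are not taken from the example repository.

```python
import base64
from typing import Any, Dict, Iterator, Tuple


def iter_kafka_records(event: Dict[str, Any]) -> Iterator[Tuple[str, int, int, bytes]]:
    """Yield (topic, partition, offset, value) for every record in one batch."""
    # Records are grouped under keys of the form "<topic>-<partition>".
    for records in event.get("records", {}).values():
        for record in records:
            # Kafka values arrive base64-encoded; a missing value becomes b"".
            raw = record.get("value")
            value = base64.b64decode(raw) if raw else b""
            yield record["topic"], record["partition"], record["offset"], value


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    processed = 0
    for topic, partition, offset, value in iter_kafka_records(event):
        # Placeholder processing step (assumption): a real function would
        # persist or transform the record here.
        print(f"{topic}[{partition}]@{offset}: {value.decode('utf-8', errors='replace')}")
        processed += 1
    return {"processed": processed}
```

Because delivery is at least once, the same record can appear in more than one invocation, so the processing step should be idempotent, for example by keying writes on topic, partition, and offset.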

Overview

Amazon MSK supports two different deployment types: provisioned and serverless. Triggering a Lambda function from a cross-account Amazon MSK cluster is only supported with a provisioned cluster deployed within the same Region. To facilitate this functionality, Amazon MSK uses multi-VPC private connectivity, powered by AWS PrivateLink, which simplifies connecting Kafka consumers hosted in different AWS accounts to an Amazon MSK cluster.

The following diagram illustrates the architecture of this example:

Architecture Diagram

The architecture is divided into two parts: the producer and the consumer.

In the producer account, you have the Amazon MSK cluster with multi-VPC connectivity enabled. Multi-VPC connectivity is only available for authenticated Amazon MSK clusters. Cluster policies are required to grant permissions to other AWS accounts, allowing them to establish private connectivity to the Amazon MSK cluster. You can delegate permissions to relevant roles or users. When combined with AWS Identity and Access Management (IAM) client authentication, cluster policies offer fine-grained control over Kafka data plane permissions for connecting applications.
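To make the two roles of the cluster policy concrete, here is a rough sketch that builds such a policy as a Python dictionary. The action lists reflect the author's understanding of the Amazon MSK and Lambda documentation, and the function name build_cluster_policy and the wildcard topic and group resources are assumptions. The policy deployed by producer-msk-cluster-policy.yaml may differ.

```python
import json
from typing import Any, Dict


def build_cluster_policy(cluster_arn: str, consumer_account_id: str,
                         lambda_role_arn: str) -> Dict[str, Any]:
    """Return a cluster policy document for one cross-account consumer."""
    # Topic and group ARNs share the cluster ARN's suffix (assumed wildcard scope).
    topic_arn = cluster_arn.replace(":cluster/", ":topic/", 1) + "/*"
    group_arn = cluster_arn.replace(":cluster/", ":group/", 1) + "/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets the consumer account create the managed VPC connection.
                "Effect": "Allow",
                "Principal": {"AWS": consumer_account_id},
                "Action": [
                    "kafka:CreateVpcConnection",
                    "kafka:GetBootstrapBrokers",
                    "kafka:DescribeCluster",
                    "kafka:DescribeClusterV2",
                ],
                "Resource": cluster_arn,
            },
            {
                # Kafka data plane permissions for the Lambda role (IAM auth).
                "Effect": "Allow",
                "Principal": {"AWS": lambda_role_arn},
                "Action": [
                    "kafka-cluster:Connect",
                    "kafka-cluster:DescribeTopic",
                    "kafka-cluster:ReadData",
                    "kafka-cluster:DescribeGroup",
                    "kafka-cluster:AlterGroup",
                    "kafka-cluster:DescribeClusterDynamicConfiguration",
                ],
                "Resource": [cluster_arn, topic_arn, group_arn],
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder ARNs and account IDs, for illustration only.
    policy = build_cluster_policy(
        "arn:aws:kafka:us-east-1:111111111111:cluster/producer-cluster/example-uuid",
        "222222222222",
        "arn:aws:iam::222222222222:role/example-lambda-role",
    )
    print(json.dumps(policy, indent=2))
```

The first statement covers connectivity and the second covers read access to topics and consumer groups. In practice you would narrow the wildcard resources to the topics and groups that the consumer actually needs.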

In the consumer account, you have the Lambda ESM for Amazon MSK and the managed VPC connection deployed within the same VPC. The managed VPC connection allows private connectivity from the consumer application VPC to the Amazon MSK cluster. In this example, the Lambda ESM for Amazon MSK connects to the cross-account Amazon MSK cluster using IAM authentication. The ESM also supports clusters that use SASL/SCRAM or mutual TLS (mTLS) authentication. The ESM receives events from the Kafka topic and invokes the Lambda function to process them.

Deploying the example application

To set up the Lambda function trigger from a cross-account Amazon MSK cluster as the event source, follow these steps. The AWS CloudFormation templates for deploying the example are accessible in the GitHub repository.

As part of this example, you publish sample data using the Kafka console producer. Lambda processes these events and writes them to Amazon S3.

Pre-requisites

For this example, you need two AWS accounts. This post uses the following naming conventions:

  • Producer (for example, account No: 1111 1111 1111): Account that hosts the Amazon MSK cluster and Kafka client instance.
  • Consumer (for example, account No: 2222 2222 2222): Account that hosts the Lambda function and consumes events from Amazon MSK.

To get started:

  1. Clone the repository locally:
    git clone https://github.com/aws-samples/lambda-cross-account-msk.git
  2. Set up the producer account: You must configure the VPC networking and deploy the Amazon MSK cluster and a Kafka client instance to publish data. To do this, deploy the CloudFormation template producer-account.yaml from the AWS console and take note of the MSKClusterARN from the CloudFormation outputs tab.
  3. Set up the consumer account: You need the Lambda function, the IAM role used by the Lambda function, and the S3 bucket that receives the data. To do this, deploy the CloudFormation template consumer-account.yaml from the AWS console with the input parameter MSKAccountId, which is the producer AWS account ID (for example, account Id: 1111 1111 1111). Note the LambdaRoleArn from the CloudFormation outputs tab.

Setting up multi-VPC connectivity in the Amazon MSK cluster

Once the accounts are set up, you must enable connectivity between them. By enabling multi-VPC private connectivity in the Amazon MSK cluster, you set up the network connection to allow the cross-account consumers to connect to the cluster.

  1. In the producer account, navigate to the Amazon MSK console.
  2. Choose producer-cluster, and go to the Properties tab.
  3. Scroll to Networking settings, choose Edit, and select Turn on multi-VPC connectivity. This takes some time. When it completes, the setting appears as follows.
    Networking settings
  4. Add the necessary cluster policy to allow cross-account consumers to connect to Amazon MSK. In the producer account, deploy the CloudFormation template producer-msk-cluster-policy.yaml from the AWS console with the following input parameters:
    • MSKClusterArn – Amazon Resource Name (ARN) of the Amazon MSK cluster in the producer account. Find this information in the CloudFormation output of producer-account.yaml.
    • LambdaRoleArn – ARN of the IAM role attached to the Lambda function in the consumer account. Find this information in the CloudFormation output of consumer-account.yaml.
    • LambdaAccountId – Consumer AWS account ID (for example, account Id: 2222 2222 2222).

Creating a Kafka topic in Amazon MSK and publishing events

In the producer account, navigate to the Amazon MSK console. Choose the Amazon MSK cluster named producer-cluster. Choose View client information to show the bootstrap server.

Client information

The CloudFormation template also deploys a Kafka client instance to create topics and publish events.

To access the client, go to the Amazon EC2 console and choose the instance producer-KafkaClientInstance1. Connect to the EC2 instance with Session Manager and run:

sudo su - ec2-user
#Set MSK Broker IAM endpoint
export BS=<<Provide IAM bootstrap address here>>

You must use the single-VPC private endpoint for the Amazon MSK cluster and not the multi-VPC private endpoint, because the Kafka console producer publishes events from within the producer account.

Run these scripts to create the customer topic and publish sample events to it:

./kafka_create_topic.sh
./kafka_produce_events.sh

Creating a managed VPC connection in the consumer account

To establish a connection to the Amazon MSK cluster in the producer account, you must create a managed VPC connection in the consumer account. Lambda communicates with cross-account Amazon MSK through this managed VPC connection.

For detailed setup steps, read the Amazon MSK managed VPC connection documentation.
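If you prefer to script this step, the following sketch assembles the request for the boto3 Kafka client's create_vpc_connection call. The parameter names follow the author's reading of the boto3 documentation, and the "SASL_IAM" authentication value and the helper name vpc_connection_request are assumptions to verify against that documentation before use. The VPC, subnet, and security group IDs are placeholders for resources in the consumer account.

```python
from typing import Any, Dict, Sequence


def vpc_connection_request(cluster_arn: str, vpc_id: str,
                           subnet_ids: Sequence[str],
                           security_group_ids: Sequence[str],
                           authentication: str = "SASL_IAM") -> Dict[str, Any]:
    """Build keyword arguments for kafka.create_vpc_connection (assumed shape)."""
    if not subnet_ids:
        raise ValueError("at least one client subnet is required")
    return {
        "TargetClusterArn": cluster_arn,      # MSK cluster in the producer account
        "Authentication": authentication,     # must match the cluster's auth scheme
        "VpcId": vpc_id,                      # consumer VPC
        "ClientSubnets": list(subnet_ids),
        "SecurityGroups": list(security_group_ids),
    }


def create_vpc_connection(request: Dict[str, Any]) -> Dict[str, Any]:
    # Imported lazily so the request builder can be used without boto3 installed.
    import boto3

    return boto3.client("kafka").create_vpc_connection(**request)
```

Run create_vpc_connection with credentials for the consumer account. The call succeeds only after the cluster policy in the producer account allows that account to create VPC connections.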

Configuring the Lambda ESM for Amazon MSK

The final step is to set up the Lambda ESM for Amazon MSK. Setting up the ESM enables you to connect to the Amazon MSK cluster in the producer account through the managed VPC connection. This allows you to trigger the Lambda function to process the data published to the Kafka topic:

  1. In the consumer account, go to the Lambda console.
  2. Open the Lambda function msk-lambda-cross-account-iam.
  3. Go to the Configuration tab, select Triggers, and choose Add Trigger.
  4. For Trigger configuration, select Amazon MSK.

Lambda trigger

To configure this trigger:

  1. Select the shared Amazon MSK cluster. This automatically defaults to the IAM authentication that is used to connect to the cluster.
    MSK Lambda trigger
  2. By default, the Active trigger check box is enabled. This ensures that the trigger is in the active state after creation. For the other values:
    1. Keep the Batch size at its default value of 100.
    2. Change the Starting Position to Trim horizon.
    3. Set the Topic name as customer.
    4. Set the Consumer Group ID as msk-lambda-iam.

Trigger configuration

Scroll to the bottom and choose Add. This starts creating the Amazon MSK trigger, which takes several minutes. After creation, the state of the trigger shows as Enabled.
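The same trigger settings can be expressed programmatically. The sketch below builds the arguments for the boto3 Lambda client's create_event_source_mapping call using the values from the console steps. It assumes that, for a cross-account cluster, the event source ARN is the ARN of the managed VPC connection rather than the cluster ARN; treat that, and the helper name event_source_mapping_request, as assumptions to confirm in the Lambda documentation.

```python
from typing import Any, Dict


def event_source_mapping_request(
    vpc_connection_arn: str,
    function_name: str = "msk-lambda-cross-account-iam",
    topic: str = "customer",
    consumer_group_id: str = "msk-lambda-iam",
    batch_size: int = 100,
) -> Dict[str, Any]:
    """Build keyword arguments for lambda.create_event_source_mapping."""
    return {
        # Assumption: cross-account mappings reference the managed VPC connection.
        "EventSourceArn": vpc_connection_arn,
        "FunctionName": function_name,
        "Topics": [topic],
        "BatchSize": batch_size,
        "StartingPosition": "TRIM_HORIZON",  # "Trim horizon" in the console
        "Enabled": True,                     # the "Active trigger" check box
        "AmazonManagedKafkaEventSourceConfig": {
            "ConsumerGroupId": consumer_group_id,
        },
    }


def create_mapping(request: Dict[str, Any]) -> Dict[str, Any]:
    # Imported lazily so the request builder can be used without boto3 installed.
    import boto3

    return boto3.client("lambda").create_event_source_mapping(**request)
```

Starting from the trim horizon means the function also receives events that were published to the topic before the trigger existed, which is why the sample events created earlier are processed.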

Verifying the output on the consumer side

The Lambda function receives the events and writes them to an S3 bucket.

To validate that the function is working, go to the consumer account and navigate to the S3 console. Search for the cross-account-lambda-consumer-data-<<REGION>>-<<AWS Account Id>> bucket. In the bucket, you see the customer-data-<<datetime>>.csv files.
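For a scripted check, the naming patterns above can be captured in two small helpers. This is only a sketch: the function names are hypothetical, and the key filter assumes that output objects sit at the top level of the bucket with the customer-data- prefix and .csv suffix described in this post.

```python
from typing import Iterable, List


def consumer_bucket_name(region: str, account_id: str) -> str:
    """Return the bucket name following the pattern used in this example."""
    return f"cross-account-lambda-consumer-data-{region}-{account_id}"


def customer_data_keys(keys: Iterable[str]) -> List[str]:
    """Keep only object keys that look like the function's CSV output."""
    return sorted(
        key for key in keys
        if key.startswith("customer-data-") and key.endswith(".csv")
    )
```

The keys could come from the Key field of each entry in the Contents list returned by the boto3 S3 client's list_objects_v2 call for that bucket. An empty result a few minutes after the trigger is enabled suggests that the function has not processed any events yet.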

S3 bucket objects

Cleaning up

In the consumer account, you must manually empty and delete the S3 bucket, and delete the managed VPC connection and the Lambda ESM for Amazon MSK. Next, delete the CloudFormation stacks from the AWS console in both the producer and consumer accounts to remove all other resources created as part of the example.

Conclusion

With Lambda and Amazon MSK, you can now build a decentralized application distributed across multiple AWS accounts. This post shows how you can set up Amazon MSK as an event source for cross-account Lambda functions and also walks you through the configuration required in both producer and consumer accounts.

For further reading on AWS Lambda with Amazon MSK as an event source, visit the documentation.

For more serverless learning resources, visit Serverless Land.