How do I create an AWS Lambda function that runs queries in Amazon Redshift?

Last updated: 2022-03-11

I'm trying to create an AWS Lambda function that runs queries on Amazon Redshift. How can I do this?

Resolution

Prerequisites

Before you create a Lambda function, you must set up the following VPC endpoints:

1.    Create a VPC with a private subnet.

2.    Create a subnet group. Add the VPC you just created.

3.    Create a private Amazon Redshift cluster selecting the VPC and subnet group that you just created.

4.    Create a new secret for Amazon Redshift with AWS Secrets Manager. Name your secret "redshift".

Create your Lambda function

To create a Lambda function that queries your Amazon Redshift cluster, perform the following steps:

1.    Open the Lambda console.

2.    Choose Create function.

3.    Choose the Author from Scratch option.

4.    Update the following fields:
Function name: Enter a custom name.
Runtime: Enter your code environment. (The examples from this article are compatible with "Python 3.9".)
Architecture: Enter your system architecture. (The examples from this article are compatible with "x86_64".)
Permissions: Choose Create a new role with basic Lambda permissions.

5.    Choose Create function.

Set the correct permissions for your Lambda function

1.    In the Lambda console, choose Configuration.

2.    Choose Permissions.

3.    Choose the role created for your Lambda function.

4.    Choose Add Permissions.

5.    Choose Attach policies.

6.    Add the "AmazonRedshiftDataFullAccess" and "SecretsManagerReadWrite" policies to your Lambda execution role.

Add Python code to your Lambda function

1.    In the Lambda console, choose Code.

2.    Paste the following code into the Code box:

import json
import urllib.parse
import boto3
import botocore.session as bc

print('Loading function')

s3 = boto3.client('s3')


def lambda_handler(event, context):
    print("Entered lambda_handler")

    secret_name='redshift' ## HERE add the secret name created.
    session = boto3.session.Session()
    region = session.region_name
    
    client = session.client(
            service_name='secretsmanager',
            region_name=region
        )
    
    get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    secret_arn=get_secret_value_response['ARN']
    
    secret = get_secret_value_response['SecretString']
    
    secret_json = json.loads(secret)
    
    cluster_id=secret_json['dbClusterIdentifier']
    
    
    bc_session = bc.get_session()
    
    session = boto3.Session(
            botocore_session=bc_session,
            region_name=region,
        )
    
    # Setup the client
    client_redshift = session.client("redshift-data")
    print("Data API client successfully loaded")
    
    query_str = "create table public.lambda_func (id int);"
                      
    print(query_str)
    
    res = client_redshift.execute_statement(Database= 'dev', SecretArn= secret_arn, Sql= query_str, ClusterIdentifier= cluster_id)
    id=res["Id"]

Note: Make sure to replace secret_name with your secret name in case that you created a secret with name different than "redshift".

In this example, Lambda connects to the Amazon Redshift database and creates a lambda_func table in the public schema.


Did this article help?


Do you need billing or technical support?