AWS Database Blog
Schedule jobs in Amazon RDS for PostgreSQL using AWS CodeBuild and Amazon EventBridge
When you want to migrate on-premises database workloads with jobs to AWS, you need to select the right AWS services to schedule the jobs. Database administrators traditionally schedule scripts to run against databases using the system cron
on the host where the database is running. When you migrate such workloads from on premises to a managed database service like Amazon Relational Database Service (Amazon RDS), you lose the ability to log in to the host instance to schedule cron
jobs.
Newer releases of Amazon RDS for PostgreSQL starting with version 12.5 enable you to schedule PostgreSQL commands within your database. However, if you need to orchestrate the Amazon RDS jobs and integrate with native AWS services such as Amazon Simple Notification Service (Amazon SNS) to send notifications, this post provides an alternate way to schedule and run Amazon RDS jobs using AWS CodeBuild and Amazon EventBridge rules.
Why AWS CodeBuild?
An Amazon RDS job can run stored procedures and functions, extract the data, purge the data, or do any other SQL operations. You can run these jobs with different AWS services, such as AWS Lambda, AWS Batch, Amazon Elastic Container Service (Amazon ECS), and AWS Step Functions. As your data grows, jobs might take more than 15 minutes to run. CodeBuild can natively handle workloads that run for a longer time, in comparison to services like Lambda, which are designed to finish within 15 minutes.
If you want to use AWS Batch or Amazon ECS, you need to package your code as a Docker container. CodeBuild is a good option when your job takes more than 15 minutes or you can’t (or don’t want to) package your code as a Docker container, or you want to use different programming language runtimes for your jobs.
This post demonstrates how to use the combination of CodeBuild and EventBridge rules to schedule and run functions or stored procedures on an RDS for PostgreSQL database instance.
Overview of solution
The following diagram illustrates the architecture of the solution.
The workflow includes the following steps:
- EventBridge triggers a CodeBuild project environment based on your schedule.
- CodeBuild retrieves the source code from an Amazon Simple Storage Service (Amazon S3) bucket.
- CodeBuild retrieves the database credentials from AWS Secrets Manager.
- CodeBuild initiates the database connection and runs the PostgreSQL stored procedure or function.
- CodeBuild events are published to EventBridge.
- Amazon SNS notifies subscribed users of the job status.
Prerequisites
Before you begin, you need to complete the following prerequisites:
- Create or have access to an AWS account.
- Ensure git is installed on your machine.
- Set up and configure the AWS Command Line Interface (AWS CLI). For instructions, see Installing, updating, and uninstalling the AWS CLI.
- Have a SQL client to connect to the RDS database. In this post, I use Dbeaver.
- Have an email address to receive Amazon SNS notifications.
Walkthrough overview
The following steps provide a high-level overview of the walkthrough:
- Clone the project from the AWS code samples repository.
- Deploy the AWS CloudFormation template to create the required services.
- Upload the source code to the S3 bucket.
- Run database scripts and create the SQL function.
- Run the CodeBuild project manually.
- Verify if the batch job is running successfully based on the EventBridge rule.
Clone the source code
Download the files required to set up the environment. See the following code:
You use the following files:
- CreateFunct.sql – Has a code to create a sample SQL function. The CodeBuild project is configured to run this SQL function.
- Jobschedulingcft.yml – Defines all the AWS resources required for this solution.
- invokepostgresqldbpy.zip – Contains buildspec.yml and a Python script. CodeBuild installs the libraries such as boto3 and psycopg2 defined in the buildspec.yml and invokes the Python script, which has a code to connect to PostgreSQL database and run the SQL function.
Deploying the CloudFormation template
To deploy the CloudFormation template, complete the following steps:
- Update the email address parameter in the CloudFormation template Jobschedulingcft.yml. Amazon SNS uses this email address to send a notification about the job status.
- Run the CloudFormation template to provision the required services. See the following code for macOS or Linux:
You receive the following output:
The template creates the following resources:
- A CodeBuild project
- A PostgreSQL instance
- An S3 bucket
- Secrets Manager with PostgreSQL database login credentials
- An EventBridge rule to trigger the job every day at 10:00 AM (UTC)
- AWS Identity and Access Management (IAM) service roles for CodeBuild, EventBridge, and Amazon SNS
- An SNS topic to send notifications to the provided email address with the status of the job
Upload the source code to the S3 bucket
On the AWS CloudFormation console, choose the Outputs tab. Make a note of the S3 bucket name.
Upload aws-codebuild-rds-job-scheduling/src/invokepostgresqldbpy.zip
to the S3 bucket with the following CLI command:
Running database scripts
In this post, I create a job that runs a simple PostgreSQL function.
To retrieve PostgeSQL database login credentials from Secrets Manager and create the PostgreSQL function, complete the following steps:
- On the Secrets Manager console, choose the secret name (stack name-secret).
- Choose Retrieve secret value.
- Record the PostgreSQL credentials.
- Connect to the PostgreSQL database that the CloudFormation template provisioned.
- Run the following SQL script to create the PostgreSQL function.
The function concat_lower_or_upper
has two mandatory parameters, a
and b
. It also has an optional parameter, uppercase
, which defaults to false
. The a
and b
inputs are concatenated, and forced to either upper or lowercase depending on the uppercase
parameter.
Run the CodeBuild Batch job manually
For on-demand jobs, complete the following steps to run the CodeBuild batch job manually:
- On the CodeBuild console, in the navigation pane, under Build projects, choose Build project.
- Choose the project and choose Start build.
- On the Phase details tab, verify the status.
The CodeBuild job is configured to run the following PostgreSQL function in Python script. This function concatenates two strings, ‘hello’
and ‘world’
, converts to an uppercase, and returns ‘HELLO WORLD’
:
You can verify the job’s output in Amazon CloudWatch Logs, which CodeBuild generated.
CloudWatch also sends the SNS notification to the email address configured in the CloudFormation template.
Verify the scheduled run of CodeBuild jobs
The CodeBuild job is scheduled using EventBridge cron
scheduler. The CloudFormation template created an EventBridge rule to run this job every day at 10:00 AM (UTC). You can change the schedule by editing the rule.
- On the EventBridge console, under Events, choose Rules.
- Select your rule and choose Edit.
- Based on your requirement, change the schedule by updating the CRON expression.
- Verify the CodeBuild ARN and role exist.
- Choose Update.
Clean up
To avoid incurring future changes, clean up the resources you created.
- Delete the S3 objects:
- Delete the CloudFormation stack:
Alternatively, delete the stack on the AWS CloudFormation console. To troubleshoot if stack deletion fails, see Delete stack fails.
Conclusion
This post demonstrated how to use CodeBuild and EventBridge to schedule and run jobs on a PostgreSQL database. You can run jobs on any RDS database with the same solution by including the compatible Python adapter in your Python script.
Additionally, this solution helps you to orchestrate the job workflows by integrating CodeBuild with Step Functions for more advanced dependency management.
If you have any questions or suggestions about this post, feel free to leave a comment.
About the author
Suresh Moolya is a Cloud Application Architect with Amazon Web Services. He works with customers to architect, design, and automate business softwares at scale on AWS cloud.