AWS Database Blog

Long-term storage and analysis of Amazon RDS events with Amazon S3 and Amazon Athena

Database administrators and operations teams often need to analyze historical events for compliance, troubleshooting, and operational insights. Amazon Relational Database Service (Amazon RDS), Amazon Aurora MySQL-Compatible Edition, and Amazon Aurora PostgreSQL-Compatible Edition generate important events that provide insights into database operations, including backup status, configuration changes, security updates, and maintenance notifications. However, these events are only retained for 14 days. In this post, we show you how to implement an automated solution for archiving Amazon RDS events to Amazon Simple Storage Service (Amazon S3). We also discuss how to analyze the events with Amazon Athena, which enables proactive database management, helps maintain security and compliance, and provides valuable insights for capacity planning and troubleshooting.

Solution overview

The solution uses AWS Lambda and Amazon EventBridge to automatically capture and archive Amazon RDS events to Amazon S3. It uses a tag-based approach to identify which RDS instances and Aurora clusters to monitor, making it flexible and scalable. It also provides an optional data analysis layer using Amazon Athena.

By archiving these events to Amazon S3, you can unlock numerous benefits:

  • Enhanced compliance capabilities
  • Comprehensive historical analysis
  • Improved troubleshooting
  • Robust audit trails

The preceding diagram illustrates the solution, which includes the following steps:

  1. EventBridge triggers a Lambda function on a schedule.
  2. The Lambda function uses the AWS SDK to query Amazon RDS for instances and clusters tagged with specific key-value pairs.
  3. For each tagged instance and cluster, the function calls the Amazon RDS describe_events API to retrieve events from the previous 24 hours.
  4. The function processes the events, formatting them as JSON documents with standardized fields including timestamp, source identifier, event type, and detailed message.
  5. It then writes the output to Amazon S3 following a partitioned structure (YYYY/MM/DD) for optimal query performance and cost efficiency. This partitioning strategy allows for efficient data lifecycle management and reduces the amount of data scanned during queries. For more information about partitioning best practices, see Partition your data using Amazon Athena.
  6. To enable SQL-based analysis, the solution creates an external table in Athena that maps to the JSON structure in Amazon S3. You can then write standard SQL queries to analyze historical events, identify patterns, and generate reports.
Prerequisites

To implement the solution described in this post, you must have the following prerequisites:

  • An AWS account with one or more RDS instances or Aurora clusters whose events you want to archive
  • An S3 bucket to store the exported events
  • An AWS Identity and Access Management (IAM) role for the Lambda function's execution, with permissions to describe RDS instances, clusters, and events, and to write objects to the S3 bucket

Create a Lambda function

In this section, you create the Lambda function that is triggered periodically to upload the Amazon RDS events to an S3 bucket of your choice. Open the AWS Lambda console and choose Functions in the navigation pane.

  1. Choose Create function.
  2. For Function name, enter a name (for the example in this post, use RDSEventsExportToS3).
  3. For Runtime, choose Python 3.13.
  4. For Execution role, select Use an existing role.
  5. For Existing role, select the role you created as part of the prerequisites.
  6. Choose Create function at the bottom of the page.
  7. On the Lambda function details page, go to the function code section and replace the sample code with the following:
    import json
    import boto3
    from datetime import datetime, timezone
    import os
    from botocore.exceptions import ClientError
    
    def lambda_handler(event, context):
        # Clients used to query RDS metadata/events and to write the export to S3
        rds = boto3.client('rds')
        s3 = boto3.client('s3')
        
        try:
            # Required configuration is provided through environment variables
            bucket_name = os.environ['S3_BUCKET']
            tag_key = os.environ['TAG_KEY']
            tag_value = os.environ['TAG_VALUE']
            duration = int(os.environ['DURATION'])  # time window of events to export, in minutes
        except KeyError as e:
            return {
                'statusCode': 500,
                'body': f'Missing required environment variable: {str(e)}'
            }
        
        current_time = datetime.now(timezone.utc)
    
        try:
            # Find all DB instances and Aurora clusters that carry the configured tag
            instances = get_tagged_instances(rds, tag_key, tag_value)
            clusters = get_tagged_clusters(rds, tag_key, tag_value)
            all_events = []
            
            for instance in instances:
                instance_id = instance['DBInstanceIdentifier']
                try:
                    events = get_instance_events(rds, instance_id, duration)
                    all_events.extend(events)
                except ClientError as e:
                    print(f'Error getting events for instance {instance_id}: {str(e)}')
                    continue
    
            for cluster in clusters:
                cluster_id = cluster['DBClusterIdentifier']
                try:
                    events = get_cluster_events(rds, cluster_id, duration)
                    all_events.extend(events)
                except ClientError as e:
                    print(f'Error getting events for cluster {cluster_id}: {str(e)}')
                    continue
            
            if not all_events:
                return {
                    'statusCode': 200,
                    'body': 'No events to export'
                }
            
            # Write one JSON document per line under a date-partitioned prefix (YYYY/MM/DD)
            file_name = f"rds-events/{current_time.strftime('%Y/%m/%d')}/events_{current_time.strftime('%H%M%S')}.json"
            body = '\n'.join(all_events) + '\n'
            
            try:
                s3.put_object(
                    Bucket=bucket_name,
                    Key=file_name,
                    Body=body
                )
            except ClientError as e:
                return {
                    'statusCode': 500,
                    'body': f'Error uploading to S3: {str(e)}'
                }
            
            return {
                'statusCode': 200,
                'body': f'Successfully exported {len(all_events)} events to s3://{bucket_name}/{file_name}'
            }
        except Exception as e:
            return {
                'statusCode': 500,
                'body': f'Unexpected error: {str(e)}'
            }
    
    def get_tagged_instances(rds, tag_key, tag_value):
        instances = []
        try:
            paginator = rds.get_paginator('describe_db_instances')
            for page in paginator.paginate():
                for instance in page['DBInstances']:
                    if has_tag(instance, tag_key, tag_value):
                        instances.append(instance)
            return instances
        except ClientError as e:
            raise Exception(f'Error describing DB instances: {str(e)}')
    
    def get_tagged_clusters(rds, tag_key, tag_value):
        clusters = []
        try:
            paginator = rds.get_paginator('describe_db_clusters')
            for page in paginator.paginate():
                for cluster in page['DBClusters']:
                    if has_tag(cluster, tag_key, tag_value):
                        clusters.append(cluster)
            return clusters
        except ClientError as e:
            raise Exception(f'Error describing DB clusters: {str(e)}')
    
    def has_tag(resource, tag_key, tag_value):
        for tag in resource.get('TagList', []):
            if tag['Key'] == tag_key and tag['Value'] == tag_value:
                return True
        return False
    
    def get_instance_events(rds, instance_id, duration):
        try:
            response = rds.describe_events(
                SourceIdentifier=instance_id,
                SourceType='db-instance',
                Duration=duration
            )
            
            events = response['Events']
            formatted_events = []
            
            for event in events:
                formatted_event = {
                    'EventTime': event['Date'].isoformat(),
                    'SourceIdentifier': event['SourceIdentifier'],
                    'SourceType': event['SourceType'],
                    'Message': event['Message']
                }
                formatted_events.append(json.dumps(formatted_event, ensure_ascii=False))
            
            return formatted_events
        except ClientError as e:
            raise Exception(f'Error describing events for instance {instance_id}: {str(e)}')
    
    def get_cluster_events(rds, cluster_id, duration):
        try:
            response = rds.describe_events(
                SourceIdentifier=cluster_id,
                SourceType='db-cluster',
                Duration=duration
            )
            
            events = response['Events']
            formatted_events = []
            
            for event in events:
                formatted_event = {
                    'EventTime': event['Date'].isoformat(),
                    'SourceIdentifier': event['SourceIdentifier'],
                    'SourceType': event['SourceType'],
                    'Message': event['Message']
                }
                formatted_events.append(json.dumps(formatted_event, ensure_ascii=False))
            
            return formatted_events
        except ClientError as e:
            raise Exception(f'Error describing events for cluster {cluster_id}: {str(e)}')
    
  8. Choose Deploy to save your changes.
  9. Navigate to Configuration and then Environment variables.
  10. Configure the following environment variables:
    • S3_BUCKET = The name of the S3 bucket where the events will be stored.
    • TAG_KEY = archive
    • TAG_VALUE = true
    • DURATION = The time window of RDS events to export, in minutes (for example, 1440 for the previous 24 hours).

  11. Choose Save in the bottom right.
  12. Navigate to General configuration within the Configuration tab.
  13. Choose Edit and set the Lambda timeout to 10 seconds.
  14. Adjust the memory allocation for the Lambda function. If the function doesn't have enough memory, it may not be able to process all events successfully. You can monitor memory utilization through Amazon CloudWatch to determine the optimal configuration for your workload. For more information, see Viewing your Lambda Insights metrics.
  15. Choose Save to save your changes.
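
If you prefer to script this configuration, the following boto3 sketch applies the same environment variables, timeout, and memory settings; the memory size and bucket name are illustrative and should be adjusted for your workload:

import boto3

lambda_client = boto3.client('lambda')

# Apply environment variables, timeout, and memory to the export function
lambda_client.update_function_configuration(
    FunctionName='RDSEventsExportToS3',
    Timeout=10,       # seconds
    MemorySize=256,   # illustrative; adjust based on observed utilization
    Environment={
        'Variables': {
            'S3_BUCKET': 'my-rds-events-bucket',  # illustrative bucket name
            'TAG_KEY': 'archive',
            'TAG_VALUE': 'true',
            'DURATION': '1440'                    # minutes of events to export
        }
    }
)
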

Create an EventBridge schedule

Now, you create an EventBridge schedule that periodically triggers the Lambda function. For the example in this post, you export Amazon RDS events daily, but you can change this schedule if desired.

Note that if you change the schedule to another time interval, you must also change the DURATION environment variable in the Lambda function to reflect that interval; for example, a schedule that runs every 6 hours should use a DURATION of 360. You can also change the rate expression to run the Lambda function more frequently, which can help reduce the function's memory consumption because each invocation processes fewer events.

  1. In the EventBridge console, choose Schedules in the navigation pane.
  2. Choose Create schedule.
  3. For Name, enter a name (for the example in this post, use RDSEventsExportSchedule).
  4. For Occurrence, select Recurring schedule.
  5. For Time zone, select UTC.
  6. For Schedule type, select Rate-based schedule.
  7. For Rate expression, enter 1 for the value and choose days for the unit.
  8. For Flexible Time Window, select Off.
  9. Choose Next.
  10. For Target API, select Templated targets.
  11. Select AWS Lambda Invoke.
  12. Select the AWS Lambda function RDSEventsExportToS3 from the dropdown and choose Next.
  13. Review your configuration and choose Create schedule.
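
Alternatively, you can create the same schedule programmatically with the EventBridge Scheduler API. The following boto3 sketch assumes you already have the Lambda function ARN and an IAM role ARN that EventBridge Scheduler can assume to invoke the function; both ARNs below are placeholders:

import boto3

scheduler = boto3.client('scheduler')

# Create a recurring schedule that invokes the export function once a day
scheduler.create_schedule(
    Name='RDSEventsExportSchedule',
    ScheduleExpression='rate(1 day)',
    FlexibleTimeWindow={'Mode': 'OFF'},
    Target={
        'Arn': 'arn:aws:lambda:us-east-1:111122223333:function:RDSEventsExportToS3',  # placeholder
        'RoleArn': 'arn:aws:iam::111122223333:role/SchedulerInvokeLambdaRole'         # placeholder
    }
)
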

Tag RDS instances and Aurora clusters

Now, you have a solution that exports Amazon RDS events to Amazon S3 daily for any RDS instance or Aurora cluster that's tagged with archive=true. To start exporting Amazon RDS events for your instances and clusters, tag them using the following steps:

  1. Navigate to the Aurora and RDS Dashboard.
  2. Choose Databases on the navigation pane.
  3. Select the instance you want to monitor.
  4. Choose Tags and then Manage Tags.
  5. Choose Add new tag.
  6. For Key, enter archive, and for Value, enter true.
  7. Choose Save changes.

The TAG_KEY and TAG_VALUE environment variables in the Lambda function control the key-value combination that must be present on an RDS instance or cluster for its events to be exported. If you want to use a different key or value, edit these environment variables.
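
If you have many instances to tag, you can also apply the tag programmatically. The following boto3 sketch is a minimal example; the instance identifiers are placeholders:

import boto3

rds = boto3.client('rds')

# Apply the archive=true tag so the export function picks up these instances
for instance_id in ['database-1', 'database-2']:  # placeholder identifiers
    instance = rds.describe_db_instances(DBInstanceIdentifier=instance_id)['DBInstances'][0]
    rds.add_tags_to_resource(
        ResourceName=instance['DBInstanceArn'],
        Tags=[{'Key': 'archive', 'Value': 'true'}]
    )
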

Up to this point, you have configured a fully working solution that stores Amazon RDS events in Amazon S3 once a day. Remember that you can change how frequently the Lambda function runs; however, you must then update both the rate expression shown in step 7 of the EventBridge schedule creation and the DURATION variable used by the Lambda code. If you want to build an analytical solution for the data in Amazon S3, continue to the next section; otherwise, you can skip to the Clean up and Limitations sections.

Set up Athena

By using Lambda, EventBridge, and Amazon S3, you’ve created a serverless architecture that automatically captures and archives Amazon RDS events.

Now that you have Amazon RDS event data stored as JSON files on Amazon S3, you can use Athena to query the data. If you don’t want to add this analytical component, you can skip to the next section.

  1. Open the Amazon Athena console.
  2. Choose Query Editor in the navigation pane.
  3. If you haven't used Athena before, you must configure an S3 bucket to store query results. To learn more, see Get started in the Athena User Guide.
  4. Create a new database:
    CREATE DATABASE IF NOT EXISTS rds_events_db;
  5. Create an external table for the Amazon RDS events, replacing the LOCATION value with your own bucket name (note that this table can be customized depending on your use case):
    CREATE EXTERNAL TABLE rds_events_db.rds_events (
      EventTime string,
      SourceIdentifier string,
      SourceType string,
      Message string
    )
    ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
    WITH SERDEPROPERTIES (
      'ignore.malformed.json' = 'true'
    )
    LOCATION 's3://<your-bucket-name>/rds-events/';

Now, you can query the Amazon RDS events depending on the data you want to gather. You can monitor backup compliance, track configuration changes across your Amazon RDS fleet, identify security-related modifications, and detect patterns in maintenance events. For example, the following query gathers the Amazon RDS events where a backup has started:

SELECT 
    EventTime,
    SourceIdentifier,
    SourceType,
    Message
FROM rds_events_db.rds_events
WHERE Message LIKE '%Backing up DB instance%'
ORDER BY EventTime DESC;

For a full list of the RDS instance and cluster events, see the AWS documentation.
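
If you want to run such queries programmatically instead of through the console, the following boto3 sketch shows one way to do it; the query results location is a placeholder:

import time
import boto3

athena = boto3.client('athena')

# Start the query, wait for it to finish, then print the result rows
execution = athena.start_query_execution(
    QueryString='SELECT * FROM rds_events ORDER BY EventTime DESC LIMIT 10',
    QueryExecutionContext={'Database': 'rds_events_db'},
    ResultConfiguration={'OutputLocation': 's3://<your-query-results-bucket>/'}  # placeholder
)
query_id = execution['QueryExecutionId']

while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)['QueryExecution']['Status']['State']
    if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
        break
    time.sleep(1)

if state == 'SUCCEEDED':
    results = athena.get_query_results(QueryExecutionId=query_id)
    for row in results['ResultSet']['Rows']:
        print([col.get('VarCharValue') for col in row['Data']])
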

Clean up

To avoid incurring future charges, use the following steps to delete the resources you created if you no longer need them.

  1. Delete the EventBridge schedule.
  2. Delete the Lambda function.
  3. Remove Athena resources (if configured):
    1. Open the Athena console
    2. Execute the following SQL commands:
      DROP TABLE IF EXISTS rds_events_db.rds_events;
      DROP DATABASE IF EXISTS rds_events_db;
    3. Empty the Athena query results S3 bucket if no longer needed.
  4. Delete the Amazon S3 bucket contents and bucket.
  5. Delete IAM resources.
  6. Remove tags from RDS instances (optional) if the archive=true tag is no longer needed.

Make sure to delete resources in all AWS Regions where you deployed this solution.

Limitations and next steps

This solution has some limitations that you need to consider before implementing it in production:

  • This solution doesn't work across multiple AWS accounts.
  • This solution doesn't automatically purge old data in Amazon S3. To automatically purge old data, consider adding an S3 Lifecycle configuration, as shown in the sketch after this list.
  • Newly created RDS instances aren't automatically tagged. Review the Tag RDS instances and Aurora clusters section for more details on how to tag RDS instances.
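
For example, the following boto3 sketch adds a lifecycle rule that expires archived event files one year after they are written; the bucket name and retention period are illustrative and should reflect your own retention requirements:

import boto3

s3 = boto3.client('s3')

# Expire archived RDS event files 365 days after they are written
s3.put_bucket_lifecycle_configuration(
    Bucket='my-rds-events-bucket',  # illustrative bucket name
    LifecycleConfiguration={
        'Rules': [
            {
                'ID': 'expire-rds-events',
                'Filter': {'Prefix': 'rds-events/'},
                'Status': 'Enabled',
                'Expiration': {'Days': 365}
            }
        ]
    }
)
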

You can customize your Lambda function to address specific needs, such as filtering for specific RDS event types, which allows for more targeted monitoring and response. Additionally, using Athena to create views of your data enables more sophisticated querying and analysis of your RDS events, providing deeper insights into your database performance and health. Also consider using business intelligence tools to further analyze the collected data, helping you identify trends, patterns, and potential issues before they become critical. Finally, you can integrate your RDS event data with Amazon QuickSight to transform the raw data into intuitive, interactive dashboards. By implementing these enhancements, you'll create a more robust, insightful, and actionable RDS monitoring system.

Conclusion

In this post, we’ve presented a solution that you can use to archive Amazon RDS events to Amazon S3 for long-term storage and analysis. We invite you to implement this solution in your own environment and explore its potential.


About the authors

Uttirna Datta

Uttirna is a Technical Account Manager at Amazon Web Services, specializing in database technologies and cloud operations, and brings over 11 years of industry experience to the role. As a subject matter expert in cloud governance and storage solutions, Uttirna guides customers in optimizing their cloud environments while ensuring operational excellence and cost efficiency. She drives customer success by reducing complexity and streamlining solutions, ensuring optimal use of AWS technologies while delivering concrete business value.

Ryan Moore

Ryan is a Technical Account Manager who has worked within the AWS database team for 2 years. Ryan is an Aurora MySQL and RDS for MySQL subject matter expert who specializes in enabling customers to build performant, scalable, and secure architectures in the AWS Cloud.