AWS Database Blog

Monitor Amazon RDS for PostgreSQL and Amazon Aurora for PostgreSQL database log errors and set up notifications using Amazon CloudWatch

Database administrators set up monitoring on database log files to get alerted on certain informational and critical events relating to a pattern of errors specific to a database. Monitoring for errors on a business-critical database is essential to avoid unexpected outcomes such as a missed service-level agreement (SLA), which might result in penalties. A good understanding of a normal pattern of errors in your environment will help in filtering irrelevant errors and catching relevant errors for notification. Based on the severity of the events, either manual or automatic intervention might be needed to avoid issues.

You can use a variety of tools to monitor databases on AWS, such as Amazon RDS Performance Insights, Amazon CloudWatch, and Amazon Relational Database Service (Amazon RDS) database log files. These tools let you monitor the load on the database, host-level metrics, and instance activity and errors respectively.

In this post, we walk through the steps to set up alerts on a PostgreSQL log file using keywords or metrics filters in Amazon CloudWatch Logs. After filtering, we use Amazon Simple Notification Service (Amazon SNS) to be notified of database events and errors that need attention.

Solution overview

Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL are two managed database offerings from AWS that support the PostgreSQL database engine. It takes only a few clicks on the AWS Management Console to launch and connect to a PostgreSQL database. Amazon RDS and Aurora instances are preconfigured with parameters and settings for the server type you choose. Both services can integrate with CloudWatch to provide metrics for instances, and both provide an option to export the PostgreSQL log to publish to CloudWatch Logs.

With CloudWatch Logs, you can view metrics, analyze log data, and create real-time alarms. For example, if the log_statements parameter is set to ddl, you can set up an alarm to alert you whenever a DDL statement is run.

To work with CloudWatch Logs, we configure an Amazon RDS or Aurora instance to publish log data to a log group.

Publishing log files to CloudWatch Logs is supported only for PostgreSQL versions 9.6.6 and later and 10.4 and later. You can publish the following log types to CloudWatch Logs for Amazon RDS for PostgreSQL:

  • PostgreSQL log
  • Upgrade log (not available for Aurora PostgreSQL)

To set up monitoring and alerting against specific database events, you must set the parameter log_error_verbosity to VERBOSE in the Amazon RDS or Aurora instance’s parameter group. Be aware that this setting could affect performance if the database is already producing a lot of errors. Setting this value includes additional SQLSTATE code in the PostgreSQL log, and you can create metric filters for specific keywords. For more information on changing parameters in a parameter group for an RDS or Aurora instance, see Modifying parameters in a DB parameter group.

Follow enterprise security best practices to allow access across AWS resources and services. Use the principle of least privilege to avoid any unintended access. For more information, see Use roles to delegate permissions.

The following diagram illustrates the architecture of our solution, which sets up monitoring of Amazon RDS or Aurora database log errors and notifications using CloudWatch.

To implement this solution, we complete the following high-level steps:

  1. Publish the PostgreSQL log to CloudWatch Logs.
  2. Create an SNS topic.
  3. Create an SNS subscription.
  4. Create a metric filter on the log group in CloudWatch Logs.
  5. Create an alarm in CloudWatch.

Publish a PostgreSQL log to Amazon CloudWatch Logs

You can set up PostgreSQL log monitoring when you create a new Amazon RDS or Aurora for PostgreSQL database.

  1. On the Amazon RDS console, choose Databases in the navigation pane.
  2. Choose Create database.
  3. Choose a database engine and your instance options.
  4. Provide the information needed to set up a database per your enterprise standards.
  5. In the Additional configuration section, under Log exports, select Postgresql log to export the log file to CloudWatch Logs.
  6. Choose Continue.
  7. Choose a time to apply this change.

You can apply the change immediately or during the maintenance window. It doesn’t require a database instance reboot.

If you have an existing DB instance you want to modify, you can access the same options by selecting the instance and choosing Modify.

The PostgreSQL log is now exported to CloudWatch Logs. With the CloudWatch Logs metric filter, you can filter the log file on a keyword. If a matching event is in the PostgreSQL log file, an alarm in CloudWatch can send a notification to an SNS topic. The SNS topic can then send a notification to an endpoint such as email or SMS.

Create an SNS topic

To create an SNS topic, complete the following steps:

  1. On the Amazon SNS console, choose Topics in the navigation pane.
  2. Choose Create topic.
  3. For Type, select Standard.
  4. Provide the necessary details as per your enterprise standards.
  5. Choose Create topic.

Create an SNS subscription

After you create your SNS topic, you can create a subscription.

  1. On the Amazon SNS console, choose Subscriptions in the navigation pane.
  2. Choose Create subscription.
  3. For Topic ARN, choose the SNS topic you created earlier.
  4. For Protocol, choose Email.Other options are available, but for this post we create an email notification.
  5. For Endpoint, enter the email address to receive event notifications.
  6. Choose Create subscription.

Create a metric filter on the CloudWatch log group

You can use a metric filter in CloudWatch to create a new metric based on a filter pattern. This new metric is also placed in a custom namespace and can be grouped with similar metrics.

  1. On the CloudWatch console, under CloudWatch Logs, choose Log groups.
  2. Select the DB instance log group you want to create a filter pattern for.
  3. On the Actions menu, choose Create metric filter.
  4. For Filter pattern, enter your pattern or choose from the existing options.Optionally, you can test the pattern against the log group.
  5. Choose Next.
    On the Assign metric page, you can create a new metric namespace or use an existing namespace to group metrics together.
  6. Enter a metric name and metric value per your use case.This value is published to the metric when a match is found in the PostgreSQL log.
  7. Choose Next.

Create an alarm in CloudWatch

You’re now ready to create a CloudWatch alarm for when an event in the PostgreSQL log matches the filter pattern you created.

  1. On the CloudWatch console, navigate to the log group of the database cluster.
  2. On the Metric filters tab, select the metric you created earlier.
  3. Choose Create alarm.
  4. Set the period and threshold for the alarm according to your needs.The values in the following screenshot are for demonstration purposes only and shouldn’t be used in a production environment.
  5. Choose Next.
  6. For Alarm state trigger, select In alarm.
  7. For Select an SNS topic, select Select an existing SNS topic.
  8. For Send a notification to, choose the topic you created earlier.
  9. Choose Next.
  10. Enter a name for your alarm and an optional description.

Solution demonstration

In the following section, we demonstrate our solution with a metric filter on PostgreSQL error code 40P01, which we created in CloudWatch Logs. This error indicates a deadlock error. Using a PostgreSQL transaction block, two database sessions try to update the same table and create a deadlock.

This error gets logged in the PostgreSQL log file, as shown in the following screenshot.

The filter metric catches the error and invokes a CloudWatch alarm, which sends an email notification via Amazon SNS.

Conclusion

Catching errors early can prevent unexpected database performance issues and outages. Sometimes, errors in a database log can indicate a serious underlying condition that can cause unexpected outcomes. A good understanding of a normal pattern of errors in your environment will help you filter irrelevant errors and catch relevant errors for notification.

In this post, we walked through steps to create a metric filter on a CloudWatch log group, set up a CloudWatch alarm, and get notified. A deadlock error event occurred in a sample database which got captured in the database log file.

Exercise caution when using this approach and not set up alerts for common error codes. For more information, see Best practices for working with PostgreSQL and Best practices with Amazon Aurora PostgreSQL.


About the authors

Vignyanand Penumatcha is a Database Migration Consultant with the Professional Services team at AWS. He helps AWS customers to achieve their business needs when migrating their databases to AWS cloud database solutions.

 

 

 

Praveen Marthala is a Database Consultant with the Professional Services team at AWS. Praveen is passionate about building high-available, high-performance, high-security, and cost-effective databases that help customers achieve the best outcome with workloads on AWS.