AWS Database Blog

Working with RDS and Aurora PostgreSQL logs: Part 1

July 2023: This post was reviewed for accuracy.

PostgreSQL is one of the most popular open-source relational database systems. With more than 30 years of development work, PostgreSQL has proven to be a highly reliable and robust database that can handle a large number of complicated data workloads. AWS provides two managed PostgreSQL options: Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL. One of the common questions that new PostgreSQL users ask is how to capture the database activity logs for debugging and monitoring purposes. This post discusses how you can configure RDS and Aurora PostgreSQL to generate additional engine logs. The second part of this post, Working with RDS and Aurora PostgreSQL Logs: Part 2, shares methods to access these log files.

PostgreSQL generates event logs that contain useful information for DBAs. SQL query failures, failed login attempts, and deadlocks are captured in the database logs by default. These error messages help identify various application issues. For example, if you converted a legacy application from Oracle to PostgreSQL, some queries may not convert correctly to PostgreSQL syntax. These incorrectly formatted queries generate error messages in the logs, which can help identify the problematic application code.

In addition to the default logging, you can modify PostgreSQL logging parameters to capture additional information that helps you identify and resolve issues such as poor performance, and that supports security audits. These parameters let you capture details such as connections and disconnections, schema modification queries, slow queries and their durations, queries that spend time waiting for locks, queries that consume temporary disk storage, and the background auto-vacuum process consuming resources.

This log information can help troubleshoot potential performance and auditing issues while using the database. This post provides more details about enabling this logging and its benefits.

DB parameter groups

Each RDS and Aurora PostgreSQL instance is associated with a parameter group that contains the engine-specific configuration. The engine configuration includes several parameters that control PostgreSQL logging behavior. AWS provides parameter groups with default configuration settings to use for your instances. However, to change the default settings, you must create a clone of the default parameter group, modify it to meet your requirements, and attach it to your RDS or Aurora PostgreSQL instance.

For more information about parameter groups and the steps to create a custom parameter group for your instance, see Working with DB Parameter Groups in the RDS User Guide.
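
After the custom parameter group is attached (and the instance rebooted, if you changed any static parameters), you can confirm the values the engine is actually running with from any SQL client. The following query is a minimal check; the source column typically shows whether a value comes from the configuration file (that is, the parameter group) or is an engine default.

-- List the logging-related parameters and the values currently in effect.
SELECT name, setting, unit, source
FROM pg_settings
WHERE name LIKE 'log%'
ORDER BY name;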

PostgreSQL log file

The log files store the engine logs that the RDS and Aurora PostgreSQL instances generate. PostgreSQL provides a few parameters for controlling the naming convention and rotation policy of the log files. These parameters let you generate a new log file periodically or whenever the current file reaches a fixed maximum size.

The parameter log_filename specifies the name of the log file. Aurora PostgreSQL 9.6 and RDS PostgreSQL allow the following two file naming options:

  • postgresql.log.%Y-%m-%d – This option uses a unique log file name for each day. For example, postgresql.log.2019-04-01.
  • postgresql.log.%Y-%m-%d-%H – This option uses a unique log file name for each hour of the day. For example, postgresql.log.2019-04-01-12.

Aurora PostgreSQL 10 introduced an additional log file name option, postgresql.log.%Y-%m-%d-%H%M. This option uses a unique log file name for each minute of the day. For example, a new log file that starts at 12:30 PM is named postgresql.log.2019-04-01-1230.

PostgreSQL creates and starts using a new log file when the conditions specified by parameters log_rotation_age or log_rotation_size are met. The parameter log_rotation_age specifies the maximum age (in minutes) for any log files. After this duration, PostgreSQL generates a new log file using the file naming convention. Similarly, the parameter log_rotation_size specifies the maximum size (in KB) of the log file. When the log file reaches this size, PostgreSQL generates a new log file using the file naming convention.

RDS and Aurora PostgreSQL do not overwrite the existing log files. For example, assume you are using the daily log naming convention (postgresql.log.%Y-%m-%d) and the log_rotation_size is 1 MB. In this example, when the file reaches 1 MB and the date has not changed, PostgreSQL continues writing to the same file without truncating it. The parameter log_truncate_on_rotation, which is hardcoded to false in RDS and Aurora PostgreSQL, controls this behavior.
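
To check which rotation behavior is in effect for your instance, you can read the relevant parameters directly from pg_settings. A quick check from any SQL client:

-- Current log file naming and rotation settings.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('log_filename', 'log_rotation_age', 'log_rotation_size', 'log_truncate_on_rotation');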

PostgreSQL log output format

By default, RDS and Aurora PostgreSQL generate logs in standard error (stderr) format. In this format, each log message is prefixed with the information specified by the parameter log_line_prefix. RDS and Aurora only allow the following value for this parameter:

%t:%r:%u@%d:[%p]:

This value maps to the following code:

log-time : remote-host : user-name @ db-name : [ process-id ]

This prefix is useful, but you may get more detailed log messages if you enable the comma-separated-values (CSV) format log output.

The parameter log_destination controls the format in which the instance generates output logs. To enable CSV format log output, set the parameter log_destination to csvlog in the DB parameter group associated with your instance. This change triggers the RDS and Aurora PostgreSQL engine to generate a CSV log file in addition to the standard error log file.

The CSV format is especially useful when using the log_fdw extension. For more information, see Part 2 of this post.

Default logging

By default, RDS and Aurora PostgreSQL logging parameters are set to capture all server errors such as query failures, login failures, fatal server errors, and deadlocks. This default logging level is essential to capture any errors that impact your server.

For example, when executing a query with an invalid column name, you get the following log message.

Standard error format

2019-03-10 03:54:59 UTC:10.0.0.123(52834):postgres@logtestdb:[20175]:ERROR: column "wrong_column_name" does not exist at character 8
2019-03-10 03:54:59 UTC:10.0.0.123(52834):postgres@logtestdb:[20175]:STATEMENT: SELECT wrong_column_name FROM logtest1;

CSV format

2019-03-10 03:54:59.748 UTC,"postgres","logtestdb",20175,"10.0.0.123:52834",5c848a8c.4ecf,3,"SELECT",2019-03-10 03:54:52 UTC,5/17090204,0,ERROR,42703,"column ""wrong_column_name"" does not exist",,,,,,"SELECT wrong_column_name FROM logtest1;",8,,"psql"

These error messages often help identify incorrectly formatted queries and missing objects. For example, perhaps while converting from Oracle to PostgreSQL, you missed one function. When a query tries to call this function, it fails with a missing-function error message. This message can help the DBA identify the cause of the failure and fix it quickly.

When user authentication fails, you get the following log message.

Standard error format

2019-03-10 03:54:45 UTC:10.0.0.123(52830):postgres@logtestdb:[20128]:FATAL: password authentication failed for user "postgres"
2019-03-10 03:54:45 UTC:10.0.0.123(52830):postgres@logtestdb:[20128]:DETAIL: Password does not match for user "postgres".
Connection matched pg_hba.conf line 13: "host all all 0.0.0.0/0 md5"

CSV format

2019-03-10 03:54:45.088 UTC,"postgres","logtestdb",20120,"10.0.0.123:52828",5c848a85.4e98,2,"authentication",2019-03-10 03:54:45 UTC,5/17090196,0,FATAL,28P01,"password authentication failed for user ""postgres""","Password does not match for user ""postgres"".
Connection matched pg_hba.conf line 13: ""host all all 0.0.0.0/0 md5""",,,,,,,,""

This information can be valuable from a security point of view. A DBA looking at this information can decide to lock out the user if multiple failed login attempts are detected. The error message also provides the source IP address, which helps the DBA narrow down the host from which the invalid connection requests are coming.

In addition to this default logging, you can get more server activity details by configuring additional logging parameters. The following section discusses some of the key logging parameters that can help DBAs.

Configuring logging parameters

PostgreSQL users can enable additional logging by tuning the engine parameters. This section provides some of the key parameters you can set to capture more engine activity logs. For more information about additional logging parameters, see the section Error Reporting and Logging in the PostgreSQL documentation.

Increasing database logging impacts storage size, I/O use, and CPU use. Because of this, test these changes before deploying them in production.

rds.log_retention_period

This parameter specifies the retention period (in minutes) of the RDS and Aurora PostgreSQL engine log files. By default, the logs are retained for 4320 minutes (three days). The maximum allowed value for this parameter is 10080 minutes (seven days). If you need to retain the logs for longer, you must move them out of this default location. Part 2 of this post discusses how to export the log files.

log_connections

By default, new connection details are not logged. To log all new client connection details, set this parameter to 1. The following output shows an example log for a new client connection.

Standard error format

2019-03-10 03:51:56 UTC:10.0.0.123(52820):postgres@logtestdb:[18161]:LOG: connection authorized: user=postgres database=logtestdb SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES256-GCM-SHA384, compression=off)

CSV format

2019-03-10 03:51:56.158 UTC,"postgres","logtestdb",18161,"10.0.0.123:52820",5c8489dc.46f1,2,"authentication",2019-03-10 03:51:56 UTC,5/17090128,0,LOG,00000,"connection authorized: user=postgres database=logtestdb SSL enabled (protocol=TLSv1.2, cipher=ECDHE-RSA-AES256-GCM-SHA384, compression=off)",,,,,,,,,""

The connection logs can help determine the source, username, frequency, and time of the connections. This information can be useful in troubleshooting activities or security audits. Enabling this logging can help you answer questions such as how many new connections are received per day, how many connections come from each host or user, and whether SSL is used for all connections.

In general, it is recommended to keep connection logging enabled. Applications frequently use connection poolers, which open connections once and keep them open longer; with fewer new connections being opened, fewer connection log messages are written to the log files.
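
Connection logs give you the historical picture. For a point-in-time view of who is currently connected, you can also query the pg_stat_activity view, which complements (but does not replace) the log-based audit trail. For example:

-- Count the current connections per user and client address.
SELECT usename, client_addr, count(*) AS connections
FROM pg_stat_activity
GROUP BY usename, client_addr
ORDER BY connections DESC;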

log_disconnections

By default, disconnection details are not logged. To log all client disconnections, set this parameter to 1. The following output shows an example log for a client disconnection.

Standard error format

2019-03-10 03:52:14 UTC:10.0.0.123(52820):postgres@logtestdb:[18161]:LOG: disconnection: session time: 0:00:18.720 user=postgres database=logtestdb host=10.0.0.123 port=52820

CSV format

2019-03-10 03:51:56 UTC,,0,LOG,00000,"disconnection: session time: 0:00:18.720 user=postgres database=logtestdb host=10.0.0.123 port=52820",,,,,,,,,"psql"

You can use the disconnection information to determine exactly how long a user was connected to the database. This information is useful for auditing purposes. In general, it is recommended to keep the disconnection logging enabled.

log_temp_files

Queries use temporary files on disk if an operation cannot be completed within the allowed memory (work_mem). These operations include sorts, hashes, and temporary query results. Because disk access is much slower than memory access, a query that needs temporary files takes a performance hit. For this reason, it is important to enable temporary file usage logging. You can enable this logging by setting the log_temp_files parameter to 0 (to log all temporary files) or to any positive number (default unit KB). This positive number specifies the temporary file size threshold at or above which a file is logged.

For example, you get the following log message when a query uses a temporary file that is greater than or equal to the threshold size specified by log_temp_files.

Standard error format

2019-03-10 03:58:18 UTC:10.0.0.123(52834):postgres@logtestdb:[20175]:LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp20175.0", size 14032896
2019-03-10 03:58:18 UTC:10.0.0.123(52834):postgres@logtestdb:[20175]:STATEMENT: SELECT DISTINCT id FROM logtest1;

CSV format

2019-03-10 03:58:18.879 UTC,"postgres","logtestdb",20175,"10.0.0.123:52834",5c848a8c.4ecf,4,"SELECT",2019-03-10 03:54:52 UTC,5/17090205,0,LOG,00000,"temporary file: path ""base/pgsql_tmp/pgsql_tmp20175.0"", size 14032896",,,,,,"SELECT DISTINCT id FROM logtest1;",,,"psql"

The log message shows the size of the temporary file generated by the query. Using this information, you can increase work_mem, if possible, or tune the query to minimize temporary file usage.
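
To see how often queries have spilled to disk since the statistics were last reset, you can also check the cumulative counters in pg_stat_database; this complements the per-query detail in the logs. For example:

-- Cumulative temporary file usage per database since the last statistics reset.
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_space
FROM pg_stat_database
WHERE datname IS NOT NULL
ORDER BY temp_bytes DESC;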

log_lock_waits

Setting this parameter to 1 enables logging for sessions that wait on a lock longer than the deadlock detection interval (the deadlock_timeout parameter, 1 second by default). This information helps determine whether lock waits are causing a performance issue.

For example, when one transaction blocks another transaction, you get the following log message.

Standard error format

2019-03-10 04:01:22 UTC:10.0.0.123(52806):postgres@logtestdb:[9732]:LOG: process 9732 still waiting for ShareLock on transaction 113942590 after 1000.045 ms
2019-03-10 04:01:22 UTC:10.0.0.123(52806):postgres@logtestdb:[9732]:DETAIL: Process holding the lock: 20175. Wait queue: 9732.
2019-03-10 04:01:22 UTC:10.0.0.123(52806):postgres@logtestdb:[9732]:CONTEXT: while updating tuple (10809,11) in relation "logtest1"
2019-03-10 04:01:22 UTC:10.0.0.123(52806):postgres@logtestdb:[9732]:STATEMENT: update logtest1 set date = now() where id=1;

CSV format

2019-03-10 04:01:22.866 UTC,"postgres","logtestdb",9732,"10.0.0.123:52806",5c847d81.2604,1,"UPDATE waiting",2019-03-10 02:59:13 UTC,6/909514,113942591,LOG,00000,"process 9732 still waiting for ShareLock on transaction 113942590 after 1000.045 ms","Process holding the lock: 20175. Wait queue: 9732.",,,,"while updating tuple (10809,11) in relation ""logtest1""","update logtest1 set date = now() where id=1;",,,"psql"

This log message shows the query waiting for the lock and provides the PID of the blocking session. This information helps you debug performance issues. Using it, you can identify the transactions that are blocking each other and optimize them to avoid or reduce the locking. For example, when two transactions attempt to update the same row, one blocks the other; this message can help you isolate those transactions so that you can improve them.
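
The log messages describe blocking after the fact. While blocking is still happening, you can identify the sessions involved with the pg_blocking_pids() function (available in PostgreSQL 9.6 and later). A minimal example:

-- Show sessions that are currently blocked and the PIDs blocking them.
SELECT pid, usename, pg_blocking_pids(pid) AS blocked_by, state, query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;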

log_autovacuum_min_duration

The auto-vacuum daemon process runs in the background according to its configuration and prevents bloat from accumulating in the database. This functionality is important, but it comes at a cost in CPU, memory, and I/O usage. The parameter log_autovacuum_min_duration helps you gain visibility into when this background process runs and for which tables. Setting this parameter to 0 logs all auto-vacuum and auto-analyze runs. Setting it to any positive value (in milliseconds) logs only the auto-vacuum and auto-analyze runs that take longer than this time.

For example, after setting this parameter and setting rds.force_autovacuum_logging_level (explained in the following section) to log, you get the following log messages for auto-vacuum and auto-analyze runs.

Standard error format

2019-03-10 04:07:12 UTC::@:[29679]:LOG: automatic vacuum of table "logtestdb.public.logtest1": index scans: 0
pages: 0 removed, 10811 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 1000001 removed, 1000000 remain, 0 are dead but not yet removable, oldest xmin: 113942594
buffer usage: 21671 hits, 0 misses, 1 dirtied
avg read rate: 0.000 MB/s, avg write rate: 0.003 MB/s
system usage: CPU: user: 0.12 s, system: 0.00 s, elapsed: 2.30 s
2019-03-10 04:07:14 UTC::@:[29679]:LOG: automatic analyze of table "logtestdb.public.logtest1" system usage: CPU: user: 0.06 s, system: 0.00 s, elapsed: 1.17 s

CSV format

2019-03-10 04:07:12.938 UTC,,,29679,,5c848d6e.73ef,1,,2019-03-10 04:07:10 UTC,7/15686043,0,LOG,00000,"automatic vacuum of table ""logtestdb.public.logtest1"": index scans: 0
pages: 0 removed, 10811 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 1000001 removed, 1000000 remain, 0 are dead but not yet removable, oldest xmin: 113942594
buffer usage: 21671 hits, 0 misses, 1 dirtied
avg read rate: 0.000 MB/s, avg write rate: 0.003 MB/s
system usage: CPU: user: 0.12 s, system: 0.00 s, elapsed: 2.30 s",,,,,,,,,""
2019-03-10 04:07:14.109 UTC,,,29679,,5c848d6e.73ef,2,,2019-03-10 04:07:10 UTC,7/15686044,113942594,LOG,00000,"automatic analyze of table ""logtestdb.public.logtest1"" system usage: CPU: user: 0.06 s, system: 0.00 s, elapsed: 1.17 s",,,,,,,,,""

This log message tells you when the auto-vacuum was running, the time it took to complete, and the cleanups it performed. This information is useful when you are investigating performance issues such as high I/O or CPU load during peak hours, and table bloat increasing because auto-vacuum can’t process old rows due to locks.
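
To relate these log entries to table-level activity, you can check when auto-vacuum and auto-analyze last ran for each table, and how many dead rows remain, using the pg_stat_user_tables view. For example:

-- Per-table auto-vacuum activity and dead-row counts.
SELECT relname, last_autovacuum, last_autoanalyze, n_dead_tup, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;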

rds.force_autovacuum_logging_level

In RDS and Aurora PostgreSQL, logging of the auto-vacuum and auto-analyze processes is disabled by default. The auto-vacuum logging parameter log_autovacuum_min_duration does not take effect until you set this parameter to the desired value. The allowed values for this parameter are disabled, debug5, debug4, debug3, debug2, debug1, info, notice, warning, error, log, fatal, and panic.

When you set this parameter to a value other than disabled, such as log, the PostgreSQL engine starts logging the auto-vacuum and auto-analyze activity according to the threshold set by the parameter log_autovacuum_min_duration.

log_min_duration_statement

This parameter provides a valuable feature: logging slow queries. When set to a value (in milliseconds) other than -1 (which means disabled), the database logs every query that takes at least the specified number of milliseconds.

For example, when a query takes time greater than or equal to the threshold that log_min_duration_statement specifies, you get the following log message.

Standard error format

2019-03-10 04:05:23 UTC:10.0.0.123(52834):postgres@logtestdb:[20175]:LOG: duration: 100.507 ms statement: SELECT count(*) FROM logtest1 where value<10;

CSV format

2019-03-10 04:05:23.199 UTC,"postgres","logtestdb",20175,"10.0.0.123:52834",5c848a8c.4ecf,6,"SELECT",2019-03-10 03:54:52 UTC,5/0,0,LOG,00000,"duration: 100.507 ms statement: SELECT count(*) FROM logtest1 where value<10;",,,,,,,,,"psql"

In addition to providing the duration of the query, the log message also provides information about who ran the query, from which source IP, and at what time. This information helps identify the source of the query and determine the next steps.

This parameter can be helpful in many scenarios. For example, if all your queries are expected to complete within 1 second, you can set this parameter to 1000 to capture any query that takes longer than that. Similarly, when working on a query performance improvement task, you may want to focus on the slowest queries first. You can do this by setting this parameter to a cutoff value and capturing all queries that take longer than that cutoff. After you have optimized these queries, you can reduce the parameter further and work on the next set of slow queries.
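
A quick way to see this parameter in action is to check the current threshold and then run a statement that deliberately exceeds it; pg_sleep() is convenient for this. The following is a small sketch that assumes the parameter has been set to 1000 (1 second) in the DB parameter group:

-- Display the threshold currently in effect (works for any user).
SHOW log_min_duration_statement;
-- pg_sleep(2) runs for about 2,000 ms, so with a 1,000 ms threshold this
-- statement is written to the log along with its measured duration.
SELECT pg_sleep(2);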

log_statement

This is another useful query logging feature. The supported parameter values are as follows:

  • none – Turns this feature off.
  • ddl – Logs all data definition language (DDL) statements, such as CREATE, ALTER, and DROP.
  • mod – Logs all DDL statements, plus data-modifying statements such as INSERT, UPDATE, and DELETE.
  • all – Logs all statements.

It is a good idea to set this parameter to ddl to capture all schema change-related statements for auditing purposes.

For example, dropping a column captures the following log message.

Standard error format

2019-03-10 04:08:47 UTC:10.0.0.123(52834):postgres@logtestdb:[20175]:LOG: statement: ALTER TABLE logtest1 DROP COLUMN date;

CSV format

2019-03-10 04:08:47.606 UTC,"postgres","logtestdb",20175,"10.0.0.123:52834",5c848a8c.4ecf,8,"idle",2019-03-10 03:54:52 UTC,5/17090210,0,LOG,00000,"statement: ALTER TABLE logtest1 DROP COLUMN date;",,,,,,,,,"psql"

rds.force_admin_logging_level

RDS and Aurora PostgreSQL have an internal superuser named rdsadmin. This user is only used for database management activities. For example, you might forget the password for the master user and want to reset it. The rdsadmin user performs the password reset when you change the password from the AWS Management Console. This user's activities aren't captured even after setting the parameters discussed in the previous sections. If you need to capture the activities of this user, enable it by setting rds.force_admin_logging_level to log. When you set this parameter, the PostgreSQL engine starts capturing all queries executed by this admin user in any of the user databases.

Beyond native logging parameters

In addition to the native parameters discussed in the previous section, PostgreSQL also supports extensions and modules that further increase the level of event logging. Two commonly used ones that add useful logging information are pgAudit and auto_explain.

pgAudit

The pgAudit extension goes beyond the basic statement logging ability that the native logging parameters provide. Use this extension for session- or object-level information logging required for auditing. For example, you might need to log all queries that are executed against a table containing sensitive information. The only option available natively in PostgreSQL is to capture all queries running on the database by setting log_statement to all or setting log_min_duration_statement to 0. This allows you to get your desired data but also captures unnecessary data. The pgAudit extension helps in this use case by allowing you to choose what to capture and for which object.

For information about configuring pgAudit with Amazon RDS and Aurora PostgreSQL, see Working with the pgaudit Extension in the RDS User Guide.
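
As a concrete example of object-level auditing, the flow described in the RDS User Guide is to add pgaudit to shared_preload_libraries and set pgaudit.role to a dedicated role (the guide uses rds_pgaudit) in the DB parameter group, reboot the instance, and then run statements like the following. This is a sketch of that flow; adapt the role and table names to your environment.

-- Create the role that the pgaudit.role parameter points to.
CREATE ROLE rds_pgaudit;
-- Install the extension in the database you want to audit.
CREATE EXTENSION pgaudit;
-- Audit read access to the sensitive table: any SELECT against logtest1
-- is now written to the log, as the following example shows.
GRANT SELECT ON logtest1 TO rds_pgaudit;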

For example, when you enable auditing for the test table logtest1, you get the following log message.

Standard error format

2019-03-12 02:34:53 UTC:10.0.0.123(53094):postgres@logtestdb:[18387]:LOG: AUDIT: OBJECT,3,1,READ,SELECT,TABLE,public.logtest1,SELECT * FROM logtest1 WHERE id=100;,<none>

CSV format

2019-03-12 02:34:53.779 UTC,"postgres","logtestdb",18387,"10.0.0.123:53094",5c86a479.47d3,15,"SELECT",2019-03-11 18:10:01 UTC,5/5279,0,LOG,00000,"AUDIT: OBJECT,3,1,READ,SELECT,TABLE,public.logtest1,SELECT * FROM logtest1 WHERE id=100;,<none>",,,,,,,,,"psql"

auto_explain

The auto_explain module provides a mechanism to automatically log the execution plans of queries that take longer than the threshold specified by the parameter auto_explain.log_min_duration. You can enable auto_explain by adding it to the parameter shared_preload_libraries in the parameter group associated with the database instance. For more information about auto_explain configuration, see auto_explain in the PostgreSQL documentation.
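
After adding auto_explain to shared_preload_libraries and setting auto_explain.log_min_duration in the parameter group (the library change requires a reboot), you can confirm from any session that the module is active and see the threshold in effect. A minimal check:

-- shared_preload_libraries should include auto_explain after the reboot.
SHOW shared_preload_libraries;
-- The threshold (in milliseconds) beyond which plans are logged; this
-- parameter is only recognized once the module has been loaded.
SHOW auto_explain.log_min_duration;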

For example, you get the following log message when the test query duration crosses the set threshold.

Standard error format

2019-03-12 01:46:00 UTC:10.0.0.123(53094):postgres@logtestdb:[18387]:LOG: duration: 760.049 ms plan:
Query Text: select distinct * from logtest1 order by 1 desc;
Unique (cost=134143.34..141643.34 rows=1000000 width=8)
-> Sort (cost=134143.34..136643.34 rows=1000000 width=8)
Sort Key: id DESC, value
-> Seq Scan on logtest1 (cost=0.00..20811.00 rows=1000000 width=8)

CSV format

2019-03-12 01:46:00.593 UTC,"postgres","logtestdb",18387,"10.0.0.123:53094",5c86a479.47d3,6,"SELECT",2019-03-11 18:10:01 UTC,5/5257,0,LOG,00000,"duration: 760.049 ms plan:
Query Text: select distinct * from logtest1 order by 1 desc;
Unique (cost=134143.34..141643.34 rows=1000000 width=8)
-> Sort (cost=134143.34..136643.34 rows=1000000 width=8)
Sort Key: id DESC, value
-> Seq Scan on logtest1 (cost=0.00..20811.00 rows=1000000 width=8)",,,,,,,,,"psql"

Summary

This post highlights the importance of the PostgreSQL log files and explains some of the parameters you can tune to get additional database activity information for performance tuning and troubleshooting. You should evaluate and configure these parameters as per your requirements. Because enabling additional logging increases load on the server, test out the configuration changes before deploying them in production.

Part 2 of this post focuses on different methods to access and use the generated log files.

If you have any questions or comments, please post your thoughts in the comments section.

About the Author

Yaser Raja is a Senior Consultant with the Professional Services team at Amazon Web Services. He works with customers to build scalable, highly available, and secure solutions in the AWS Cloud. His focus area is homogenous and heterogeneous migrations of on-premises databases to Amazon RDS and Aurora PostgreSQL.