What is stored in Amazon Aurora PostgreSQL storage, and how can I troubleshoot storage issues?

Last updated: 2019-11-14


Short Description

DB instances in Aurora clusters have two types of storage:

  • Storage used for persistent data (shared cluster storage). For more information, see What the Cluster Volume Contains.
  • Storage used for temporary data and logs (local storage). All DB temporary files (for example, logs and temporary tables) are stored in the instance's local storage. This includes the temporary files created by the sorting, hashing, and grouping operations that queries require.

Each Aurora instance contains a limited amount of local storage that is determined by the instance class. Typically, the amount of local storage is twice the amount of memory on the instance. If you perform a sort or index creation operation that requires more memory than is available on your instance, Aurora spills to local storage to complete the operation.

Resolution

You can monitor the local storage space associated with the Aurora instance or node by using the Amazon CloudWatch metric for FreeLocalStorage. This metric reports the amount of storage available to each DB instance for temporary tables and logs. For more information, see Monitoring Amazon Aurora DB Cluster Metrics.

If your Aurora local storage is full, see the following error messages and troubleshooting steps:

Local storage space is used by temporary tables or files

"ERROR: could not write block XXXXXXXX of temporary file: No space left on device."

This error can occur when temporary storage is exhausted on the DB instance. Operations that alter large tables, create indexes on large tables, or run large SELECT queries with complex JOIN, GROUP BY, or ORDER BY clauses can lead to this error.
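
To see how a large sort spills from memory to local storage, you can run EXPLAIN ANALYZE on a query that sorts more data than work_mem allows. The following synthetic example uses generate_series; the row count and the disk usage shown in the comment are illustrative:

maxiops=> EXPLAIN (ANALYZE) SELECT g FROM generate_series(1, 5000000) AS g ORDER BY g;
-- Look for a line such as "Sort Method: external merge  Disk: 58720kB" in the output.
-- "external merge" means the sort exceeded work_mem and spilled to local storage.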

You can use the following methods to check the size of temporary tables and temporary files:

1.    For temporary files, enable the log_temp_files parameter on the Aurora PostgreSQL instance. This parameter logs the use of temporary files that are larger than the specified number of kilobytes. After the parameter is enabled, a log entry is made for each temporary file when the file is deleted. A value of 0 logs all temporary file information, and a positive value logs only the files that are larger than or equal to the specified number of kilobytes. The default value is -1, which disables temporary file logging. You can use this parameter to identify temporary file details, and then relate the temporary file usage to the FreeLocalStorage metric.

Note: Enabling the log_temp_files parameter can cause excessive logging on the Aurora PostgreSQL instance. For this reason, it's a best practice to check the size of the Aurora PostgreSQL log files before enabling log_temp_files. If the log files are consuming the majority of the local storage, you can reclaim space by reducing the value of rds.log_retention_period. The default value for rds.log_retention_period is 4320 minutes (three days).
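
After log_temp_files is enabled, each entry identifies the file and its size. The following shows how to confirm the current setting and what a typical entry looks like (the file path and size shown are illustrative):

maxiops=> SHOW log_temp_files;
-- Example log entry written when a temporary file is deleted:
-- LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp12345.0", size 395116544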

You can also review temporary files by using the delta of subsequent runs of the following command:

maxiops=> SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_file_size FROM pg_stat_database ORDER BY temp_bytes DESC;

Note: The temp_files column counts all temporary files, regardless of why a file was created (for example, for sorting or hashing). The temp_files and temp_bytes columns in the pg_stat_database view hold cumulative values, which can be reset by using the pg_stat_reset() function or by restarting the DB instance. For more information, see the PostgreSQL Documentation for Additional Statistics Functions.
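
Because the counters are cumulative, one way to measure the delta is to snapshot the view and compare it with a later reading. A minimal sketch (the temp_stats_snapshot table name is illustrative):

maxiops=> CREATE TEMP TABLE temp_stats_snapshot AS
SELECT datname, temp_files, temp_bytes FROM pg_stat_database;
-- ...run or wait for the workload, then compare against the snapshot:
maxiops=> SELECT d.datname,
       d.temp_files - s.temp_files AS new_temp_files,
       pg_size_pretty(d.temp_bytes - s.temp_bytes) AS new_temp_bytes
FROM pg_stat_database d
JOIN temp_stats_snapshot s USING (datname)
ORDER BY d.temp_bytes - s.temp_bytes DESC;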

Note: If you use Aurora PostgreSQL or Amazon RDS for PostgreSQL 10 or 11, you can monitor temp_bytes and temp_files by using Performance Insights. Performance Insights provides native counters for your DB engine's internal metrics, in addition to wait events. For more information, see Native Counters for Amazon RDS PostgreSQL.

2.    For temporary tables, run a query similar to the following:

maxiops=> SELECT
n.nspname as SchemaName
,c.relname as RelationName
,CASE c.relkind
WHEN 'r' THEN 'table'
WHEN 'v' THEN 'view'
WHEN 'i' THEN 'index'
WHEN 'S' THEN 'sequence'
WHEN 's' THEN 'special'
END as RelationType
,pg_catalog.pg_get_userbyid(c.relowner) as RelationOwner
,pg_size_pretty(pg_relation_size(c.oid)) as RelationSize
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n
    ON n.oid = c.relnamespace
WHERE c.relkind IN ('r','s')
AND n.nspname LIKE 'pg_temp%'
ORDER BY pg_relation_size(c.oid) DESC;

You can also increase maintenance_work_mem and work_mem to allocate more memory to the processes that are performing the operation. The operation then uses more memory, which can reduce its use of temporary disk storage. For more information about these parameters, see the PostgreSQL Documentation for maintenance_work_mem and work_mem. It's a best practice to set the values for maintenance_work_mem and work_mem at the query or session level to avoid running out of memory. For more information, see Amazon Aurora PostgreSQL Reference.
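
For example, you can raise the parameters for only the session or transaction that runs the expensive operation, instead of changing them instance-wide. The values and the table and index names below are illustrative; size the settings to your instance class:

maxiops=> SET work_mem = '256MB';  -- applies for the rest of this session
maxiops=> BEGIN;
maxiops=> SET LOCAL maintenance_work_mem = '1GB';  -- applies only inside this transaction
maxiops=> CREATE INDEX idx_example ON example_table (example_column);
maxiops=> COMMIT;
maxiops=> RESET work_mem;  -- return to the parameter group default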

You can also increase the size of your Aurora instance so that the instance has more available local storage.

Local storage used by log files

Excessive logging can also cause your instance to run out of local storage. Excessive logging can be caused by one or more of the following logging parameters:

  • rds.log_retention_period
  • auto_explain.log_min_duration
  • log_connections
  • log_disconnections
  • log_lock_waits
  • log_min_duration_statement
  • log_statement
  • log_statement_stats

To identify which parameter is causing the excessive logging, analyze the PostgreSQL logs to find the largest ones, and then identify which parameter is responsible for the majority of the entries in those logs. You can then modify that parameter. If you repeatedly run a query that fails with an error, the errors are logged to the PostgreSQL error log by default. Review the logged errors, and then fix the failing query to prevent the logs from using excessive storage. You can also reduce the value of rds.log_retention_period (default: three days) to reclaim log space.
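
To review the current values of these logging parameters in one place, you can query pg_settings. A minimal sketch (parameters from extensions such as auto_explain appear only when the extension is loaded):

maxiops=> SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('rds.log_retention_period', 'auto_explain.log_min_duration',
    'log_connections', 'log_disconnections', 'log_lock_waits',
    'log_min_duration_statement', 'log_statement', 'log_statement_stats')
ORDER BY name;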

You can also increase the size of your Aurora instance so that the instance has more available local storage.

