如何记录 Amazon RDS PostgreSQL 或 Aurora PostgreSQL 查询的执行计划以调优查询性能？

4 分钟阅读

我想知道我的查询正在使用哪些执行计划，以便我可以相应地对其进行调整。如何为 Amazon Relational Database Service (Amazon RDS) PostgreSQL 数据库实例或 Amazon Aurora PostgreSQL 数据库集群记录这些查询的执行计划？

解决方法

您可以使用 auto_explain 模块来记录查询的执行计划。

1. 将 auto_explain 添加到 shared_preload_Library 参数中。

2. 重启数据库实例或集群。

3. 要启用 auto_explain，必须将 auto_explain.log_min_duration 参数更改为 -1 以外的值。此值是以毫秒为单位的最短语句执行时间，如果语句被传递并且语句运行，则会记录其执行计划。如果将参数设置为 0，则会记录所有计划。

**重要事项：**将 auto_explain.log_min_duration 参数设置为 0 会导致性能下降和大量存储空间消耗。这可能会导致实例出现问题。

4. 在数据库上运行查询。

5. 查看或下载 PostgreSQL 日志文件，包括执行计划。以下示例执行计划具有执行时间（4673.182 毫秒）和带查询文本的执行计划：

2020-01-02 03:08:44 UTC:27.0.3.156(21356):postgres@postgres:[2040]:LOG: duration: 4673.182 ms plan:
Query Text: SELECT COUNT(*) FROM hoge ORDER BY 1;
Sort (cost=215575.00..215575.00 rows=1 width=8)
Sort Key: (count(*))
-> Aggregate (cost=215574.98..215574.99 rows=1 width=8)
-> Seq Scan on hoge (cost=0.00..190336.18 rows=10095518 width=0)

使用 auto_explain 参数来调优查询性能

除了启用 pg_autoexplain 之外，auto_explain.log_min_duration 参数还允许您使用其他有用的参数。您可以使用以下 auto_explain 参数（您可以在不重新启动的情况下进行更改）来调优查询性能。有关更多信息，请参阅每个参数的 PostgreSQL 文档。

auto_explain.log_analyze

在记录执行计划时，auto_explain.log_analyze 参数会打印 EXPLAIN ANALYZE 输出（而不是 EXPLA IN 输出）。此参数可能会因开销而降低查询性能，并且默认值处于关闭状态。有关更多信息，请参阅 Postgres 文档中的 auto_explain.log_analyze。

请参阅以下示例：

2020-02-01 07:42:09 UTC:27.0.3.145(29943):master@postgres:[789]:LOG:  duration: 18.658 ms  plan:
 Query Text: SELECT COUNT(*) FROM t1 INNER JOIN t2 USING(id);
 Aggregate  (cost=577.45..577.46 rows=1 width=8) (actual time=18.641..18.641 rows=1 loops=1)
   ->  Hash Join  (cost=269.98..552.45 rows=9999 width=0) (actual time=8.108..16.576 rows=9999 loops=1)
         Hash Cond: (t1.id = t2.id)
         ->  Seq Scan on t1  (cost=0.00..144.99 rows=9999 width=4) (actual time=0.839..4.151 rows=9999 loops=1)
         ->  Hash  (cost=144.99..144.99 rows=9999 width=4) (actual time=7.186..7.186 rows=9999 loops=1)
               Buckets: 16384  Batches: 1  Memory Usage: 480kB
               ->  Seq Scan on t2  (cost=0.00..144.99 rows=9999 width=4) (actual time=0.667..4.289 rows=9999 loops=1)

auto_explain.log_buffers

使用 auto_explain.log_buffers 参数来确定在记录执行计划时是否打印缓冲区使用情况统计信息。使用此参数与将缓冲区选项与解释一起使用具有相同的效果。一定要使用 auto_explain.log_buffers 启用 auto_explain.log_analyze。默认情况下，此参数处于关闭状态，仅当您是超级用户时才能更改此参数。有关更多信息，请参阅 PostgreSQL 文档中的 auto_explain.log_buffers。

请参见以下示例输出：

2020-02-01 08:02:02 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 14.875 ms  plan:
 Query Text: SELECT COUNT(*) FROM t1 INNER JOIN t2 USING(id);
 Aggregate  (cost=577.45..577.46 rows=1 width=8) (actual time=14.861..14.861 rows=1 loops=1)
   Buffers: shared hit=93
   ->  Hash Join  (cost=269.98..552.45 rows=9999 width=0) (actual time=5.293..12.768 rows=9999 loops=1)
         Hash Cond: (t1.id = t2.id)
         Buffers: shared hit=93
         ->  Seq Scan on t1  (cost=0.00..144.99 rows=9999 width=4) (actual time=0.007..2.385 rows=9999 loops=1)
               Buffers: shared hit=45
         ->  Hash  (cost=144.99..144.99 rows=9999 width=4) (actual time=5.250..5.250 rows=9999 loops=1)
               Buckets: 16384  Batches: 1  Memory Usage: 480kB
               Buffers: shared hit=45
               ->  Seq Scan on t2  (cost=0.00..144.99 rows=9999 width=4) (actual time=0.009..2.348 rows=9999 loops=1)
                     Buffers: shared hit=45

auto_explain.log_nested_statements

使用 auto_explain.log_nested_statements 参数可考虑嵌套语句进行日志记录。默认情况下，将记录顶级查询计划，除非您启用此参数。默认值处于关闭状态。有关更多信息，请参阅 PostgreSQL 文档中的 auto_explain.nested_statements。

请参阅以下示例：

CREATE OR REPLACE FUNCTION count_tables() RETURNS integer AS $$
        DECLARE
         i integer;
         j integer;
        BEGIN
                SELECT COUNT(*) INTO i FROM t1;
                SELECT COUNT(*) INTO j FROM t2;
                RETURN i + j;                
        END;
$$ LANGUAGE plpgsql;

请参见以下示例输出：

2020-02-01 08:15:25 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 0.651 ms  plan:
 Query Text: SELECT COUNT(*)        FROM t1
 Aggregate  (cost=169.99..170.00 rows=1 width=8)
   ->  Seq Scan on t1  (cost=0.00..144.99 rows=9999 width=0)
2020-02-01 08:15:25 UTC:27.0.3.145(27477):master@postgres:[15514]:CONTEXT:  SQL statement "SELECT COUNT(*)        FROM t1"
 PL/pgSQL function count_tables() line 6 at SQL statement
2020-02-01 08:15:25 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 0.637 ms  plan:
 Query Text: SELECT COUNT(*)        FROM t2
 Aggregate  (cost=169.99..170.00 rows=1 width=8)
   ->  Seq Scan on t2  (cost=0.00..144.99 rows=9999 width=0)
2020-02-01 08:15:25 UTC:27.0.3.145(27477):master@postgres:[15514]:CONTEXT:  SQL statement "SELECT COUNT(*)        FROM t2"
 PL/pgSQL function count_tables() line 7 at SQL statement
2020-02-01 08:15:25 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 1.435 ms  plan:
 Query Text: SELECT count_tables();
 Result  (cost=0.00..0.26 rows=1 width=4)

auto_explain.log_timing

使用 auto_explain.log_timing 参数来控制在记录执行计划时是否打印每个节点的时间安排信息。使用此参数的工作方式类似于将时间安排与解释一起使用。禁用 auto_explain.log_timing 可以减轻重复读取系统时钟的开销。默认值处于打开状态。有关更多信息，请参阅 PostgreSQL 文档中的 auto_explain.log_timing。

auto_explain.log_triggers

在记录执行计划时，使用 auto_explain.log_triggers 参数包括触发器执行统计信息。使用 auto_explain.log_triggers 时，还必须启用 auto_explain.log_analyze 默认值为关闭状态。

有关更多信息，请参阅 PostgreSQL 文档中的 auto_explain.log_triggers。

请参阅以下示例：

CREATE FUNCTION emp_log_trigger_func() RETURNS trigger AS $emp_stamp$
    DECLARE
        count integer;
    BEGIN
        SELECT COUNT(*) INTO count FROM emp WHERE empname = NEW.empname;
        INSERT INTO emp_log(created_at, operation, user_name) VALUES(current_timestamp, TG_OP, user);
        RETURN NEW;
    END;
$emp_stamp$ LANGUAGE plpgsql;
CREATE TRIGGER emp_stamp BEFORE INSERT OR UPDATE ON emp
    FOR EACH ROW EXECUTE PROCEDURE emp_log_trigger_func();

请参见以下示例输出：

2020-02-01 08:57:44 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 1.463 ms  plan:
 Query Text: INSERT INTO emp_log(created_at, operation, user_name) VALUES(current_timestamp, TG_OP, user)
 Insert on emp_log  (cost=0.00..0.03 rows=1 width=168) (actual time=1.461..1.461 rows=0 loops=1)
   ->  Result  (cost=0.00..0.03 rows=1 width=168) (actual time=0.009..0.010 rows=1 loops=1)
2020-02-01 08:57:44 UTC:27.0.3.145(27477):master@postgres:[15514]:CONTEXT:  SQL statement "INSERT INTO emp_log(created_at, operation, user_name) VALUES(current_timestamp, TG_OP, user)"
 PL/pgSQL function emp_log_trigger_func() line 3 at SQL statement
2020-02-01 08:57:44 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 1.602 ms  plan:
 Query Text: INSERT INTO emp VALUES('name', 100, current_timestamp, 'hoge');
 Insert on emp  (cost=0.00..0.01 rows=1 width=76) (actual time=1.600..1.600 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=76) (actual time=0.006..0.006 rows=1 loops=1)
 Trigger emp_stamp: time=1.584 calls=1

auto_explain.log_verbose

使用 auto_explain.log_verbose 参数来确定在记录执行计划时是否打印详细信息。使用此参数的工作方式类似于将详细信息选项与解释一起使用。默认值处于关闭状态。有关更多信息，请参阅 PostgreSQL 文档中的 auto_explain.log_verbose。

请参阅以下示例：

2020-02-01 09:03:20 UTC:27.0.3.145(27477):master@postgres:[15514]:LOG:  duration: 3.492 ms  plan:
 Query Text: SELECT COUNT(*) FROM t1 INNER JOIN t2 USING(id);
 Aggregate  (cost=577.45..577.46 rows=1 width=8)
   Output: count(*)
   ->  Hash Join  (cost=269.98..552.45 rows=9999 width=0)
         Hash Cond: (t1.id = t2.id)
         ->  Seq Scan on public.t1  (cost=0.00..144.99 rows=9999 width=4)
               Output: t1.id
         ->  Hash  (cost=144.99..144.99 rows=9999 width=4)
               Output: t2.id
               Buckets: 16384  Batches: 1  Memory Usage: 480kB
               ->  Seq Scan on public.t2  (cost=0.00..144.99 rows=9999 width=4)
                     Output: t2.id

或者，如果您想查看更少调优查询的解释计划，则可以在这些查询上运行 EXPLAIN ANALYZE。这免去了日志记录的开销，并逐步显示查询的计划以及每个步骤的执行时间。如果手动调用 EXPLAIN ANALYZE 语句，则可以识别事件，例如生成临时文件的步骤、哪个步骤正在执行 Seq 扫描等。

postgres=> EXPLAIN ANALYZE SELECT * FROM hoge ORDER BY 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Sort (cost=7343904.60..7444904.60 rows=40400000 width=4) (actual time=37795.033..45133.431 rows=40400000 loops=1)
Sort Key: id
Sort Method: external merge Disk: 553512kB
-> Seq Scan on hoge (cost=0.00..582762.00 rows=40400000 width=4) (actual time=0.579..8184.498 rows=40400000 loops=1)
Planning Time: 0.549 ms
Execution Time: 47161.516 ms
(6 rows)

如何记录 Amazon RDS PostgreSQL 或 Aurora PostgreSQL 查询的执行计划以调优查询性能？

解决方法

使用 auto_explain 参数来调优查询性能

相关信息

相关视频

相关内容