Why isn't the Amazon Redshift VACUUM operation reclaiming disk space?

Last updated: 2019-12-09

I'm running a VACUUM FULL or VACUUM DELETE ONLY operation on an Amazon Redshift table that contains rows marked for deletion. The operation appears to complete successfully. Why is no disk space reclaimed?

Short Description

This issue commonly occurs when you have long-running active transactions. VACUUM runs only on rows with a transaction ID (xid) that's less than the oldest active xid on the cluster. If the transaction that marked rows for deletion started after a long-running transaction—and the long-running transaction is still active when the VACUUM operation runs—the rows that were marked for deletion aren’t reclaimed.

Resolution

1.    To check for long-running active transactions on the cluster, run a query similar to the following:

rsdb=# select *,datediff(s,txn_start,getdate())/86400||' days '||datediff(s,txn_start,getdate())%86400/3600||' hrs '||datediff(s,txn_start,getdate())%3600/60||' mins '||datediff(s,txn_start,getdate())%60||' secs' duration from svv_transactions where lockable_object_type='transactionid' and  pid<>pg_backend_pid() order by 3;

The output shows that xid 50341 started at 2019-08-19 20:20:33 and has been active for 19 minutes and 37 seconds:

txn_owner | txn_db |  xid  |  pid  |         txn_start          |   lock_mode   | lockable_object_type | relation | granted |           duration
-----------+--------+-------+-------+----------------------------+---------------+----------------------+----------+---------+------------------------------
 master    | rsdb   | 50341 | 21612 | 2019-08-19 20:20:33.147622 | ExclusiveLock | transactionid        |          | t       | 0 days 0 hrs 19 mins 37 secs
(1 row)

2.    To see when rows were deleted from a table, run a query similar to the following:

select a.query, a.xid, trim(c.name) tablename, b.deleted_rows, a.starttime, a.endtime
from stl_query a 
join (select query, tbl, sum(rows) deleted_rows from stl_delete group by 1,2) b 
on a.query = b.query
join (select id, name from stv_tbl_perm group by 1,2) c 
on c.id = b.tbl 
where a.xid in (select distinct xid from stl_commit_stats)
and trim(c.name) = 'tablename'
order by a.starttime;

The output shows that xid 50350 (the transaction in which rows were marked for deletion) started after the long-running active transaction (xid 50341):

query |  xid  | tablename | deleted_rows |         starttime          |          endtime
-------+-------+-----------+--------------+----------------------------+----------------------------
 18026 | 50350 | test      |            5 | 2019-08-19 20:20:48.137594 | 2019-08-19 20:20:50.125609
(1 rows)

To allow VACUUM DELETE to reclaim these deleted rows, choose one of the following options, and then rerun the VACUUM operation:

  • Wait for the long-running transaction to complete.
  • Use the PG_TERMINATE_BACKEND statement to terminate the session that's holding the long-running transaction.

Investigating long-running transactions

Query the SVL_STATEMENTTEXT view to check activity in a long-running active transaction:

rsdb=# select pid, xid, trim(label), starttime, endtime, trim(text) from svl_statementtext where xid = 50341 order by starttime , sequence;

Example output:

  pid  |  xid  |  btrim  |         starttime          |          endtime           |          btrim
-------+-------+---------+----------------------------+----------------------------+--------------------------
 21612 | 50341 | default | 2019-08-19 20:20:31.733843 | 2019-08-19 20:20:31.733844 | begin;
 21612 | 50341 | default | 2019-08-19 20:20:33.146937 | 2019-08-19 20:20:35.020556 | select * from sometable;
(2 rows)

Query the STV_INFLIGHT view to see if queries are running in the transaction:

rsdb=# select query, xid, pid, starttime, trim(text) from stv_inflight where xid = 50341;

Example output:

 query | xid | pid | starttime | btrim
-------+-----+-----+-----------+-------
(0 rows)

Common issues that cause long-running transactions

The following behavior can result in transactions staying open for long periods of time:

  • A user starts an implicit transaction from a client with auto-commit disabled. The transaction remains active until the user explicitly closes the transaction (COMMIT or ROLLBACK), or until the session is terminated.
  • A user starts a transaction explicitly (BEGIN), but never closes the transaction with a COMMIT or ROLLBACK.

Did this article help you?

Anything we could improve?


Need more help?