Why isn't the Amazon Redshift VACUUM operation reclaiming disk space?

Last updated: 2020-05-18

I'm running a VACUUM FULL or VACUUM DELETE ONLY operation on an Amazon Redshift table that contains rows marked for deletion. The operation appears to complete successfully. Why isn't there any reclaimed disk space?

Short Description

Disk space might not get reclaimed if there are long-running transactions that remain active. When rows are deleted, a hidden metadata identity column, DELETE_XID, is marked with the transaction ID that deleted the row. If there is an active long-running transaction that began before the deletion, VACUUM can't clean up the rows. Therefore, disk space can't be reclaimed. For more information about the DELETE_XID column, see Optimizing storage for narrow tables.

Resolution

1.    To check for long-running transactions on the cluster, run the following query:

rsdb=# select *,datediff(s,txn_start,getdate())/86400||' days '||datediff(s,txn_start,getdate())%86400/3600||' hrs '||datediff(s,txn_start,getdate())%3600/60||' mins '||datediff(s,txn_start,getdate())%60||' secs' duration from svv_transactions where lockable_object_type='transactionid' and  pid<>pg_backend_pid() order by 3;

The following output shows that xid 50341 has been active for 19 minutes and 37 seconds:

txn_owner  | txn_db |  xid  |  pid  |         txn_start          |   lock_mode   | lockable_object_type | relation | granted |           duration
-----------+--------+-------+-------+----------------------------+---------------+----------------------+----------+---------+------------------------------
 master    | rsdb   | 50341 | 21612 | 2019-08-19 20:20:33.147622 | ExclusiveLock | transactionid        |          | t       | 0 days 0 hrs 19 mins 37 secs
(1 row)

2.    Run the following query to confirm whether rows were deleted from the Amazon Redshift table:

select a.query, a.xid, trim(c.name) tablename, b.deleted_rows, a.starttime, a.endtime
from stl_query a 
join (select query, tbl, sum(rows) deleted_rows from stl_delete group by 1,2) b 
on a.query = b.query
join (select id, name from stv_tbl_perm group by 1,2) c 
on c.id = b.tbl 
where a.xid in (select distinct xid from stl_commit_stats)
and trim(c.name) = 'tablename'
order by a.starttime;

The following output shows that the transaction marked for row deletion (xid 50350) started after the long-running transaction (xid 50341):

query  |  xid  | tablename | deleted_rows |         starttime          |          endtime
-------+-------+-----------+--------------+----------------------------+----------------------------
 18026 | 50350 | test      |            5 | 2019-08-19 20:20:48.137594 | 2019-08-19 20:20:50.125609
(1 rows)

To allow VACUUM DELETE to reclaim these deleted rows, choose one of the following options, and then rerun the VACUUM operation:

  • Wait for the long-running transaction to complete.
  • Use the PG_TERMINATE_BACKEND statement to terminate the session that's holding the long-running transaction.

Investigate the long-running transactions

Query the SVL_STATEMENTTEXT view to check the activity in a long-running transaction:

rsdb=# select pid, xid, trim(label), starttime, endtime, trim(text) from svl_statementtext where xid = 50341 order by starttime , sequence;

Here's an example output:

  pid  |  xid  |  btrim  |         starttime          |          endtime           |          btrim
-------+-------+---------+----------------------------+----------------------------+--------------------------
 21612 | 50341 | default | 2019-08-19 20:20:31.733843 | 2019-08-19 20:20:31.733844 | begin;
 21612 | 50341 | default | 2019-08-19 20:20:33.146937 | 2019-08-19 20:20:35.020556 | select * from sometable;
(2 rows)

Query the STV_INFLIGHT view to check whether queries are running in the transaction:

rsdb=# select query, xid, pid, starttime, trim(text) from stv_inflight where xid = 50341;

Here's an example output:

 query | xid | pid | starttime | btrim
-------+-----+-----+-----------+-------
(0 rows)

Common issues that cause long-running transactions

The following behavior can result in long-running transactions:

  • A user starts an implicit transaction from a client where automatic commit is disabled. The transaction remains active until the user explicitly closes the transaction with the COMMIT or ROLLBACK command, or until the session is terminated.
  • A user starts a transaction explicitly using BEGIN, but never closes the transaction with the COMMIT or ROLLBACK command.

Did this article help you?

Anything we could improve?


Need more help?