How do I detect and release locks in Amazon Redshift?
Last updated: 2020-10-06
There are table locks blocking my queries in Amazon Redshift. How do I find them and resolve this?
Locking is a protection mechanism that controls how many sessions can access a table at the same time. Locking also determines which operations can be performed in those sessions. Most relational databases use row-level locks. However, Amazon Redshift uses table-level locks. You might experience locking conflicts if you perform frequent DDL statements on user tables or DML queries.
Amazon Redshift has three lock modes:
- AccessExclusiveLock: Acquired primarily during DDL operations, such as ALTER TABLE, DROP, or TRUNCATE. AccessExclusiveLock blocks all other locking attempts.
- AccessShareLock: Acquired during UNLOAD, SELECT, UPDATE, or DELETE operations. AccessShareLock blocks only AccessExclusiveLock attempts. AccessShareLock doesn't block other sessions that are trying to read or write on the table.
- ShareRowExclusiveLock: Acquired during COPY, INSERT, UPDATE, or DELETE operations. ShareRowExclusiveLock blocks AccessExclusiveLock and other ShareRowExclusiveLock attempts, but doesn't block AccessShareLock attempts.
When a query or transaction acquires a lock on a table, the lock remains for the duration of the query or transaction. Other queries or transactions that are waiting to acquire the same lock are blocked. For more information about why a query might hang, see Query hangs.
To solve a locking problem, identify the session (PID) that is holding the lock and then terminate the session. If the session doesn't terminate, reboot your cluster.
Run a query to identify sessions that are holding locks:
select a.txn_owner, a.txn_db, a.xid, a.pid, a.txn_start, a.lock_mode, a.relation as table_id,nvl(trim(c."name"),d.relname) as tablename, a.granted,b.pid as blocking_pid ,datediff(s,a.txn_start,getdate())/86400||' days '||datediff(s,a.txn_start,getdate())%86400/3600||' hrs '||datediff(s,a.txn_start,getdate())%3600/60||' mins '||datediff(s,a.txn_start,getdate())%60||' secs' as txn_duration from svv_transactions a left join (select pid,relation,granted from pg_locks group by 1,2,3) b on a.relation=b.relation and a.granted='f' and b.granted='t' left join (select * from stv_tbl_perm where slice=0) c on a.relation=c.id left join pg_class d on a.relation=d.oid where a.relation is not null;
The output looks like this:
txn_owner | txn_db | xid | pid | txn_start | lock_mode | table_id | tablename | granted | blocking_pid | txn_duration | ----------+--------+---------+-------+----------------------------+---------------------+----------+-----------+---------+--------------+-----------------------------+ usr1 | db1 | 5559898 | 19813 | 2018-06-30 10:51:57.485722 | AccessExclusiveLock | 351959 | lineorder | t | | 0 days 0 hrs 0 mins 52 secs | usr1 | db1 | 5559927 | 20450 | 2018-06-30 10:52:19.761199 | AccessShareLock | 351959 | lineorder | f | 19813 | 0 days 0 hrs 0 mins 30 secs | usr1 | db1 | 5559898 | 19813 | 2018-06-30 10:51:57.485722 | AccessShareLock | 351959 | lineorder | t | | 0 days 0 hrs 0 mins 52 secs |
If the result in the granted column is f (false), it means that a transaction in another session is holding the lock. The blocking_pid column shows the process ID of the session that's holding the lock. In this example, PID 19813 is holding the lock.
To release a lock, wait for the transaction that's holding the lock to finish. You can also manually terminate the session by running the following command:
Terminating a PID rolls back all running transactions and releases all locks in the session. The locks are then claimed by the other transactions that are waiting to acquire the lock.
Note: When PG_TERMINATE_BACKEND(PID) returns "1", it typically indicates a successful termination request to the PID. However, this doesn't guarantee that the PID actually got terminated. In some cases, the PID cannot be determined depending on its internal status. Therefore, it's a best practice to also check STV_SESSIONS (and other relevant system tables) to confirm whether the PID was actually terminated.
If PG_TERMINATE_BACKEND(PID) did not successfully terminate, then reboot the cluster to terminate the process.