Why is my RDS for SQL Server instance stuck in the rebooting state?

4 minute read

Why is it taking a long time to reboot my Amazon Relational Database Service (Amazon RDS) for Microsoft SQL Server instance?

Short description

Rebooting your DB instance might take more time than expected if there are a large number of ongoing transactions on the instance. Or, rebooting might appear to be stuck if there are a high number of virtual log files (VLFs) on the instance that are slowing down the reboot.

To resolve these issues, do one or both of the following:

Check for ongoing queries on the DB instance.
Reduce the number of VLFs on the DB instance.

Resolution

Check for ongoing queries on the DB instance

Rebooting stops all ongoing transactions and SQL Server runs recovery while starting the instance. SQL Server performs roll forward and rollback of the transactions to take the database to a consistent state.

You can see the time taken by this recovery process in the SQL Server error log. The log entry includes the time consumed in each recovery phase. In the following log entry example, X is the amount of time taken by SQL Server in each phase and in complete recovery. If you have large ongoing transactions, the reboot might take a long time.

Recovery completed for database <<DB_NAME>> (database ID <<id of database>>) in X second(s) (analysis X ms, redo X ms, undo X ms [system undo X ms, regular undo X ms].) This is an informational message only. No user action is required.

To resolve this issue, reduce the number of ongoing queries. Use the following command to check if there are any active data modification transactions on the database.

SELECT r.session_id,
       r.start_time,
       r.status,
       r.cpu_time,
       r.total_elapsed_time,
       st.TEXT AS batch_text
FROM sys.dm_exec_requests AS r
     CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
     CROSS APPLY sys.dm_exec_query_plan(r.plan_handle) AS qp
     where session_id <> @@SPID
ORDER BY cpu_time DESC;

The command output provides information about the sessions running on your database along with start time and SQL text. If there are queries that are still running, then allow the queries to complete before performing a reboot.

Reduce the number of VLFs on the DB instance

Rebooting might take a long time due to a high number of VLFs on the instance. An excessive number of VLFs might accumulate from an initially small transaction log that was grown (manually or automatically) in very small increments.

Before starting the recovery phases during reboot, SQL Server scans all of the VLFs serially. This phase is called the discovery phase. If a large number of VLFs are found, the following notification appears in the error log:

Database %ls has more than %d virtual log files which is excessive. Too many virtual log files can cause long startup and backup times. Consider shrinking the log and using a different growth increment to reduce the number of virtual log files.

To reduce the number of VLFs, do the following:

Note: It's a best practice to perform these steps in slow or non-business hours.

Log in to RDS for SQL Server instance from Microsoft SQL Server Management Studio as the master user.
Run DBCC SQLPERF(LOGSPACE) query to check log file utilization.
Shrink the log file of the database that received the high VLFs notification in the error log.
Expand the transaction log file to an appropriate usage size all at once. This avoids automatic growth that causes a high number of VLFs.

Note: Shrinking the log file reduces the number of VLFs. Performing a one-time expansion creates a limited amount of VLFs required per the VLF creation criteria. For more information, see Virtual Log Files (VLFs) in the Microsoft SQL docs.

Related information

How can I troubleshoot storage consumption in my Amazon RDS DB instance that is running SQL Server?

Topics

Database Migration & Modernization Analytics

Relevant content

RDS SQL server Instance stuck in Modifying state
Suresh
asked 3 months ago
RDS instance stuck in rebooting
Michael
asked 6 months ago
Does RDS Custom for SQL Server support named instances ? (More than one SQL Server instance)
Accepted Answer
light
asked 10 months ago
RDS Custom for SQL Server the status is blocked on DELETING for more than 24 hours
Accepted Answer
Ken
asked 7 months ago
My EC2 instance stuck on stopping state more than 10 minutes
AWS-User-2577527
asked 2 years ago
Why is my Amazon RDS for MySQL DB instance stuck in "Rebooting"?
AWS OFFICIALUpdated 2 years ago
Why is my Amazon RDS for MySQL DB instance using more storage than expected?
AWS OFFICIALUpdated a year ago
Why is my Amazon RDS DB instance in the storage-optimization state for a long time?
AWS OFFICIALUpdated a year ago
Why is my Amazon RDS for Oracle DB instance using more storage than expected?
AWS OFFICIALUpdated a year ago
[UPDATED MESSAGING] Amazon RDS for SQL Server is cancelling end of support campaign for some older DB minor versions of SQL Server 2017, 2019.
EXPERT
appadman
published 6 months ago