How do I troubleshoot low freeable memory in my RDS for SQL Server instance?
Last updated: 2022-10-03
How can I troubleshoot low freeable memory in my Amazon Relational Database (Amazon RDS) for SQL Server instance?
Very low freeable memory causes unexpected downtime on your instance. So, it's important to monitor memory usage on the instance and take remedial actions.
Freeable memory is the amount of memory (RAM) available on the RDS instance. The total amount of memory available on an Amazon RDS instance depends on the instance class. For example, the instance class db.r5.8xlarge provides 32 vCPU and 256 GiB memory. If you provision an RDS instance with the db.r5.8xlarge instance class, the total memory on the instance class (256 GiB) is shared by the following:
- The operating system
- Amazon RDS processes
- The database engine
- Worker threads
- Business Intelligence suite (SSIS, SSAS, SSRS) applications, and so on.
For more information on how SQL Server uses memory, See the Memory management architecture guide on the Microsoft documentation website.
Monitor memory usage on your instance
Amazon CloudWatch metrics
Monitor the Amazon CloudWatch metrics for FreeableMemory to identify low memory occurrences. Along with FreeableMemory, you can monitor the following to identify an increase in workload when available memory is low:
Enhanced Monitoring can be turned on with different granularities, such as 1, 5, 10, 15, 30 or 60 seconds to monitor OS metrics for Microsoft SQL Server. It's a best practice to set the granularity as 1 or 5 seconds (the default is 60 seconds). You can create CloudWatch alarms to monitor your Amazon RDS for SQL Server DB instance’s memory consumption using Enhanced Monitoring
Troubleshoot low freeable memory
To troubleshoot low freeable memory issues, do the following:
Limit the memory that the RDS instance uses
Limit the memory that the RDS instance uses by setting max server memory to a value that doesn’t cause system-wide memory pressure. You can determine the max server memory value for your instance using the following formula:
max_server_memory = total_RAM – (1 GB for the OS + memory_basis_amount_of_RAM_on_the_server)
Total_RAM equals the instance class memory where the memory_basis_amount_of_RAM_on_the_server is determined as follows:
- If RAM on the server is between 4 GB and 16 GB: Leave 1 GB per 4 GB of RAM. For example, for a server with 16 GB, leave 4 GB.
- If RAM on the server is over 16 GB: Leave 1 GB per 4 GB of RAM up to 16 GB, and 1 GB per 8 GB of RAM above 16 GB.
For example, if a server has 64 GB of RAM, the calculation is as follows:
- 1 GB for the OS
- Up to 16 GB RAM: 16/4 = 4 GB
- Remaining RAM above 16 GB: (64-16)/8 = 6
- Total RAM to leave: 1 + 4 + 6 = 11 GB
- max_server_memory: 64 – 11 = 53 GB
- If you're using SSIS, SSAS, or SSRS on the instance, then max_server_memory must be adjusted to accommodate these components.
Example: You want to use SSRS with the RDS instance. Set the SSRS Max memory value to 10% (percentage of the total memory of the DB instance). This is about 6.4 GiB on an instance with 64 GiB memory. The max_server_memory value should be about 46 GiB (64-11-6.4).
- After initial configuration of max_server_memory, FreeableMemory must be constantly monitored to decide whether to increase or decrease the allocated memory.
To change the max_server_memory, configure the value using a custom parameter group. The value for max_server_memory needs to be provided in MB.
Note: The parameter max_server_memory is a dynamic parameter. So, a reboot isn't required for the changes to take effect.
Check database connections
Each database connection made to the instance requires some memory allocation outside the buffer pool for worker threads. So, a spike in DatabaseConnections can cause a drop in freeable memory.
Verify if SSIS, SSAS, or SSRS components are used on the DB instance
Using Amazon RDS for SQL Server option group, identify if SSIS, SSAS, or SSRS components are used on the DB instance. The memory used by these components exists outside of the max_server_memory setting. If you don't use these features, remove them by modifying the option group. Removing these features reduces the memory footprint on the instance.
Monitor the DB instance using Performance Insights
Using Performance Insights, you can monitor the DB instance for database performance analysis. You can use the Performance Insights dashboard to monitor database load, waits, queries, hosts, users, and so on. Monitoring these helps you identify potential bottlenecks that slow the instance down.
Perform regular maintenance on the DB instance
Perform regular index maintenance and keep statistics updated. Highly fragmented indexes can cause increase in I/O activity, which leads to more memory consumption. Similarly, outdated statistics can cause inaccurate cardinality estimation leading to selection of suboptimal query plan.
Note: It's a best practice to perform index and statistics maintenance during a non-peak time or during a maintenance window.
Monitor the Page Life expectancy and the Buffer Cache Hit Ratio
Page Life expectancy (PLE) indicates the number of seconds a page stays in the buffer pool without references.
Buffer Cache Hit Ratio (BCHR) is the percent of page requests satisfied by data pages from the buffer pool.
Monitoring PLE and BCHR identifies memory pressure. To monitor these metrics using Performance Insights, do the following:
- Open the Amazon RDS console.
- Select Performance Insights.
- Select the RDS for SQL Server instance that you want to monitor.
- Set the time range that you want to review the metrics for, and then select Manage Metrics.
- Select Database Metrics, Page Life Expectancy, Buffer Cache Hit Ratio.
For optimum performance, the values for these metrics should be as high as possible. You can monitor these metrics using Performance Insights. You might see that the values for these metrics are consistently low over a period of time. If this occurs, tune the queries accessing data or increase the instance class to provide more memory.
When there is memory pressure on the instance, along with low PLE and BCHR, PAGEIOLATCH wait increases. This means SQL Server is waiting for a page to be fetched from the disk and loaded into memory. Also, RESOURCE_SEMAPHORE wait might get noticed when memory requests by a query can't be honored due to memory crunch. This causes an increase in CPU usage because data pages aren't cached long enough in memory. When this occurs, SQL Server has to repeatedly access disk to access data causing performance issues.
Identify the queries using the most resources
Using Performance Insights, capture the queries utilizing the most resources and tune them for better performance.
Select the correct instance size for your workload
The amount of memory on an instance depends on the instance type. It's important to select an instance class with adequate resources so that the DB instance has enough resources for the workload. An instance with less resources encounters performance issues, and an oversized instance wastes resources.
Baseline your resource usage
Baseline resource usage on the instance by monitoring metrics such as FreeableMemory, Page Life Expectancy, Buffer Cache Hit Ratio, and so on. If there is a significant increase in data volume on the instance, increase the max_server_memory value. Make sure to increase the max_server_memory value in proportion to the data volume change to maintain the same level of performance on the instance.
Notes: You can use SQL Server native tools like Reports and DMVs to identify components using memory inside SQL Server. Using SQL Server Management Studio (SSMS), you can review SQL Server memory usage:
- Open SQL Server Management Studio (SSMS) and connect to your RDS for SQL Server instance.
- In Object Explorer, right-click the RDS instance endpoint name.
- Select Reports,Standard Reports, Memory Consumption.
Also, you can query sys.dm_os_memory_clerks DMV to identify the components that use the maximum memory inside SQL Server.