I am experiencing high CPU utilization on my Amazon Relational Database Service (Amazon RDS) for MySQL, MariaDB, or Amazon Aurora for MySQL instances. How can I troubleshoot and resolve high CPU utilization?
Several factors can cause an increase in CPU utilization. For example, user-initiated heavy workloads, analytic queries, prolonged deadlocks and lock waits, multiple concurrent transactions, long-running transactions, or other processes that utilize CPU resources.
First, you can identify the source of the CPU usage by:
- Using Enhanced Monitoring
- Using Performance Insights
- Using Queries to detect the cause of CPU utilization in the workload
- Analyzing logs and enabling monitoring
After you identify the source, you can analyze and optimize your workload to reduce CPU usage.
Using Enhanced Monitoring
Enhanced Monitoring provides granular real-time metrics that you can review in addition to Amazon CloudWatch metrics, which provide statistics each minute. For more information, see the Differences Between CloudWatch and Enhanced Monitoring Metrics.
In the operating system (OS) process list section of Enhanced Monitoring, review the OS processes and RDS processes to confirm the percentage of CPU utilization of a mysqld or Aurora process. These metrics can help you confirm if the increase in CPU utilization is caused by OS or RDS processes. You can also use these metrics to confirm if the increase is caused by the mysqld or Aurora, which can indicate that a user-initiated workload is the source of the CPU usage. For more information, see Viewing Enhanced Monitoring. You can also see the bifurcation of CPU utilization by reviewing the metrics for cpuUtilization. For more information, see the Available OS Metrics.
You can also check the number of tasks that aren't in use (sleeping tasks). These tasks can lead to increased memory resource (RAM, cache, and processor) consumption, which can slow down the server. It's a best practice to tune your application to gracefully close the connections that aren't in use. You can also modify the values for the wait_timeout and interactive_timeout parameters to close the connection based on the value you set. For more information, see the MySQL Documentation for wait_timeout and interactive_timeout.
Using Performance Insights
You can use Performance Insights to identify the exact queries that are running on the instance and causing high CPU usage. First, enable Performance Insights for MySQL. Then, you can use Performance Insights to optimize your workload after consulting with your DBA.
To see which database engines are available to use with Performance Insights, see Using Amazon RDS Performance Insights.
Using queries to detect the cause of CPU utilization in the workload
Before you can optimize your workload, you must identify the problematic query. You can run the following queries while the high CPU issue is occurring to identify the root cause of the CPU utilization. Then, you can optimize your workload to reduce your CPU usage.
The SHOW PROCESSLIST command shows you which threads are running currently on your MySQL instance. Sometimes, the same set of statements is running but not finishing. When this happens, the subsequent statements must wait for the first set of statements to finish, because InnoDB row-level locking might be updating the same rows. For more information, see the MySQL Documentation for SHOW PROCESSLIST Syntax.
SHOW FULL PROCESSLIST;
Note: Run the SHOW PROCESSLIST query as the master user. If you are not the master user, the MySQL user running the command must have MySQL PROCESS server administration privileges to see all the threads running on a MySQL instance. Without admin privileges, SHOW PROCESSLIST shows only the threads associated with the MySQL account that you're using.
The INNODB_TRX table provides information about all currently executing InnoDB transactions that are not read-only transactions.
SELECT * FROM INFORMATION_SCHEMA.INNODB_TRX;
The INNODB_LOCKS table provides information about locks that an InnoDB transaction has requested but hasn't received.
SELECT * FROM INFORMATION_SCHEMA.INNODB_LOCKS;
The INNODB_LOCK_WAITS table provides one or more rows for each blocked InnoDB transaction.
SELECT * FROM INFORMATION_SCHEMA.INNODB_LOCK_WAITS;
You can run a query similar to the following to help you see which transactions are waiting and which transactions are blocking the waiting transactions. For more information, see the MySQL Documentation for Using InnoDB Transaction and Locking Information.
SELECT r.trx_id waiting_trx_id, r.trx_mysql_thread_id waiting_thread, r.trx_query waiting_query, b.trx_id blocking_trx_id, b.trx_mysql_thread_id blocking_thread, b.trx_query blocking_query FROM information_schema.innodb_lock_waits w INNER JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id INNER JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id;
The SHOW ENGINE INNODB STATUS query provides information from the standard InnoDB monitor about the state of the InnoDB storage engine. For more information, see the MySQL Documentation for SHOW ENGINE Syntax.
SHOW ENGINE INNODB STATUS;
The SHOW GLOBAL SESSION STATUS provides information about the server status. For more information, see the MySQL Documentation for SHOW STATUS Syntax.
SHOW GLOBAL STATUS;
Note: These queries have been tested on Aurora 2.02.5 (MySQL 5.7); Aurora 1.17.8 (MySQL 5.6); MySQL 5.6.x and 5.7.x; and MariaDB 10.0.x, 10.1.x, 10.2.x.
Analyzing logs and enabling monitoring
You can analyze the MySQL General Query Log to view what the mysqld is doing at a specific time. You can also view the queries that are running on your instance at a specific time, including information about when clients connect or disconnect. For more information, see the MySQL Documentation for The General Query Log.
Note: When you enable the General Query Log for long periods, the logs consume storage and can add to performance overhead.
Analyze the MySQL Slow Query Logs to find queries that take longer to execute than the seconds set for long_query_time. You can also review your workload and analyze your queries to improve performance and memory consumption. For more information, see the MySQL Documentation for The Slow Query Log.
Tip: When you use Slow Query Log or General Query Log, set the parameter log_output to FILE.
You can also use the MariaDB Audit Plugin to audit database activity such as users who are logging on to the database or queries that are run against the database. For more information, see MariaDB Audit Plugin Support.
If you use Aurora for MySQL, you can also use Advanced Auditing. Auditing can give you more control over the types of queries you want to log, and it reduces the overhead for logging.
You can use the innodb_print_all_deadlocks parameter to check for deadlocks and resource locking. You can use this parameter to record information about deadlocks in InnoDB user transactions in the MySQL error log. For more information, see the MySQL Documentation for innodb_print_all_deadlocks.
Analyzing and optimizing the high CPU workload
After you identify the query that is increasing CPU usage, you can optimize your workload to reduce the CPU consumption.
If you see a query that's not required for your workload, you can terminate the connection (kill the query) using the following command:
You can find the processID of the query by running the SHOW FULL PROCESSLIST command.
If you don't want to kill the query, you can optimize the query using EXPLAIN. This shows the individual steps involved in query execution. For more information, see the MySQL Documentation for Optimizing Queries with EXPLAIN.
Enable the PROFILING to review profile details that can indicate resource usage for statements that are executed during the current session. For more information, see the MySQL Documentation for PROFILING Syntax.
Use ANALYZE TABLE to refresh the index statistics for the tables. This can help the optimizer choose an appropriate execution plan. For more information, see the MySQL Documentation for ANALYZE TABLE Syntax.