AWS Database Blog

Best practices for configuring parameters for Amazon RDS for MySQL, part 1: Parameters related to performance

This blog post was last reviewed or updated May, 2022.

With Amazon Relational Database Service (Amazon RDS) for MySQL, you can deploy scalable MySQL servers in minutes with cost-efficient and resizable hardware capacity. Amazon RDS frees you up to focus on application development by managing time-consuming database administration tasks including backups, software patching, monitoring, scaling, and replication.

Most Amazon RDS for MySQL users don’t need to change the values of the over 400 available server configuration parameters. However, if you want to customize the default configuration provided by Amazon RDS, you can use the AWS Management Console or AWS CLI to create custom parameter groups and apply them to your database instances.

Note: The default values mentioned in this post apply to Amazon RDS for MySQL 5.7. You can find the default values for Amazon RDS for MySQL 8.0 in the corresponding MySQL documentation or in the RDS for MySQL 8.0 default parameter group which you can access using the AWS CLI or the Amazon RDS console.

Some parameter setting basics

You can set parameters globally using a parameter group. Alternatively, you can set them for a particular session using the SET command. From a parameter group, the parameters are set globally and for all sessions. In contrast, you can use the SET command to set a parameter value only for a particular session. In doing so, you precede the variable’s name by SESSION, @@session, or @@. An example of such a set statement is SET sort_buffer_size=10000; or SET @@local.sort_buffer_size=10000;.

If no modifier is present, SET changes the session variable. If you use the default parameter group for your instance, you first need to create a custom parameter group and attach it to your instance. For more details on using custom parameter groups, see the AWS Support article How do I modify the values of an Amazon RDS DB parameter group?

Parameters can be either static or dynamic. A static parameter change needs an instance restart to take it into effect. Dynamic parameter changes take effect online without any restart and thus can be changed.

Dynamic parameters can have a session level or global scope. A variable with global scope means that it has an impact in the whole server and for all the sessions. In contrast, a variable with session scope is effective only for the session where it is set. Some variables have both global and session scope. In such cases, the global value is default for the session scope. Any global change to a parameter that also has a session scope only affects new sessions. Either you need to disconnect and reconnect, restart the database, or change the session setting for the current connection.

To query the current value of any variable, use the show variable command. The following is an example.

show variables like "max_connections";

This command returns a result set like the following, showing you what your current maximum connections setting is.

+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 100   |
+-----------------+-------+

If you don’t specify the keyword global in the show command as shown preceding, the result displays the session level value. For example, to see the session-level value, use show session variables like "max_connections";. To see the global value, use show global variables like 'max_connections';.

In some cases, you might experience operational issues due to misconfiguration of Amazon RDS for MySQL parameters. This blog series aims to provide guidance for some of the most commonly modified database parameters to improve the reliability and stability of your database instances.

Best practices for configuring parameters related to performance

Following, I list parameters related to performance, with best practice suggestions for configuring each.

innodb_buffer_pool_size

This parameter decides the size in bytes of the buffer pool, the memory area where InnoDB caches table and index data. This parameter is one of the most important settings in your MySQL instance, and often 80 percent or more of memory is allocated here.

The default value for this parameter in RDS is {DBInstanceClassMemory*3/4}. However, in many cases, depending on the workload you might need to tune this value.

For example, suppose that you have 100 gigabytes of data but your application accesses only 1 gigabyte of the data regularly. Then having only a few gigabytes in the buffer pool is enough. On the other hand, if you have 10 gigabytes of data and the application accesses all of it constantly, you want to have a large enough buffer pool to cover the data and index size. Therefore, we recommend that you decide this parameter’s value depending on your application requirements and workload pattern.

You can check how well the server is using the buffer pool using the command show engine innodb status \G. If you see consistently from the output that you have a large percentage of free buffers over time, even during peak workload, then the buffer pool size allocated might be too large.

In the same output of show engine innodb status \G , you might see messages like evicted without access X.XX/s. If the evicted without access value is not zero, then it means data is getting read into the buffer pool and then pushed out again before it is accessed (also called buffer pool churn). If this value is tracked over time and you see it to go up, then you might need to increase the innodb_buffer_pool_size value. However, we don’t recommend that you set innodb_buffer_pool_size too large. If you do, that might cause paging in the operating system and can have a negative impact on performance. If you need to increase the innodb_buffer_pool_size further, consider scaling up to a larger instance to get larger total memory and then allocating innodb_buffer_pool_size.

The performance of InnoDB depends on whether data is present in the cache. Whenever there is a cache miss, it causes a random I/O request to disk, resulting in very slow operation.

You can run the following SQL statement twice over a period of time to get the InnoDB buffer pool miss rate.

show global status where variable_name in ('Innodb_buffer_pool_reads','Uptime');

You can calculate the miss rate per second with the following formula .

Miss_rate = delta(Innodb_buffer_pool_reads)/delta(Uptime)

In some cases, the growth of the status variable innodb_buffer_pool_reads decreases and the status variable innodb_buffer_pool_pages_free doesn’t increase compared to the previous period after increasing innodb_buffer_pool_size. In these cases, we should expect better overall InnoDB performance from increasing innodb_buffer_pool_size.

innodb_log_file_size

This parameter determines the fixed size for MySQL’s redo logs. Tuning this value affects the crash recovery time and also overall system performance. The default value is 134,217,728 (about 128 MB).

The default value for innodb_log_file_size might be small for your workload. We recommend increasing this value for any high change rate workload. For substantial insert, update, and delete activity, the recommended initial setting is as follows.

innodb_log_file_size = 600M
innodb_log_files_in_group = 2

The innodb_log_files_in_group parameter defines the number of log files in the log group. Higher values than 2 for innodb_log_files_in_group produce no significant benefit. The maximum permitted value for innodb_log_file_size * innodb_log_files_in_group is 512 gigabytes from MySQL version 5.6 onwards. Therefore, if you increase innodb_log_files_in_group to greater than 2, you must reduce innodb_log_file_size to remain within the limit.

We recommend that you increase innodb_log_file_size in MySQL 5.6 and later—it’s 128 MB by default. The increase means that log flushes are less frequent. Otherwise, flushes are frequent during data import, for example multiple times within five minutes.

However, the larger the value of innodb_log_file_size (the file size), the longer the crash recovery time. So it has to be optimally sized to reduce the number of I/O but not to affect the recovery time by a large extent.

To optimally size this parameter, run these queries at your server’s peak usage time:

mysql> pager grep sequence
PAGER set to 'grep sequence'
mysql> show engine innodb statusG select sleep(60); show engine innodb statusG
Log sequence number 82 3836410821
1 row in set (0.06 sec)
1 row in set (1 min 0.00 sec)
Log sequence number 84 3838334625
1 row in set (0.05 sec)

From the preceding query, we can see the total number of bytes written to the transaction log. So, now we can see how many MB have been written to the log in one minute.

mysql> select (3838334625 - 3836410821) / 1024 / 1024 as MB_per_min;
+------------+
| MB_per_min |
+------------+
| 1.83468246| 
+------------+

Note: As a rough rule of thumb, you can make the log big enough to hold at most an hour or so of redo. Also, innodb_log_file_size has to be at least 10 times the size of the largest BLOB in the database.

innodb_change_buffering

The change buffer is a special data structure used for caching changes in secondary index pages that are not in the buffer pool. The innodb_change_buffering parameter helps in reducing the substantial I/O operations used to keep secondary indexes up-to-date after data manipulation language (DML) operations. This parameter is used to control the extent of change buffering operations.

The following table shows the allowed values for this parameter.

Value Description
none Don’t buffer any operations.
inserts Buffer insert operations.
deletes Buffer delete marking operations—strictly speaking, the writes that mark index records for later deletion during a purge operation.
changes Buffer inserts and delete-marking operations.
purges Buffer the physical deletion operations that happen in the background.
all This value is the default. Buffer inserts, delete-marking operations, and purges.

The default value is all. We recommend that you disable change buffering by setting the value to none because all causes extremely long shutdown times for upgrades. This feature was useful in the past with slow disks, but isn’t useful now.

innodb_io_capacity

This parameter controls the maximum number of I/O operations per second that InnoDB can perform for InnoDB background tasks. Some examples of these tasks are flushing pages from the buffer pool and merging data from the change buffer, as described in the MySQL documentation.

The default value for this parameter is 200, and allowed values are in the range 100–18,446,744,073,709,551,615. For I/O-intensive systems, a value of 1,000 usually works. We don’t recommend an extreme value like 20,000 unless you have already proved that a lower value is not sufficient. You might use such a value when flushing is too slow and you have IOPS to spare.

To see if flushing is slow, you can check the percentage of dirty pages using the following methods. A very high percentage of dirty pages shows that flushing isn’t happening as fast as expected. To see the amount of dirty data in the buffer pool, you can use the Innodb_buffer_pool_bytes_dirty metric with SHOW GLOBAL STATUS output. The following shows an example.

mysql> SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_%';
+---------------------------------------+-------------+
| Variable_name                         | Value       |
+---------------------------------------+-------------+
...
| Innodb_buffer_pool_pages_data         | 3271        |
| Innodb_buffer_pool_bytes_data         | 38192680    |
| Innodb_buffer_pool_pages_dirty        | 1024        |
| Innodb_buffer_pool_bytes_dirty        | 16760878    |
...
| Innodb_buffer_pool_pages_total        | 5395        |
...
+---------------------------------------+-------------+
10 rows in set (0.01 sec)

In the preceding output, the variables are as follows:

  • Innodb_buffer_pool_pages_data and Innodb_buffer_pool_bytes_data: the amount of buffer in the buffer pool.
  • Innodb_buffer_pool_pages_dirty and Innodb_buffer_pool_bytes_dirty: the amount of dirty buffer in the buffer pool.
  • Innodb_buffer_pool_pages_total: the total number of pages in the buffer pool.

To calculate the current percentage of dirty pages, use the following formula:

                            Innodb_buffer_pool_pages_dirty
Dirty page percentage = ------------------------------------- x 100%
                            Innodb_buffer_pool_pages_total

Also, if you use information_schema, you can use the following query.

mysql> SELECT dirty.Value AS 'Dirty Pages', total.Value AS 'Total Pages', ROUND(100*dirty.Value/total.Value, 2) AS 'Dirty Pct' FROM (SELECT VARIABLE_VALUE AS Value FROM information_schema.GLOBAL_STATUS WHERE VARIABLE_NAME = 'Innodb_buffer_pool_pages_total') AS total INNER JOIN (SELECT VARIABLE_VALUE AS Value FROM information_schema.GLOBAL_STATUS WHERE VARIABLE_NAME = 'Innodb_buffer_pool_pages_dirty') AS dirty;
+-------------+-------------+-----------+
| Dirty Pages | Total Pages | Dirty Pct |
+-------------+-------------+-----------+
| 200        | 1000        |     20.00 |
+-------------+-------------+-----------+
1 row in set (0.00 sec)

You can also get the dirty-page percentage from show engine innodb status using the following formula.

                                   Modified DB pages
Dirty page percentage = ------------------------------------- x 100%
                            Database pages + Free buffers

Following is the corresponding information schema query.

mysql> SELECT MODIFIED_DATABASE_PAGES, DATABASE_PAGES, FREE_BUFFERS, ROUND(100*MODIFIED_DATABASE_PAGES/(DATABASE_PAGES+FREE_BUFFERS), 2) AS 'Dirty Pct' FROM information_schema.INNODB_BUFFER_POOL_STATS;
+-------------------------+----------------+--------------+-----------+
| MODIFIED_DATABASE_PAGES | DATABASE_PAGES | FREE_BUFFERS | Dirty Pct |
+-------------------------+----------------+--------------+-----------+
|                     207 |           4333 |          599 |     4.19  |
+-------------------------+----------------+--------------+-----------+
1 row in set (0.00 sec)

We recommend that you increase or decrease innodb_io_capacity if you can’t get the dirty or changed page percentage to increase to your desired value. This parameter controls the flushing rate and related disk I/O. However, you can seriously harm performance if this parameter or innodb_io_capacity_max is set too high. Also, you can waste disk IOPS with premature flushing.

The optimum value for this parameter depends on workload and data access pattern. With a DML-intensive workload, if this parameter’s value is set too low, flushing can fall behind if a very high percentage of dirty pages causes excessive memory usage. Similarly, setting this parameter’s value too high can cause excessive premature flushing and high disk writes due to flushing and saturating the IOPS/throughput capacity.

innodb_io_capacity_max

This parameter defines the maximum value to which InnoDB is allowed to extend the setting of innodb_io_capacity if flushing activity falls behind. If you don’t specify this value, the default is twice the value of innodb_io_capacity, with a minimum value of 2,000. The default value is 2,000, and allowed values are 2,000–18,446,744,073,709,547,520.

We recommend as a best practice that you use smaller values for systems with low write loads and larger values for systems with high write activity. This value can never be lower than innodb_io_capacity, and often twice the value of innodb_io_capacity is a good choice. In case of adaptive flushing, this is the variable that actually limits the flushing rate.

innodb_purge_threads

This parameter defines the number of background threads that MySQL uses for the innoDB purge operation. This parameter has a minimum value of 1, so the purge operation is always performed by a background thread and never as a part of the master thread. Using one separate thread to perform innoDB purge operations helps reduce internal contention with InnoDB and also improves scalability. The default value is 1 in RDS MySQL, and the maximum value is 32.

Having multiple purge threads can improve efficiency in purging undo records for DML-intensive workloads if the DML operations are spread across multiple tables. If the DML operations are concentrated on a single table or very few tables, keep this value as low as possible to avoid contention among purge threads. The recommended value is 1 unless there is a strong reason to accelerate the purge operation for DML operations running on multiple tables. Higher values can cause purge thread contention, as seen in this MySQL bug.

innodb_stats_persistent

This parameter specifies whether the InnoDB table and index statistics produced by the ANALYZE TABLE command are stored on disk. These statistics are stored in the mysql.innodb_table_stats and mysql.innodb_index_stats tables. If the statistics aren’t stored, they need to be recalculated frequently, after every restart that can cause overhead.

Available values are 1 (ON) or 0 (OFF). By default, this parameter is turned ON. By enabling this parameter, you get optimizer statistics that are persisted across instance restart. Thus, it allows greater plan stability and more consistent query performance.

This parameter enables persistent statistics at the global level. However, this persistence also can be enabled or disabled at table level by using the STATS_PERSISTENT clause while creating a table.

If this parameter is set to OFF, the optimizer statistics becomes nonpersistent and need to be recalculated after each instance restart at next table access. In this state, different estimates can be produced when recomputing statistics and can generate different execution plans. Therefore, for a consistent, stable, and optimized execution plan, we recommend that you turn this parameter ON.

With persistent statistics, you should periodically update statistics by running the ANALYZE TABLE command. For tables with stable or gradually changing data, it might be good to analyze weekly or monthly. For tables with smaller size and rapidly changing data, we recommend running ANALYZE TABLE more frequently. The only disadvantage of setting persistent statistics is the need to run ANALYZE TABLE sometimes and to write statistics data to disk. But this overhead is minimal and avoids costly statistics recomputation every time a table is accessed after instance restart.

innodb_thread_concurrency

You can use this parameter to help solve a problem that arises if the number of active threads is significantly greater than the number of vCPUs.

For a small instance size, you might be able to process a large number of sessions if most of the sessions are idle. However, the number of active threads can become significantly greater than the number of vCPUs. If this happens, the operating system tries to divide all the available vCPUs to available threads. This is done by a process called context-switching, which itself adds significant overhead in CPU usage. You can see this overhead in the Systems category from the Enhanced Monitoring console. For small instance classes, you can overcome this problem by scaling up to larger instances, but for large instances this option is not available.

At the operating-system level, Linux does not know the work done by each thread. However, InnoDB knows about it and can prioritize the threads accordingly. It does this prioritization by using the innodb_thread_concurrency parameter, which limits the number of concurrent threads. The default value of this parameter is 0, which is interpreted as infinite concurrency (no concurrency checking).

After the number of executing threads reaches this limit, additional threads sleep for a number of microseconds before going into the queue. You set the number of microseconds for threads to sleep by using the configuration parameter innodb_thread_sleep_delay. The innodb_thread_concurrency parameter helps to avoid context switching for large instance sizes.

Another use case of tuning this parameter is for highly concurrent inserts where they all need to write. In such cases, more thread concurrency helps. To decide the optimum value of this parameter, test against a given workload. Often the default value is the best.

innodb_sync_array_size

You can use this parameter to split the internal data structure used to coordinate threads. Doing this is especially helpful for highly concurrent workloads where large numbers of threads are waiting. The enhanced monitoring metric loadAverageMinute is a good indicator of waiting threads. High load averages imply that a system is overloaded; many threads are waiting for CPU time. When the load average values are high, above the number of vCPUs, it signifies high demand for the vCPUs.

We recommend that you increase the value for highly concurrent workloads where a large number of threads is frequently found waiting. The default value of the innodb_sync_array_size parameter is 1. Such a low value often causes a lot of contention for large concurrent workloads. However, the optimum value of this parameter depends on the number of concurrent connections expected and has to be decided by thorough load testing. Often innodb_sync_array_size=16 provides a good result.

innodb_flush_log_at_trx_commit

To achieve durability of any transaction, the log buffer has to be flushed to durable storage. However, writing to disk has an impact on performance. If performance is given more preference over durability in your system, you can tune the innodb_flush_log_at_trx_commit parameter to control how frequently the log buffer is flushed to disk.

Possible settings are as follows:

0 – This setting does nothing at transaction commit but writes the log buffer to the log file and flush the log file every second. Although the OS tries to flush the log, this flush is not guaranteed. Therefore, in this case you should be able to afford the loss of some of the latest committed transactions if a crash occurs. Data loss also refers here to any potential loss of data, not just transactions. Thus, this setting can be a source of potential corruption.

1 – This setting writes the log buffer to the log file and flushes it to durable storage every time a transaction is committed. This is the default (and safest) setting. It guarantees that you don’t lose any committed transactions unless the disk or operating system “fakes” the flush operation.

2 – This setting writes the log buffer to the file at every commit operation, but it doesn’t flush the buffer. This setting flushes data to disk only once per second. Although the OS tries to flush the log, this flush is not guaranteed.

The most important difference between this and the 0 setting (and what makes 2 the preferable setting) is that setting to 2 means that no transactions are lost if the MySQL process crashes. However, you can still lose transactions if the entire server crashes or loses power. This is the same data loss as if the crash is caused by the OS. Similar to the value 0, this setting can potentially cause corruption, and data loss is not only loss of transactions.

Unless the value is set to 1, InnoDB doesn’t guarantee ACID properties. Up to a second’s worth of the most recent transactions can be lost if a crash occurs. Thus, be aware that in case of a crash, using this parameter can break read replica operation or cause data loss in a snapshot. In a replication scenario, to avoid these issues in case of a crash don’t set this parameter on the master. For RDS best practice, we recommend that you have separate parameter groups for the replication master and its replicas.

Note: Writing the buffer to the log simply moves the data from InnoDB’s memory buffer to the operating system’s cache, which is also in memory. It doesn’t actually write the data to durable storage. Flushing the log to durable storage means that InnoDB asks the operating system to actually flush the data out of the cache and ensure it’s written to the disk.

tmp_table_size and max_heap_table_size

As the MySQL documentation says: “If an internal temporary table is created as an in-memory table but becomes too large, MySQL automatically converts it to an on-disk table. The maximum size for in-memory temporary tables is determined from whichever of the values of tmp_table_size and max_heap_table_size is smaller.” Operations involving on-disk tables are significantly slower than in-memory temporary tables. You can find a similar discussion in the documentation for MySQL 5.7. The default value for both of these parameters is 16,777,216 bytes.

We often come across issues when a customer running a complex query that takes a long time (typically 5–15 minutes) sees copying to tmp table on disk from show full processlist. This kind of issue often come with symptoms like large write IOPS and disk queue depth (over 100). To see creation on memory and disk temp tables, before and during finishing running the preceding query for comparison, try show status like 'Created_tmp%' as shown following.

mysql> SHOW GLOBAL STATUS LIKE 'Created_tmp%'; 
+-------------------------+--------+ 
| Variable_name           | Value  | 
+-------------------------+--------+ 
| Created_tmp_disk_tables | 582    | 
| Created_tmp_files       | 5      | 
| Created_tmp_tables      | 188654 | 
+-------------------------+--------+ 
3 rows in set (0.01 sec) 

In such cases, increasing the variables for both tmp_table_size and max_heap_table_size to 64 MB, 128 MB, or even 512 MB resolves the problems. We recommend that you compare performance before and after this change. The tmp_table_size value decides the maximum size an in-memory internal temporary table can be. However, it’s the minimum of max_heap_table_size and tmp_table_size that determines the actual maximum size on an in-memory temporary table. Often, we only configure tmp_table_size, but actually they must be increased together.

For example, suppose that we have the following.

tmp_table_size      = 64 MB
max_heap_table_size = 32 MB

In this case, MySQL assigns to tmp_table_size the value of 32 MB and to max_heap_table_size the value of 32 MB.

Suppose that we have the following.

tmp_table_size      = 32 MB
max_heap_table_size = 64 MB

In this case, MySQL also assigns to tmp_table_size the value of 32 MB and to max_heap_table_size the value of 32 MB.

The code can be found in the file sql_select.cc.

if (thd->variables.tmp_table_size == ~ (ulonglong) 0)        // No limit
    table->s->max_rows= ~(ha_rows) 0;
  else
    table->s->max_rows= (ha_rows) (((table->s->db_type == DB_TYPE_HEAP) ?
                                    min(thd->variables.tmp_table_size,
                                        thd->variables.max_heap_table_size) :
                                    thd->variables.tmp_table_size)/
                                   table->s->reclength);

Following we can find the min() function used from MySQL.

min(thd->variables.tmp_table_size,thd->variables.max_heap_table_size)

The manual state MySQL takes the lower value assigned to the two variables and uses that as the good value for tmp_table_size. Thus, we recommend that tmp_table_size and max_heap_table_size should be the same value.

However, before increasing these values, check if your database is indexed properly, specially for joining and group by columns. Proper indexing reduces temp table creation. Otherwise, simply increasing these parameters values might allow inefficient queries to run without indexes and create more temp tables than necessary.

Some conditions prevent the use of an in-memory temporary table, in which case the server uses an on-disk table instead, which increases write I/O. These conditions are the following:

  • Presence of a BLOB or TEXT column in the table.
  • Presence of any string column in a GROUP BY or DISTINCT clause larger than 512 bytes for binary strings or 512 characters for nonbinary strings. (Before MySQL 5.6.15, the limit is 512 bytes regardless of string type.)
  • Presence of any string column with a maximum length larger than 512 (bytes for binary strings, characters for nonbinary strings) in the SELECT list, if UNION or UNION ALL is used.
  • The SHOW COLUMNS and the DESCRIBE statements use BLOB as the type for some columns, thus the temporary table used for the results is an on-disk table.

foreign_key_checks

Often data import process becomes slower when foreign key checks are enabled. The foreign_key_checks parameter is enabled by default, and default value of this parameter is 1 (ON). However, it is not available in the parameter group. If your goal is to ensure performance rather than data integrity, to improve import performance you can turn off this parameter by setting it to 0 before running the import SQL statements.

The following shows an example.

SET foreign_key_checks=0;
... SQL import statements ...
SET foreign_key_checks=1;

Remember to append the SET FOREIGN_KEY_CHECKS=1; at the end of the import file.

For big tables, doing this can eliminate a lot of disk I/O. The potential problem is that any data inconsistency that would’ve made the foreign key check fail during import is still present in the database even after the foreign keys are turned back on.

We offer a couple of wrapped stored procedures for enabling and disabling the global setting dynamically. The names of the procedures are mysql.rds_set_fk_checks_on and mysql.rds_set_fk_checks_off. These procedures are available in RDS MySQL 5.6 versions 5.6.29 and greater, and RDS MySQL 5.7 versions 5.7.17 and greater.

The following shows an example of using these procedures.

CALL mysql.rds_set_fk_checks_off();
CALL mysql.rds_set_fk_checks_on();

UNIQUE_CHECKS

You can also greatly improve insert performance for large tables by running SET UNIQUE_CHECKS=0 to disable UNIQUE_CHECKS before running INSERT for the data. Then we run SET UNIQUE_CHECKS=1 when the insert is done. UNIQUE_CHECKS is also enabled by default, and the default value of this parameter is 1 (ON), although it is not available in the parameter group.

For example, to do this you add these lines at the top of the file:

SET FOREIGN_KEY_CHECKS = 0;
SET UNIQUE_CHECKS = 0;
SET AUTOCOMMIT = 0;
You add these at the end:

SET UNIQUE_CHECKS = 1;
SET FOREIGN_KEY_CHECKS = 1;
COMMIT;

Doing this helps avoid a large amount of disk access. Ensure that the data involved contains no duplicate keys.

query_cache_size

The default value of query_cache_size is 1,048,576 (bytes). In most cases, configuring the query cache causes performance degradation. The query cache is so problematic that it was deprecated in MySQL 5.7.20 and removed in MySQL 8.0. Therefore, we recommend not to use it.

To verify if the query cache is on, you can use two variables:

  • query_cache_size: This value is the amount of memory allocated to the query cache. If the query_cache_size is 0, the query cache is effectively switched off. However, there is still some overhead unless the query_cache_type is set to OFF.
  • query_cache_type: There are three valid values for the query_cache_type setting, and 0 is required to turn OFF the query cache:
    • 0 or OFF: This turns off the query cache is completely. However, to also deallocate the buffer, set query_cache_size to 0.
    • 1 or ON: The query cache is used by default. To disable the query cache for a query, use the SQL_NO_CACHE hint.
    • 2 or DEMAND: In this case, the query cache is used only when the SQL_CACHE hint is specified.

The following is an example.

mysql> SHOW VARIABLES LIKE 'query%';
+------------------------------+---------+
| Variable_name                | Value   |
+------------------------------+---------+
| query_alloc_block_size       | 8192    |
| query_cache_limit            | 1448576 |
| query_cache_min_res_unit     | 4096    |
| query_cache_size             | 0       |
| query_cache_type             | OFF     |
| query_cache_wlock_invalidate | OFF     |
| query_prealloc_size          | 8192    |
+------------------------------+---------+
7 rows in set (0.00 sec)

With the output preceding, the query cache is disabled. We can see query_cache_size is 0 and query_cache_type is OFF.

optimizer_switch

As the MySQL documentation says, the optimizer is the set of routines that decide what execution path the DBMS should take for queries. The optimizer is responsible for selecting the most efficient query plan to get your results. This parameter is used to control the behavior of optimizer and is composed of a set of flags. By turning these flags ON or OFF, we can control the optimizer’s behavior.

The following is an example.

mysql> SELECT @@optimizer_switch\G
*************************** 1. row ***************************
@@optimizer_switch: index_merge=on,index_merge_union=on,
                    index_merge_sort_union=on,
                    index_merge_intersection=on,
                    engine_condition_pushdown=on,
                    index_condition_pushdown=on,
                    mrr=on,mrr_cost_based=on,
                    block_nested_loop=on,batched_key_access=off,
                    materialization=on,semijoin=on,loosescan=on,
                    firstmatch=on,
                    subquery_materialization_cost_based=on,
                    use_index_extensions=on

The default values are the following: index_merge=on, index_merge_union=on, index_merge_sort_union=on, index_merge_intersection=on, engine_condition_pushdown=on, index_condition_pushdown=on, mrr=on, mrr_cost_based=on, block_nested_loop=on, batched_key_access=off, materialization=on, semijoin=on, loosescan=on, firstmatch=on, subquery_materialization_cost_based=on, use_index_extensions=on

Default values are generally recommended. For a bad query plan or wrong estimation by the optimizer, you might need to adjust these flags. For example, you can mitigate performance regression in queries after upgrading from 5.6 to 5.7 by disabling the optimizer_switch flags condition_fanout_filter and derived_merge. Thoroughly test your workload before enabling these flags, because they might have a significant performance impact.

Innodb_read_io_threads and Innodb_write_io_threads

These parameters are the number of I/O threads for read and write operations, respectively, in InnoDB. Both of them have default value of 4. Increasing these values increase the threads for the specific InnoDB operation. Tuning them and setting innodb_read_io_threads = 16 and innodb_write_io_threads = 4 for heavy OLTP workloads can help your instance keep up.

Although most workloads aren’t affected by these, perform proper testing before production implementation. Sometimes contention can happen for disk resources between the foreground threads and these threads.

innodb_status_output_locks

This parameter enables or disables the InnoDB lock monitor. Set it to 1 to turn on and to 0 to turn off the monitor. The default value is 0. When enabled, the parameter displays additional information about locks in SHOW ENGINE INNODB STATUS output and also prints periodic output to the MySQL error log. This parameter was introduced in MySQL 5.7. It helps in identifying locks, particularly in cases where no other session is accessing the locked rows and thus there is no conflict.

table_open_cache

This parameter defines the number of table definitions that can be stored in the definition cache. The default value is 2,000.

As a rule of thumb, set this parameter to a value large enough to keep most of the tables being used open all the time. Opening and closing tables significantly slows operation. You might start with a value twice max_connections or twice the total number of tables and then tune from there.

After the system is up and running for few hours and hot, check whether the allocation is sufficient by using the following command:

mysql> show global status like 'Open_tables';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Open_tables   | 200   |
+---------------+-------+
1 row in set (0.04 sec)

mysql> show global status like 'Opened_tables';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Opened_tables | 5000 |
+---------------+-------+
1 row in set (0.00 sec)

From the increase in Opened_tables, we can estimate how many table definitions should be added. We recommend that you repeat the checks until the number of tables being opened per second is no more than a few.

If your number of Opened_tables has not yet reached the value of table_open_cache_size, and the server has been up a while, you can look at decreasing the value. You typically want to see less than one table opened in a second.

To evaluate whether you can use a higher table_open_cache value, look at the number of opened tables in conjunction with the server uptime.

thread_cache_size

This is a configuration variable that defines how many connection handling threads should be cached for reuse, instead of releasing and reallocating them. Thread creation and destruction, which happens at each connect and disconnect, can be expensive.

You can estimate the efficiency of the thread cache by understanding how many threads are being created per second using the Threads_Created value. Then you can sample this over a known period of time. The default value of thread_cache_size is 14. Usually this should be set to at least 16. If an application has large jumps in the number of concurrent connections and Threads_Created is growing quickly, this variable should be increased to a higher value. The goal is not to have threads created in normal operation.

The following is an example.

mysql> show global status like 'threads%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Threads_cached    | 8     |
| Threads_connected | 1     |
| Threads_created   | 20    |
| Threads_running   | 1     |
+-------------------+-------+
4 rows in set (0.00 sec)

max_seeks_for_key

This value is the limit of the assumed maximum number of seeks when looking up a row based on a key. The default value is 18,446,744,073,709,551,615. This value is typically used when queries are not using indexes, even though index scan is cost-efficient, and you do a full scan. In such cases, this value is set lower (such as to 100) and we can force MySQL to prefer indexes instead of table scans. This value is applied to all the queries, so changing this at global level can have unexpected side effects to other queries. Thus, thoroughly test it.

Conclusion

The preceding parameters are the most important ones to pay attention to get the best performance from your RDS MySQL instance. You can tune them to fit your workload and the best practices mentioned.

In the next part of this blog series, I discuss the most commonly used parameters to optimize and stabilize RDS MySQL replications and best practices for tuning them.


About the author

Saikat Banerjee is a cloud support engineer with Amazon Web Services.