AWS Database Blog
Introducing Amazon RDS for MariaDB 10.11 for up to 40% higher transaction throughput
MariaDB is a popular open-source high performance database. Amazon Relational Database Service (Amazon RDS) for MariaDB supports multiple major engine versions including 10.4, 10.5, 10.6. Today, Amazon RDS has announced support for MariaDB major version 10.11, which is the latest long-term supported major version from the MariaDB community.
When compared to previous versions, Amazon RDS for MariaDB 10.11 delivers up to 40% higher transaction throughput. This means faster database operations and improved application responsiveness, especially during periods of peak usage.
In this post, we benchmark Amazon RDS for MariaDB version 10.11 and compare it to our last release RDS for MariaDB 10.6. We also go over the latest features introduced by the community in MariaDB 10.11. Whether you are currently using an older version of MariaDB, or, considering migration from another database system, this post outlines the benefits and procedure of upgrading to MariaDB 10.11.
Performance of Amazon RDS for MariaDB 10.11
We start by comparing Amazon RDS for MariaDB versions 10.11.4 and 10.6.14 to understand the performance improvement of major version 10.11 over previous versions. Our benchmarking tests were carried out using an r6g.16xlarge instance with 45,000 Provisioned IOPS and 1,500 GB EBS volume. We used sysbench to benchmark write and read operations on both 10.11 and 10.6. The dataset employed for testing consisted of two hundred and fifty tables, each containing approximately 1.3 million rows.
First, we examined the write performance using the default parameters of an RDS for MariaDB instance. For the same workload, we varied the number of threads to understand the impact on write performance. The graphs below illustrate the differences in write performance observed across Amazon RDS for MariaDB 10.11.4 and Amazon RDS for MariaDB 10.6.14. We start observing the performance difference between 10.11.4 and 10.6.14 even when running a single thread. As we increase the number of threads, the performance differential continues to grow, peaking at 512 threads. At 512 threads, Amazon RDS for MariaDB 10.11 delivered a write throughput of 3,224.53 transactions per second (TPS), which is 47.25% higher than the MariaDB 10.6 TPS at 2,189.77.
We used the following commands for our sysbench write tests:
Next, we tested a read workload using the default parameter configuration. We start observing a performance improvement on 10.11 when our tests scales to 2 threads. Similar to our write tests, as the number of threads increases, the performance gap between Amazon RDS for MariaDB 10.11.4 and 10.6.14 continues to increase, peaking at 512 thread count. At 512 threads, Amazon RDS for MariaDB 10.11 delivered a read throughput of 1255 transactions per second (TPS), which is 15.13% higher than the MariaDB 10.6 TPS at 1090.16.
We used the following commands for our sysbench read tests:
If you want to push the performance of your RDS for MariaDB 10.11 further, you could consider using Amazon RDS Optimized Reads and Optimized Writes. RDS Optimized Reads use NVMe-based SSD block storage, directly attached to the host server, to achieve up to 2X faster complex query processing. And, RDS Optimized Writes deliver up to 2X higher write transaction throughput. When you combine these features with Amazon RDS for MariaDB 10.11, the overall performance gains can be compounded depending on the nature of your workload.
In the next section, we go over the new features introduced with MariaDB 10.11. If you would like to learn the best practices for upgrading your database instance to MariaDB 10.11, you can skip ahead to section “Upgrading to Amazon RDS for MariaDB 10.11”.
What’s new in Amazon RDS for MariaDB 10.11?
The MariaDB community has introduced multiple new features with major version 10.11. You can find a complete list of new features in the RDS for MariaDB user guide. In this section, we highlight a select few.
1. Lag free ALTER TABLE replication
Performing ALTER TABLE operations on a primary database often introduces replication lag, affecting the availability and consistency of replicas. MariaDB 10.11 eliminates the replication lag caused by ALTER TABLE. When you enable this feature by enabling the binlog_alter_two_phase
parameter in the parameter group, the ALTER command is replicated and run simultaneously on replicas, which minimizes the replication lag and ensures a seamless and uninterrupted database experience, allowing you to perform schema changes without worrying about large replication delays.
In addition to enabling the binlog_alter_two_phase
parameter in the parameter group, it’s important to ensure that parallel replication is enabled on the replicas and increase the value of the slave_parallel_workers
parameter greater than zero.
2. Authentication
Amazon RDS for MariaDB 10.11 supports the GRANT TO …PUBLIC
command which allows granting privileges to all users with access to the server, including those created after the privileges are granted. We recommend you use this feature cautiously as granting high level privileges to PUBLIC can lead to security issues in the database. If used cautiously, this approach proves particularly useful when you want to define a specific set of privileges for all users without having to repeat the individual grant statements for every user. This allows all users to have a uniform set of privileges, promoting consistency and standardization of privilege management. Administrators can easily review and modify the privileges assigned to all users at once. The GRANT TO … PUBLIC
feature can help administrators streamline user privilege management.
In the following example, we create a MariaDB user named `John`
who is allowed to connect to our RDS instance.
Now, we use the GRANT TO …PUBLIC command to grant SELECT privileges to all users with access to the server on the `Titles`
table in the `Employees`
database. We will later run the SHOW GRANTS command to confirm the privileges assigned TO PUBLIC grants.
Now, we login as the newly-created MySQL user, John, and query the table in the database.
From the previous command, we confirm that MariaDB user `John`
does not have privileges to run the SELECT query on the `Titles`
table in the `Employees`
database. However, since we created a GRANT TO … PUBLIC SELECT privilege in our earlier step, SELECT privilege is granted to the MariaDB user and this is confirmed in this example:
3. New JSON functions
MariaDB 10.11 provides several new functions for additional flexibility and functionality when working with JSON data, including JSON_EQUALS, JSON_NORMALIZE, JSON_OVERLAPS:
JSON_EQUALS: You can use this function to compare the equality between JSON objects regardless of attribute order.
In the following example, we use the JSON_EQUALS function to compare the equality between two JSON objects, '{"John":[10563,22,1988],"Sam":[10567,34,1995]}'
and '{"Sam":[10567,34.0,1995],"John":[10563,22,1988]}'
.
A value of 1 indicates that the JSON objects are equal, whereas a value of 0 indicates inequality.
In our example, both JSON objects have the same structure, with the same keys and values (though some values are in different data types, for example 34 to 34.0). Since, our JSON objects have the same structure and members, regardless of member order, the function returns a value of 1, indicating equality.
In the next example, we use the JSON_EQUALS function again to compare a different set of JSON objects. These JSON objects are not equal because they have different structures. For example, in the first object, the key “John” has an array with the value 1988, while in the second object, the key “John” has a different structure with only two values. Therefore, the function returns a value of 0, indicating inequality.
JSON_NORMALIZE: You can use the JSON_NORMALIZE function in combination with a unique key to enforce a unique constraint on the JSON contents in a database. This function converts a JSON object into a canonical form that can be used for comparison purposes for JSON data. This ensures that any future inserts or updates with a similar JSON value that is already present in the table will result in a constraint violation error.
Let’s understand this with an example. We create a table named ‘bikemodel’
with two columns: `bike_id`
(an auto-incrementing primary key) and `model`
(a column to store JSON data)
Here, an ALTER command is run on the `bikemodel` table. A new column named model_normalize is added to store the normalized JSON form of the model column using the JSON_NORMALIZE function. Additionally, a unique key is added on the model_normalize column to enforce the uniqueness constraint on the normalized JSON values.
We now insert a JSON object into the bikemodel table. The JSON object has attributes like name, mode, and color.
Now, we attempt to insert a JSON object into the table with slightly different key order and formatting. However, due to the use of the JSON_NORMALIZE function and the unique key constraint on the `model_normalize`
column, this insertion fails and results in a constraint violation error. This is because the normalized form of the JSON data is the same as the previously inserted JSON object.
JSON_OVERLAPS: You can use this function to compare two JSON documents to see if they have any common key/value pairs between objects. It returns true if the two JSON objects have at least one value for the key in common. This also includes array elements between two arrays, array element with scalar and two scalar documents if they have the same type and value.
In the following example, the first JSON object has the keys “John” and “Year” with their corresponding values, and the second JSON object has the keys “SAM” and “Year” with their corresponding values.
The function returns true (1) because both JSON objects share the common key "Year"
and their corresponding values are not both NULL. This indicates that there is an overlap between the two JSON objects.
In the following example, the first JSON object has the keys "John"
and "Year"
with their corresponding values, and the second JSON object has the keys "JOHN"
and "Year"
with their corresponding values.
The function returns false (0) because although both JSON objects share the common key "Year"
, their corresponding values are different. Therefore, there is no overlap between the two JSON objects.
4. SFORMAT function for arbitrary text formatting
SFORMAT function allows for arbitrary text formatting. With this function, you can create custom formatting templates for your data, which can be useful for generating reports, and structured text formatting. Let’s try an example.
We create a table named `employees`
with columns `employee_id`
, `Name`
, and `Year`
.
We use the SFORMAT function to create custom formatted strings based on the data in the table as follows.
The function replaces the placeholders in the template string with the corresponding values from the employees table and generates formatted strings.
SFORMAT() is a flexible way to format strings by which you can control the format of the output and ensure that the values are formatted correctly.
5. Query analysis with ANALYZE FORMAT=JSON
ANALYZE FORMAT=JSON in MariaDB 10.11 is a powerful tool for query analysis and optimization. By running the command, you can obtain a comprehensive JSON-based analysis of the query run, which includes both the run plan and the actual query run. This includes information like r_loops
(revealing how many times each node in the query plan was run), r_total_time
(showing the total run time spent on each node), and r_buffer_size
(providing insights into buffer utilization of nodes during the query run). This information will help database administrators and developers to fine-tune query performance, identify areas for potential optimization.
ANALYZE FORMAT=JSON <query> will run the statement and prints the information about the query in EXPLAIN FORMAT=JSON format, we recommend please proceed with caution when running this command on the production workloads and during peak business hours as this command will run the query to provide the required metrics.
In the following example, I run ANALYZE FORMAT=JSON on the query.
`Analyze`
shows information about the overall analysis of the query.
`query_optimization`
provides information related to query optimization.
`r_total_time_ms`
shows total time taken for query optimization in milliseconds (0.100549301 ms in this case).
`query_block`
section provides information about the query execution process.
`select_id`
provides information related to Identifier for the SELECT query.
`r_loops`
shows number of execution loops (1 loop in this case).
`r_total_time_ms`
shows total time taken for query execution in milliseconds (0.471070642 ms in this case).
`nested_loop`
subsection indicates the presence of nested loops in the execution process and provides Information about the table used in the query execution.
`r_loops`
shows number of execution loops (1 loop in this case).
`rows`
shows total number of rows in the table (98 rows in this case).
`r_rows`
shows estimated number of rows processed (98 rows in this case).
`r_table_time_ms`
shows time taken to access the table in milliseconds (0.454686291 ms in this case).
`r_other_time_ms`
shows other processing time in milliseconds (0.011758278 ms in this case).
`filtered`
shows percentage of rows that passed through the filter (100% in this case).
`r_filtered`
shows estimated percentage of rows that passed through the filter (100% in this case).
6. Natural sort order
You can use the natural_sort_key()
function to perform natural sorting directly within the database, you achieve your requirements of human-friendly sorting to the database environment. This feature allows developers to obtain results that align with user expectations, making sorting more intuitive and comprehensible. Although natural sort functionality exists in various programming languages through built-in or third-party modules, MariaDB’s natural_sort_key()
function simplifies the process within the database.
In this example, we create a table named `coordinates` with a single column named `value`, and you insert various string values into it.
We are running a query to select values from the `coordinates`
table and sort them using the default sorting order. This default sorting treats numbers in strings as characters and displays the following output.
We are running a similar query, but this time we are using the natural_sort_key()
function to achieve natural sorting. This function treats numeric portions of strings as numbers, resulting in more intuitive sorting that aligns with user expectations.
MariaDB 10.11 also includes the following enhancements which we do not cover in-depth in this post:
- Stored functions now supports IN, OUT and IN OUT parameters qualifiers.
- Cursors now support IN qualifier.
- With the UUID data type you can now store 128-bit UUID data.
- System versioning capabilities have also been enhanced with the addition of the system_versioning_insert_history setting which allows for history modification
Upgrading to Amazon RDS for MariaDB 10.11
You have multiple options to upgrade your database to Amazon RDS for MariaDB 10.11: in-place upgrade, using a read replica and Amazon RDS Blue/Green Deployments.
We recommend using Amazon RDS Blue/Green Deployments to minimize risk and downtime during the database upgrade. This managed capability creates two database environments: your current production environment, the blue environment, and a staging environment, the green environment. These two environments remain in sync with each other using native logical replication, so you may safely test your changes on the staging (green) environment and promote when changes have been validated. During testing, we recommend that you test the application on green environment carefully, as enabling writes can result in replication conflicts leading to unexpected data in production environments.
In the example below, we show you how to upgrade an RDS for MariaDB 10.6 instance to MariaDB 10.11 using Amazon RDS Blue/Green Deployments. You can use the same process to upgrade from any prior RDS for MariaDB version (e.g. 10.3, 10.4, 10.5) to version 10.11.
RDS Blue/Green Deployments support the AWS Command line interface (AWS CLI), as well as the AWS Management Console. For this post, we use AWS CLI commands to upgrade our cluster.
You can set the Blue/Green Deployment identifier and the parameters of your database (e.g., source ARN, engine version, and DB cluster parameter group) to be modified for the staging environment. In the following example, we have a MariaDB 10.6.14 version and we create the Amazon RDS Blue/Green Deployment using the following AWS CLI command:
Amazon RDS Blue/Green Deployments automatically creates a staging environment and run automated tasks to prepare the database. After the Blue/Green Deployment is complete, we have an environment that is ready for testing and promotion, after the validation.
Note that you will be charged for the resources created in the staging environment, including Multi-AZ deployments and any other features such as Amazon RDS Performance Insights that may have been enabled in the production environment.
You can also do the same deployment via the AWS Console by selecting the database and choosing “Create Blue/Green Deployment” on the Actions drop-down menu. For more information, follow the guidance in the New-Fully Managed Blue/Green Deployments in Amazon Aurora and Amazon RDS.
After you validate your applications in the green environment, you can use the following command to switch over the production environment:
Conclusion
In this post, we benchmarked Amazon RDS for MariaDB 10.11, and observed up to 47.25% improvement in transaction throughput compared to MariaDB 10.6. We also discussed new features introduced in Amazon RDS for MariaDB 10.11. Finally, we outlined how to upgrade to Amazon RDS for MariaDB 10.11 with minimal downtime, using Amazon RDS Blue/Green Deployments. We encourage you to try Amazon RDS for MariaDB 10.11 to test the performance improvements and feature innovations with your workload.
About the author
Sai Kiran Kshatriya is an Amazon Web Services Database Specialist Solutions Architect who specializes in Relational Database Engines. He provides Technical Assistance, operational, and database practices to customers.