AWS Database Blog

Deploy schema changes in an Amazon Aurora MySQL database with minimal downtime

Modifying the schema of a SQL database can be time-consuming, resource-intensive, and error-prone. It can also require long periods of downtime that negatively affects the end-user experience. In this post, I walk you through performing schema changes using Instant DDL and Amazon Relational Database Service (Amazon RDS) Blue/Green Deployments for Amazon Aurora MySQL-Compatible Edition with minimal downtime on production workloads.

Overview

Aurora MySQL version 3 is compatible with the MySQL 8.0 feature called Instant DDL. This feature significantly speeds up supported schema changes like adding a column, setting or dropping column default values, renaming a table, and more. Instant operations only modify metadata in the data dictionary. An exclusive metadata lock on the table may be taken briefly during the execution phase of the operation but table data is unaffected, making operations instantaneous.

For schema changes that cannot be done using the INSTANT algorithm, you can use Amazon RDS Blue/Green Deployments that lets you perform schema changes on staging environment that are isolated from the production system. With this feature, you can create a staging environment (Green) that is a topological copy of your current production environment (Blue). The Blue and Green environments stay in sync using binlog replication.

The following diagram illustrates RDS Blue/Green deployment architecture.Blue-green-architecture-diag

In the following sections, we demonstrate how to apply schema changes for an Aurora MySQL cluster using instant DDL and RDS Blue/Green Deployments.

Prerequisites

To implement this solution, you need a database in an Aurora cluster for which you want to perform schema changes. You can also create a new Aurora MySQL cluster and then load the employees dataset to follow along with this post.

Instant DDL with Aurora MySQL version 3

You can use the instant DDL feature by using the ALGORITHM=INSTANT clause of the ALTER TABLE statement. In the example below, we are going to connect to the Aurora MySQL cluster and ADD COLUMN to the employees table and verify the behavior. Make sure to test on your development and non-production environments before you perform ALTER TABLE operations in your production environment.

  1. Connect to the Aurora MySQL cluster as an existing user. In this post we use an admin account to connect to the cluster aurora-db using MySQL utility installed on AWS Cloud9 instance.
    mysql -h <clusterEndpoint> -u <admin> -p <masterpassword>

For example :

Test$ mysql -h aurora-db.cluster-xxxxxxx.us-east-1.rds.amazonaws.com -u admin -p employees
  1. You can run the following command by replacing the placeholders and adding a column to the end of the table instantly.
    ALTER TABLE tbl_name ADD COLUMN column_name column_definition, ALGORITHM=INSTANT;
  2. In the following example, we add the column email to the end of employees table which has approximately 300k rows.
    mysql> select count(*) from employees;
    +----------+
    | count(*) |
    +----------+
    | 300024 |
    +----------+
    
    1 row in set (1.85 sec))
    
    mysql> select * from employees limit 2;
    +--------+------------+------------+-----------+--------------+--------+------------+
    | emp_no | birth_date | first_name | last_name | phone_number | gender | hire_date |
    +--------+------------+------------+-----------+--------------+--------+------------+
    | 10001 | 1953-09-02 | Georgi      | Facello   | NULL         | M      | 1986-06-26 |
    | 10002 | 1964-06-02 | Bezalel     | Simmel    | NULL         | F      | 1985-11-21 |
    +--------+------------+------------+-----------+--------------+--------+------------+
    2 rows in set (0.09 sec)
    
    mysql> ALTER TABLE employees ADD COLUMN email varchar(50), ALGORITHM = INSTANT;
    Query OK, 0 rows affected (0.12 sec)
    Records: 0 Duplicates: 0 Warnings: 0
    
    mysql> select * from employees limit 2;
    
    +--------+------------+------------+-----------+--------------+-------+----------+-------+
    | emp_no | birth_date | first_name | last_name | phone_number | gender| hire_date| email|
    +--------+------------+------------+-----------+--------------+-------+----------+-------+
    | 10001 | 1953-09-02 | Georgi      | Facello   | NULL         | M     | 1986-06-26| NULL|
    | 10002 | 1964-06-02 | Bezalel     | Simmel    | NULL         | F     | 1985-11-21| NULL|
    +--------+------------+------------+-----------+--------------+--------+----------+-------+
    2 rows in set (0.09 sec)

Instant DDL is only available for a limited subset schema change operations. See the full breakdown in the MySQL documentation.

For changes such as the following, you may consider RDS Blue/Green Deployments.

  • Adding and modifying indexes
  • Partitioning a table
  • Optimizing and rebuilding a table to reclaim storage space
  • Other operations that require a substantial reorganization of the table

Let’s look at some code examples of various schema changes that can be done using RDS Blue/Green Deployments.

Create a RDS Blue/Green Deployment for your Aurora MySQL cluster

The Aurora MySQL cluster must have binary logging enabled for replication from the blue environment to the green environment. While any binlog format works, we recommend ROW to reduce the risk of replication inconsistencies. You can enable binary logs by setting the binlog_format to ROW through a custom DB cluster parameter group. Note that this parameter is a static parameter and the nodes in the cluster will require a manual reboot to reflect the parameter change.

We can use an AWS CLI command to verify the status of the parameter group with the nodes in the cluster. In the following example, the in-sync value for DBClusterParameterGroupStatus indicates that the change has been applied:

Test$ aws rds describe-db-clusters --db-cluster-identifier aurora-db --query 'DBClusters[*].DBClusterMembers'
[
    [
        {
            "DBInstanceIdentifier": "aurora-db-writer",
            "IsClusterWriter": true,
            "DBClusterParameterGroupStatus": "in-sync",
            "PromotionTier": 1
        }
    ]
]

Once the binlog_format is modified, you can now create the Blue/Green Deployment for Aurora MySQL cluster through the AWS CLI, Amazon RDS API, or console. For more information, refer to Creating the Blue/Green Deployments.

For this post, we use an AWS Cloud9 instance to access the AWS CLI and Aurora MySQL cluster. See the following example:

aws rds create-blue-green-deployment \
    --blue-green-deployment-name aurora-blue-green-deployment \
    --source arn:aws:rds:us-east-1:xxxxxxxx:cluster:aurora-db \
    --target-db-cluster-parameter-group binlog-enable-version-3 \
    --target-db-parameter-group-name default.aurora-mysql8.0

To check the source and target clusters of the Blue/Green Deployment, run the following command:

aws rds describe-blue-green-deployments \
 --Blue-green-deployment-identifier <resource identifier> \
 --query 'blueGreenDeployments[*].[Source, Target]'

You can retrieve the Blue/Green Deployment identifier by running the following AWS CLI command:

aws rds describe-blue-green-deployments \
--filters Name=blue-green-deployment-name,Values=<name of your Blue/Green deployment> \
--query 'BlueGreenDeployments[*].[BlueGreenDeploymentIdentifier]

For example:

Test$ aws rds describe-blue-green-deployments \
--filters Name=blue-green-deployment-name,Values=aurora-blue-green-deployment \
--query 'BlueGreenDeployments[*].[BlueGreenDeploymentIdentifier]'

[
    [
        "bgd-3n9xxxxxxxizv69"
    ]
]

Test$ aws rds describe-blue-green-deployments \
--blue-green-deployment-identifier bgd-3n9xxxxxxxizv69 \
--query 'BlueGreenDeployments[*].[Source, Target]'
[
    [
        "arn:aws:rds:us-east-1:12345678912:cluster:aurora-db",
        "arn:aws:rds:us-east-1:12345678912:cluster:aurora-db-green-ikgjlt"
    ]
]

Connect to the Green Aurora MySQL cluster

Connect to the clone Aurora MySQL database in the Green environment as an existing user. The native user credentials remain the same as Blue. However, IAM DB authentication credentials are not mirrored to Green cluster. If you are using RDS IAM authentication, set up IAM credentials for the Green Aurora MySQL cluster separately.

In this example, we use the admin account that has create and alter table privileges for employees database to connect to the cluster aurora-db-green-ikfake. We strongly recommend that you do not use the master user directly. Instead, adhere to the best practice of using a database user created with the minimal privileges required for this operation.

mysql -h <clusterEndpoint> -u <admin> -p <masterpassword>

See the following example:

Test-green$ mysql -h aurora-db-green-ikfake.cluster-xxxxxxx.us-east-1.rds.amazonaws.com -u admin -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 38
Server version: 8.0.23 Source distribution

Let’s check the tables that are available in our employees database:

Mysql-green> use employees;

Database changed

mysql-green> show tables;
+----------------------+
| Tables_in_employees  |
+----------------------+
| current_dept_emp     |
| departments          |
| dept_emp             |
| dept_emp_latest_date |
| dept_manager         |
| employees            |
| salaries             |
| titles               |
+----------------------+
8 rows in set (0.08 sec)

Apply schema changes in the Green environment

Rebuilding a table

Rebuilding a table helps optimize the underlying storage and reclaim the unused space lost due to fragmentation. This can not only reduce storage costs, but may also improve query performance.

In the following example, we are going to rebuild a table to reclaim the space lost due to fragmentation. We use the Blue cluster and the salaries table in the employees database that we loaded as a prerequisite.

Test-blue$ mysql -h aurora-db-cluster.cluster-cxxxxxxxxx.us-east-1.rds.amazonaws.com -u admin -p employees
Enter password:
Server version: 8.0.26 Source distribution
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.  

There are approximately 2.8 million rows in the salaries table and the space occupied by this table is approximately 95 MB.

Mysql-blue> select count(*) from salaries;

+----------+
| count(*) |
+----------+
| 2844047 |
+----------+
1 row in set (0.33 sec)

Let’s check the fragmented space for the table after updating statistics by running the ANALYZE TABLE command on salaries table.

Mysql-blue> analyze table salaries;

+--------------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+--------------------+---------+----------+----------+
| employees.salaries | analyze | status | OK |
+--------------------+---------+----------+----------+

1 row in set (0.12 sec)

We see the DATAFREE is 4 MB by running the following SQL command.

Mysql-blue> SELECT
table_schema as 'DATABASE',
table_name as 'TABLE',
CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 ), 2), 'M') 'TOTAL',
CONCAT(ROUND(data_free / ( 1024 * 1024 ), 2), 'M') 'DATAFREE'
FROM information_schema.TABLES
where table_schema='employees' and table_name='salaries';

+-----------+----------+--------+----------+
| DATABASE | TABLE | TOTAL | DATAFREE |
+-----------+----------+--------+----------+
| employees | salaries | 95.63M | 4.00M |
+-----------+----------+--------+----------+
1 row in set (0.09 sec)

As the rebuild operation locks the table making it inaccessible to the user, we can run the rebuild operation on the salaries table in the Green environment, isolating it from the Blue environment.

After logging into the Green environment, run the following ALTER TABLE command to start the rebuild to reflect the actual size of the table. The duration of the rebuild will depend on the size of the table. The larger the table size, the longer the ALTER TABLE runs.

Test$ mysql -h aurora-db-cluster-green-ikfake.cluster-cxxxxxxxx.us-east-1.rds.amazonaws.com -u admin -p employees

Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 314
Server version: 8.0.26 Source distribution

Mysql-green> ALTER TABLE salaries engine=innodb;
Query OK, 0 rows affected (4.39 sec)
Records: 0 Duplicates: 0 Warnings: 0

Let’s check the size of the table by running the following code. We see the table size increased to 109 MB.

Mysql-green > analyze table salaries;

+--------------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+--------------------+---------+----------+----------+
| employees.salaries | analyze | status | OK |
+--------------------+---------+----------+----------+
1 row in set (0.11 sec)

Mysql-green > SELECT
table_schema as 'DATABASE',
table_name as 'TABLE',
CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 ), 2), 'M') 'TOTAL',
CONCAT(ROUND(data_free / ( 1024 * 1024 ), 2), 'M') 'DATAFREE'
FROM information_schema.TABLES
where table_schema='employees' and table_name='salaries';

+-----------+----------+---------+----------+
| DATABASE | TABLE | TOTAL | DATAFREE |
+-----------+----------+---------+----------+
| employees | salaries | 109.63M | 2.00M |
+-----------+----------+---------+----------+
1 row in set (0.09 sec)

Now, delete 500,000 records from the salaries table in the Blue environment to demonstrate how deletion of records can cause fragmented space.

Test-blue$ mysql -h aurora-db-cluster.cluster-cxxxxxxxxx.us-east-1.rds.amazonaws.com -u admin -p employees

Enter password:
Server version: 8.0.26 Source distribution
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql-blue> delete from salaries where emp_no>209071 order by emp_no limit 500000;
Query OK, 500000 rows affected (7.39 sec)

We follow the same steps as before by logging into the Green environment. Verify the fragmented space.

Test-green$ mysql -h aurora-db-cluster-green-ikfake.cluster-cxxxxxxxx.us-east-1.rds.amazonaws.com -u admin -p employees

Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 314
Server version: 8.0.26 Source distribution

mysql-green> analyze table salaries;

+--------------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+--------------------+---------+----------+----------+
| employees.salaries | analyze | status | OK |
+--------------------+---------+----------+----------+
1 row in set (0.10 sec)

After deletion of 500,000 records in Blue cluster, we see the DATAFREE increased to 12 MB from 2MB and table size decreased from 109 MB to 99 MB in Green cluster.

Mysql-green> SELECT
-> table_schema as 'DATABASE',
-> table_name as 'TABLE',
-> CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 ), 2), 'M') 'TOTAL',
-> CONCAT(ROUND(data_free / ( 1024 * 1024 ), 2), 'M') 'DATAFREE'
-> FROM information_schema.TABLES
-> where table_schema='employees' and table_name='salaries';

+-----------+----------+--------+----------+
| DATABASE | TABLE | TOTAL | DATAFREE |
+-----------+----------+--------+----------+
| employees | salaries | 99.61M | 12.00M |
+-----------+----------+--------+----------+
1 row in set (0.09 sec)

To reclaim the free space shown in DATAFREE, we need to rebuild the table again, using the following ALTER TABLE command.

Mysql-green> alter table salaries engine=innodb;

Query OK, 0 rows affected (3.03 sec)
Records: 0 Duplicates: 0 Warnings: 0

After running the following SQL commands, we see that the space has been reclaimed. Salaries table size is now 89 MB and DATAFREE is 3 MB.

Mysql-green> analyze table salaries;

+--------------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+--------------------+---------+----------+----------+
| employees.salaries | analyze | status | OK |
+--------------------+---------+----------+----------+
1 row in set (0.10 sec)

Mysql-green> SELECT table_schema as 'DATABASE', table_name as 'TABLE', CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 ), 2), 'M') 'TOTAL', CONCAT(ROUND(data_free / ( 1024 * 1024 ), 2), 'M') 'DATAFREE' FROM information_schema.TABLES where table_schema='employees' and table_name='salaries';

+-----------+----------+--------+----------+
| DATABASE | TABLE | TOTAL | DATAFREE |
+-----------+----------+--------+----------+
| employees | salaries | 89.61M | 3.00M |
+-----------+----------+--------+----------+

1 row in set (0.09 sec)

Reorganize a partition

Most tables require database maintenance, such as reorganizing the partitions, which may lock the partition until the ALTER operation is complete. Therefore, we’re going to reorganize a partition for table titles on the Green environment and then verify replication by inserting a few records on the Blue environment.

If you’re following along with this post using the employees database, to prepare the Blue environment, we drop the titles table and recreate it with partitions on the Blue environment and load the data into that table again. See the following code:

Mysql-blue> drop table titles;
Query OK, 0 rows affected (0.10 sec)

mysql-blue> CREATE TABLE titles (
    emp_no      INT             NOT NULL,
    title       VARCHAR(50)     NOT NULL,
    from_date   DATE            NOT NULL,
    to_date     DATE,
    #FOREIGN KEY (emp_no) REFERENCES employees (emp_no) ON DELETE CASCADE,
    PRIMARY KEY (emp_no,title, from_date)
); 
Query OK, 0 rows affected (0.11 sec)

mysql-blue> ALTER TABLE titles
partition by range COLUMNS (from_date)
(
    partition p01 values less than ('1993-12-31'),
    partition p02 values less than ('2002-12-31'),
    partition p03 values less than (MAXVALUE)
);
Query OK, 0 rows affected (0.18 sec)

mysql> source load_titles.dump;
Query OK, 15401 rows affected (0.64 sec)

Now let’s reorganize the first partition to expand on the range of years on the Green environment by running the following SQL command:

Mysql-green> ALTER TABLE titles
REORGANIZE PARTITION p01 INTO
(
    partition s01 values less than ('1985-12-31'),
    partition s02 values less than ('1986-12-31'),
    partition s03 values less than ('1987-12-31'),
    partition s04 values less than ('1988-12-31'),
    partition s05 values less than ('1989-12-31'),
    partition s06 values less than ('1990-12-31'),
    partition s07 values less than ('1991-12-31'),
    partition s08 values less than ('1992-12-31'),
    partition s09 values less than ('1993-12-31')
);
Query OK, 0 rows affected (1.50 sec)

For this example, we update a record in the p01 partition in the titles table on the Blue environment to see if it’s replicating to the Green environment without errors even though it’s been reorganized:

mysql-blue> select * from titles where emp_no in (10013);
+--------+-----------------+------------+------------+
| emp_no | title           | from_date  | to_date    |
+--------+-----------------+------------+------------+
|  10013 | Senior Staff    | 1985-10-20 | 9999-01-01 |
+--------+-----------------+------------+------------+

mysql-blue> UPDATE titles SET title = 'Senior Engineer'
WHERE emp_no = 10013 and from_date='1985-10-20';

Query OK, 2 rows affected (0.08 sec)

Check for replication status using show replica status\G; and run the following SQL command to verify if the data is replicated to Green environment:

Mysql-green> select * from titles where emp_no in (10013);
+--------+-----------------+------------+------------+
| emp_no | title           | from_date  | to_date    |
+--------+-----------------+------------+------------+
|  10013 | Senior Engineer | 1985-10-20 | 9999-01-01 |
+--------+-----------------+------------+------------+

Switch over a Blue/Green deployment

Once we complete the schema changes on the Green environment, we are now ready to switch over. We recommend you switch your databases in the green environment to read only by setting read_only database parameter to 1 in the DB cluster parameter group to avoid replication conflicts or unintended data in Green environment.

A switchover promotes the DB cluster in the Green environment to be the new production DB cluster and as part of this process, RDS renames the DB cluster and DB instances in the Green environment to match the corresponding DB cluster and DB instances in the Blue environment.

You can specify a switchover timeout , your maximum tolerable downtime period between 30 seconds and 3,600 seconds (1 hour). If RDS cannot complete the switchover within the specified timeout, then it is cancelled and production traffic continues to flow to the Blue Aurora MySQL cluster. The default timeout period is 300 seconds (5 minutes).

When you start the switchover process, Amazon RDS runs some basic checks to test the readiness of the Blue and Green environment known as switchover guardrails. These are checks that prevent switchover operation if the environments aren’t ready for it.

We strongly recommend that you adhere to best practices before planning to switch over. When you’re ready, run the following AWS CLI command, and specify the time limit for the switchover in seconds (the default is 300):

aws rds switchover-blue-green-deployment \
--blue-green-deployment-identifier <your Blue/Green deployment identifier> \
--switchover-timeout <value>

For example:

aws rds switchover-blue-green-deployment \
--blue-green-deployment-identifier bgd-3n9xxxxxxxizv69 \
--switchover-timeout 600

To verify if your switchover is progressing, you can check using the following code:

aws rds describe-blue-green-deployments \
--blue-green-deployment-identifier <your Blue/Green deployment identifier> \
--query 'BlueGreenDeployments[*].SwitchoverDetails[*].Status[]'

For example:

Test $ aws rds describe-blue-green-deployments \
--blue-green-deployment-identifier bgd-3n9xxxxxxxizv69 \
--query 'BlueGreenDeployments[*].SwitchoverDetails[*].Status[]'

[
    "SWITCHOVER_IN_PROGRESS",
    "SWITCHOVER_IN_PROGRESS"
]

When the switchover is complete, we can observe that the DB cluster and instances in the Green environment are renamed to match the DB cluster and instances in the Blue environment so that application changes aren’t required. It is your responsibility to have applications detect this DNS name change after switchover to Green environment and route production traffic.

See the following code:

Test$ aws rds describe-blue-green-deployments --blue-green-deployment-identifier bgd-3n9xxxxxxxizv69 --query 'BlueGreenDeployments[*].SwitchoverDetails[*].Status[]'
[
    "SWITCHOVER_COMPLETED",
    "SWITCHOVER_COMPLETED"
]

After switching over, Blue/Green Deployments do not delete your old production environment. RDS renames the DB cluster and DB instances in the Blue environment by appending -oldn to the current name, where n is a number. You may access it for additional validations and performance/regression testing, if needed.

Test$ aws rds describe-blue-green-deployments --blue-green-deployment-identifier bgd-3n9xxxxxxxizv69 --query 'BlueGreenDeployments[*].[Source, Target]'
[
    [
        "arn:aws:rds:us-east-1:12345678912:cluster:aurora-db-old1",
        "arn:aws:rds:us-east-1:12345678912:cluster:aurora-db"
    ]
]

It is your responsibility to delete the old production environment when you no longer need it. Standard billing charges apply on old production instances until you delete them.

Limitations

In addition to the limitations for Blue/Green Deployments, we need to account for binlog replication limitations when considering schema changes:

Now let’s look at a use case where a schema change is not possible using the RDS Blue/Green deployment feature where we encounter a limitation with binlog replication.

Add columns to the middle of the table

In this scenario, I created another Blue/Green deployment for a new Aurora cluster with employees dataset. I am running the following ALTER TABLE statement on the employees table on the Green Aurora MySQL cluster which adds a new column phone_number that comes between columns common to both versions of the employees table. See the following code:

Mysql-green> ALTER TABLE employees
ADD COLUMN phone_number varchar(15) AFTER last_name;
Query OK, 0 rows affected (0.95 sec)

Now I insert a record into the employees table on the Blue environment

Mysql-blue> Insert into employees values
(500000,'1987-05-01','John','Doe','M','2000-11-30');

To review replication status on the Green environment, I run show replica status\G SQL command on the Green Aurora MySQL cluster that shows the following error:

Last_SQL_Error: Column 4 of table 'employees.employees' cannot be converted from type 'enum' to type 'varchar(60(bytes) utf8mb4)'

We are unable to switchover due to the replication error we encountered. Hence, it’s essential for you to plan schema changes with the tools that best fit the use case.

Clean up

To clean up after switchover, delete the database instances in the Blue environment. If you created a new Aurora MySQL cluster to follow along with this post, delete the Blue/Green Deployment object and Green environment as well. For instructions, see Deleting Aurora DB clusters and DB instances.

Conclusion

In this post, we covered how you can implement schema changes using INSTANT DDL and Amazon RDS Blue/Green Deployments in Amazon Aurora MySQL-Compatible Edition with minimal impact. You can apply the same approach to Amazon RDS for MySQL and RDS for MariaDB instances. Get started with RDS Blue/Green Deployments via the Amazon RDS Console.

We welcome your feedback; leave your comments or questions in the comments section.


About the author

Adarsha Kuthuru is a Database Specialist Solutions Architect at Amazon Web Services. She works with customers to design scalable, highly available and secure solutions in the AWS Cloud. Outside of work, you can find her painting, reading or hiking in Pacific North West.