Automate the Amazon Aurora MySQL blue/green deployment process
Blue/green deployment techniques provide near zero-downtime release and rollback capabilities. This requires two identical environments that are running different versions of your application. This also extends to the underlying data layer, where you need two identical Amazon Aurora MySQL-Compatible Edition database clusters running in sync with the same dataset for seamless testing. After completing tests on the green environment, traffic is switched from the blue to the green environment.
The same technique can also be used for continuous integration and continuous delivery (CI/CD) pipelines when code changes and new features must be verified against existing online data. With CI/CD pipelines, you need multiple DB clusters for development, testing, quality assurance, and acceptance environments. You often also want to test code changes and new features against live data as it comes in. Changes to the underlying data model result in schema changes in the database. Schema changes must be thoroughly tested to mitigate the risk of application downtime. It is a best practice to avoid testing data definition language (DDL) operations, creation of databases indexes, schema migrations, or analytics work on your production cluster.
In this post, I walk you through the automation solution for blue/green deployments involving Amazon Aurora MySQL. The solution uses Aurora fast cloning, AWS Lambda functions, Amazon CloudWatch Events, and AWS Step Functions.
Aurora fast cloning
Aurora makes it easy to set up, operate, and scale a relational database in the cloud. An Aurora MySQL DB cluster consists of one or more DB instances and a cluster volume that manages the data for those DB instances. It also offers Aurora fast cloning, a feature to quickly and cost-effectively create a new cluster containing a duplicate of an Aurora cluster volume and all its data. Creating a clone is faster and more space-efficient than physically copying the data using a different technique, such as restoring a snapshot.
Aurora fast cloning is ideal for orchestrating blue/green deployments to minimize downtime for maintenance operations. In this scenario, the blue environment represents the production database serving production traffic. The green environment is created as a clone from the production cluster and synched with it. For Aurora MySQL, this is possible using a combination of Aurora fast cloning and binary log replication between blue and green clusters. To accomplish this, you first clone your production cluster (blue cluster); the clone (green cluster) is an exact copy of the blue cluster. Then you set up MySQL binlog replication between both clusters. This replicates all writes from the blue to the green cluster. You can use the clone (green cluster) to test DDL and DCL changes, database upgrades, or run workload-intensive operations, such as exporting data or running analytical queries with no impact to your production cluster.
You can incorporate this same methodology described into your CI/CD workflow to clone your production database cluster and sync the clone with your primary cluster using MySQL binlog replication.
With the green cluster synchronized, you can test the impact of your schema changes, new table design, additional table columns, databases indexes, and more in a matter of minutes with live data and without impacting your production database. To switch over production traffic, you set the blue cluster to read only, ensure that all writes have copied to the green cluster, and then proceed with production traffic.
The following solution offers complete automation for the Aurora blue/green deployment process (see the following architecture diagram).
To test the solution, you must have a source Aurora MySQL cluster already up and running (see Creating an Amazon Aurora DB cluster). This is typically your production cluster. For the rest of this post, we refer to the production cluster as the blue cluster and the clone as the green cluster. Before you start the solution, you must check if MySQL binary logging is enabled on the blue cluster environment and enable it if it’s currently disabled.
Check binary logging status
To check if MySQL binary logging is enabled, you can run either
select @@global.log_bin or
show master status against your blue cluster writer DB instance as follows:
If the output of
select @@global.log_bin is 1, binary logging is enabled. If output is 0, binary logging is disabled.
If the output of
show master status shows the binlog position similar to the following output, binary logging is enabled.
You can clean up the AWS resources created (Lambda functions, CloudWatch events, and Step Function state machine) by deleting the CloudFormation stack.
You can do this from the AWS CloudFormation console or the AWS CLI. To use the console, complete the following steps:
- On the Stacks page on the AWS CloudFormation console, choose the stack that you want to delete. The stack must be currently running.
- In the stack details pane, choose Delete.
- Choose Delete stack when prompted.
Alternatively, you can delete the stack using the AWS CLI delete-stack command:
Deleting the CloudFormation stack doesn’t delete the clone cluster created. The replication user you created is also not deleted on the blue and green clusters. You can drop the replication user from both clusters using the following command:
This post explained how to configure and start Aurora MySQL blue/green automation and switch over to the green cluster. The solution clones the blue cluster and designates the clone as the green cluster. It then sets up MySQL binary log replication between both clusters. This practice is particularly useful to test new application versions with changes to the underlying data store, including significant schema changes with minimal impact on the production cluster. It’s a best practice not to test schema changes against live production databases and instead use the blue green deployment technique to test changes with incoming live data.
When testing is complete and you’re ready to switch over, make sure writes are stopped on the blue cluster and that the green cluster has caught up (read all updates) with the blue cluster. Only then can you update the application connection string to point to the green cluster. We recommend taking a manual snapshot of your blue cluster before starting this automation. For more information on taking manual snapshots, see Creating a DB cluster snapshot.
This post builds up on the post Performing major version upgrades for Amazon Aurora MySQL with minimum downtime.
About the Author
Ahmed Gamaleldin is a Senior Technical Account Manager (TAM) at Amazon Web Services. Ahmed helps customers run optimized workloads on AWS and make the best out of their cloud journey.