AWS Database Blog

How to encrypt database columns with no impact on your application using AWS DMS and Baffle

AWS offers a wealth of security features to protect its infrastructure and services, such as AWS Identity and Access Management (IAM) and AWS Key Management Service (AWS KMS). AWS Database Migration Service (AWS DMS) gives users a simple, automated way of migrating data from their existing databases to Amazon RDS. As part of this, AWS DMS uses some AWS security features and makes them available to our customers. For example, DMS already supports Secure Sockets Layer (SSL) encryption for database connections and encryption of data at rest using AWS KMS keys, among other security features.

Nonetheless, some enterprises face additional regulatory compliance mandates or specific security policies when migrating to databases in the cloud. These enterprises need to enable appropriate data protection measures within the cloud Shared Responsibility Model. For these users, Baffle provides a column-level encryption mechanism to protect their most sensitive data. Businesses can implement this mechanism without any application changes while using DMS.

This ability to encrypt without making application changes enables enterprises to add data privacy to existing and new applications. It ensures that application data is inaccessible to backend administrators and third parties. This application transparency and privacy protection extend to the tools DMS uses to move data between databases. When you work with these tools, your data can be encrypted as it’s migrated to Amazon RDS. Data can be encrypted without any custom integration with DMS and with little risk of data being exposed during and after its migration to AWS.

In this blog post, we go over how to use Baffle’s Advanced Data Protection solution to encrypt the databases that you are migrating to RDS. Baffle’s approach helps ensure that your data is never unprotected, whether in memory or at rest, while it’s in the cloud. As the following sections show, there’s virtually no change to the standard DMS migration workflow.

How it works

The primary component responsible for encrypting and decrypting data is the BaffleShield SQL proxy. It’s part of the Baffle Advanced Data Protection solution on AWS Marketplace. You can also obtain and license it directly from us at Baffle (info@baffle.io).

To ensure that the data is migrated to RDS in encrypted form, we deploy BaffleShield and make sure that the data being migrated is routed through BaffleShield. In the end, the data flow should look like the picture following.

[Figure: System Overview]

Deployment and use

To use Baffle Advanced Data Protection to encrypt data as it’s migrated to RDS, you need the following prerequisites:

  • The Java 1.8 runtime environment on the source database host
  • BaffleShield deployed on the source database host
  • A DMS replication instance

To help you familiarize yourself with using Baffle Advanced Data Protection with DMS migration, we have created an AWS CloudFormation template that satisfies the prerequisites listed preceding. This template includes the following:

  • A MySQL 5.7 database on an Amazon EC2 instance that contains sample sensitive data
  • A BaffleShield installation ready for customization
  • An RDS for MySQL instance to serve as the migration target
  • A DMS replication instance to perform the migration

You can locate the template on GitHub and download it by using the following command:

git clone https://github.com/awslabs/aws-database-migration-tools

Before we begin the migration process, we first log in to the source database to verify that it was properly set up. To identify the EC2 instance, we navigate to the Resources tab for the CloudFormation stack that we just created. We then click the link next to SourceDatabase, shown in the following screenshot.

[Screenshot of the SourceDatabase field on Resources tab]

Clicking the link takes us to the EC2 dashboard, which provides details about the new EC2 instance that CloudFormation created. We copy the public IP address for this instance and paste it as the host name in MySQL Workbench. The user name to connect with is root. When prompted, use BaffleDMSDemo as the password.

[Screenshot of MySQL Workbench to connect to database]

After connecting, execute a SELECT * command against the demo.billing table to reveal its columns, which contain the credit card transaction information required for billing customers. Credit card transactions are generally considered sensitive information. To be compliant with the Payment Card Industry Data Security Standard (PCI DSS), most columns in this table must be encrypted.

[Screenshot of query execution on MySQL Workbench]

We want to migrate this database to RDS and make sure that the sensitive data columns are encrypted. To do so, we need to make BaffleShield a proxy for RDS so that BaffleShield can encrypt any data that DMS migrates to RDS.

Making BaffleShield a proxy

The first step in this process is to get the host information for the RDS instance. Again, we can find the RDS instance created by CloudFormation in the Resources tab for the stack. Alternatively, we can go straight to the RDS console, where we can find a DB instance that matches the name that we specified when we created the stack. Using either approach, we can navigate to the connection information for the RDS instance and copy the host name.

[Screenshot of RDS instance details]

We need to paste this RDS host name into the start_bs script where the MYSQL_SERVER_HOST variable is being defined. To do so, use a Secure Shell (SSH) client configured to use the key pair specified when creating the CloudFormation stack. Use the client to connect to the IP address of the source database instance. The user name should be centos. After connecting to the source database instance, navigate to the /home/centos/baffle/baffle-manager directory and edit the start_bs file to set the RDS host name.
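If you prefer to script this edit rather than use an editor, the following Python sketch shows one way to do it. It assumes that start_bs defines the variable with a plain shell assignment of the form MYSQL_SERVER_HOST=value (optionally prefixed with export). The actual layout of the script may differ, so treat this as an illustration rather than a description of the file.

```python
import re


def set_mysql_server_host(script_text: str, rds_host: str) -> str:
    """Return script_text with the MYSQL_SERVER_HOST assignment set to rds_host.

    Assumption: the script contains a shell-style line such as
    `MYSQL_SERVER_HOST=...`, optionally prefixed with `export`.
    This layout is assumed, not taken from the real start_bs file.
    """
    pattern = re.compile(
        r"^([ \t]*(?:export[ \t]+)?MYSQL_SERVER_HOST=).*$", re.MULTILINE
    )
    # A function replacement avoids any backslash handling in rds_host.
    new_text, count = pattern.subn(lambda m: m.group(1) + rds_host, script_text)
    if count == 0:
        raise ValueError("MYSQL_SERVER_HOST assignment not found")
    return new_text
```

To use it, read /home/centos/baffle/baffle-manager/start_bs, pass its contents and the RDS host name that you copied to set_mysql_server_host, and write the result back to the same file.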

After saving the modified start_bs script, we also need to define the columns to encrypt. Typically, we do so by modifying the BafflePrivacySchema file. For this demonstration, the BafflePrivacySchema file is already configured to encrypt all columns of the billing table, except for TxnID and OrderDate. All columns except for these contain data that is considered sensitive by one or more compliance requirements. We can view the configuration by opening the file with an editor.

Be aware that you must define all columns of a table in BafflePrivacySchema even if a column isn’t encrypted. Unencrypted columns are identified with the value -1 after the column name.

[Screenshot of the schema structure]
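The rule that every column must be defined can be checked before BaffleShield is launched. The following Python sketch works on an in-memory mapping of column name to schema value, and makes no assumption about the on-disk format of BafflePrivacySchema, which this post doesn't describe. The only value it interprets is -1 (unencrypted). Any other value is simply treated as "encrypted", and the column name CardNumber in the usage note is hypothetical.

```python
UNENCRYPTED = -1  # the value that marks an unencrypted column, per the text above


def check_schema_entries(table_columns, schema_entries):
    """Split table_columns into (encrypted, unencrypted) lists.

    table_columns: column names of the table, in order.
    schema_entries: mapping of column name -> value from the schema
        (hypothetical in-memory form; the real file format is not assumed).
    Raises ValueError if any table column has no schema entry.
    """
    missing = [c for c in table_columns if c not in schema_entries]
    if missing:
        raise ValueError(f"columns missing from schema: {missing}")
    encrypted = [c for c in table_columns if schema_entries[c] != UNENCRYPTED]
    unencrypted = [c for c in table_columns if schema_entries[c] == UNENCRYPTED]
    return encrypted, unencrypted
```

For example, calling the function with the columns TxnID, OrderDate, and CardNumber and a mapping that omits CardNumber raises an error, which is the situation the rule above is meant to prevent.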

We can now launch BaffleShield by running the start_bs shell script.

With BaffleShield running, next we route the DMS migration process through BaffleShield so that the data is encrypted as it’s migrated to RDS. We can do this by making BaffleShield the destination endpoint for migration. Because BaffleShield is proxying for the RDS instance, any data being inserted into BaffleShield is sent by BaffleShield to the RDS instance, but in encrypted form.

To understand this, let’s go through the entire workflow. First, we set up the source endpoint in DMS as normal. Be sure to specify the replication instance that was created as part of the stack. The instance name should start with bafflemigrationdemodms.

[Screenshot of Create Endpoint pane for DMS]

Next, we set up the destination endpoint. However, instead of specifying the RDS server name and port, we specify the BaffleShield host name and port. Recall that we deployed BaffleShield on the source DB host, so the server name is the same as for the source endpoint, but the port is now 8444. (Be sure that the firewall rules allow the DMS instance to connect on port 8444.) This port is the default for BaffleShield, but you can change it in the configuration file if needed. Also, the user name and password should be the ones you created for the RDS instance because BaffleShield is proxying for RDS.

[Screenshot of Create Endpoint pane for DMS]
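The same destination endpoint can be described in code instead of in the console. The following Python sketch builds the parameters for the boto3 DMS create_endpoint call, pointing the target at BaffleShield on port 8444 rather than at RDS. The endpoint identifier is a hypothetical name chosen for this example, and the host name and credentials are whatever applies to your stack.

```python
def baffleshield_target_endpoint(shield_host, rds_username, rds_password, port=8444):
    """Build create_endpoint parameters that target BaffleShield instead of RDS.

    shield_host: host running BaffleShield (the source DB host in this demo).
    rds_username / rds_password: the RDS credentials, because BaffleShield
        is proxying for the RDS instance.
    port: 8444 is the BaffleShield default mentioned above.
    """
    return {
        "EndpointIdentifier": "baffleshield-target",  # hypothetical name
        "EndpointType": "target",
        "EngineName": "mysql",
        "ServerName": shield_host,
        "Port": port,
        "Username": rds_username,
        "Password": rds_password,
    }


def create_endpoint(params):
    """Create the endpoint in DMS; needs boto3 and AWS credentials to run."""
    import boto3  # imported here so the builder above works without boto3

    return boto3.client("dms").create_endpoint(**params)
```

Keeping the parameter builder separate from the API call makes it easy to review exactly what is sent to DMS before anything is created.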

After you create both endpoints, you should see the following in the endpoints section of the DMS console.

[Screenshot of the list of the DMS endpoints]

After we specify the source and destination endpoints, we can create a DMS task to migrate the data. We do so in the same way that any other DMS migration task is created, as seen in the screenshot following.

[Screenshot: Create a DMS migration task]
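For readers who script their migrations, the following Python sketch shows what an equivalent task definition might look like as parameters for the boto3 DMS create_replication_task call. The task identifier is a hypothetical name, and the full-load migration type is an assumption, because this walkthrough doesn't state which type was selected in the console. The selection rule includes only the demo.billing table used in this demonstration.

```python
import json


def billing_table_mappings(schema="demo", table="billing"):
    """Return DMS table mappings (a JSON string) that include one table."""
    rules = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-billing",  # hypothetical rule name
                "object-locator": {"schema-name": schema, "table-name": table},
                "rule-action": "include",
            }
        ]
    }
    return json.dumps(rules)


def replication_task_params(source_arn, target_arn, instance_arn):
    """Build create_replication_task parameters for the demo migration."""
    return {
        "ReplicationTaskIdentifier": "baffle-demo-task",  # hypothetical name
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,  # the BaffleShield endpoint
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": "full-load",  # assumption; not stated in the post
        "TableMappings": billing_table_mappings(),
    }
```

The resulting dictionary can be passed to boto3.client("dms").create_replication_task(**params) once the three ARNs are known.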

After you create the task, DMS begins the migration process, as with any other migration.

[Screenshot of the current DMS tasks]

After this process completes, we can connect to the RDS instance to verify that the data is, in fact, migrated and encrypted. In this case, we used MySQL Workbench to establish a connection to the RDS instance to dump the table. We can see that, other than the TxnID and the OrderDate columns, all columns are encrypted.

[Screenshot of MySQL Workbench]
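The visual check can also be expressed as a simple comparison. The following Python sketch assumes you have fetched the same row from the source database and directly from RDS as dictionaries keyed by column name. How you fetch the rows is up to you, and CardNumber in the usage note is a hypothetical column name. The function flags any column that doesn't match the expectation: TxnID and OrderDate unchanged, every other column different from its plaintext value.

```python
PLAINTEXT_COLUMNS = frozenset({"TxnID", "OrderDate"})  # left unencrypted in this demo


def check_migrated_row(source_row, target_row, plaintext_columns=PLAINTEXT_COLUMNS):
    """Return the names of columns that do not look as expected after migration.

    A plaintext column is a problem if its value changed.
    Any other column is a problem if the RDS value still equals the source
    value, which suggests that it was not encrypted.
    """
    problems = []
    for column, source_value in source_row.items():
        target_value = target_row.get(column)
        if column in plaintext_columns:
            if target_value != source_value:
                problems.append(column)
        elif target_value == source_value:
            problems.append(column)
    return problems
```

An empty list means the row looks as expected. A value that differs from the plaintext is only a sanity check, not proof of correct encryption.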

That’s it! As promised, there’s hardly any change to the standard DMS workflow. After we set up BaffleShield to proxy for the original target RDS instance and specify it as the destination endpoint for the DMS task, everything else just works as before. The only difference is that all of the sensitive data is encrypted in the RDS instance.

Baffle can help you secure your data as part of your cloud migration plans. Baffle can also help improve your security posture for your on-premises systems. To learn more about both these features, contact us at info@baffle.io.


About the Author

Min-Hank Ho leads engineering at Baffle, Inc. Prior to joining Baffle, he was responsible for the development of all cryptography-based features in the Oracle database, including Transparent Data Encryption, and has been awarded seven patents in the area of database security.