AWS Database Blog
How to encrypt database columns with no impact on your application using AWS DMS and Baffle
AWS offers a wealth of security features to protect its infrastructure and services, such as AWS Identity and Access Management (IAM) and AWS Key Management Service (AWS KMS). AWS Database Migration Service (AWS DMS) gives users a simple, automated way of migrating data from their existing databases to Amazon RDS. As part of this, AWS DMS uses some AWS security features and makes them available to our customers. For example, DMS already supports Secure Sockets Layer (SSL) encryption for database connections and encryption of data at rest using AWS KMS keys, among other security features.
Nonetheless, some enterprises face additional regulatory compliance mandates or specific security policies when migrating to databases in the cloud. These enterprises need to enable appropriate data protection measures within the cloud Shared Responsibility Model. For these users, Baffle provides a column-level encryption mechanism to protect their most sensitive data. Businesses can implement this mechanism without any application changes while using DMS.
This ability to encrypt without making application changes enables enterprises to add data privacy to existing and new applications. It ensures that any application data is inaccessible by backend administrators or third parties. This application transparency and privacy protection extend to the tools DMS uses to move data between databases. When you work with these tools, your data can be encrypted as it’s migrated to Amazon RDS. Data can be encrypted without any custom integration with DMS and with little risk of data being exposed during and after its migration to AWS.
In this blog post, we go over how to use Baffle’s Advanced Data Protection solution to encrypt the databases that you are migrating to RDS. Baffle’s approach helps ensure that your data is never unprotected, whether in memory or at rest, while it’s in the cloud. As you can see following, there’s virtually no change to the standard DMS migration workflow.
How it works
The primary component responsible for encrypting and decrypting data is the BaffleShield SQL proxy. It’s part of the Baffle Advanced Data Protection solution on AWS Marketplace. You also can directly obtain and license it from us at Baffle (info@baffle.io).
To ensure that the data is migrated to RDS in encrypted form, we deploy BaffleShield and make sure that the data being migrated is routed through BaffleShield. In the end, the data flow should look like the picture following.
Deployment and use
To use Baffle Advanced Data Protection to encrypt data as it’s migrated to RDS, you need the following prerequisites:
- The Java 1.8 runtime environment on the source database host
- BaffleShield deployed on the source database host
- A DMS replication instance
To help you familiarize yourself with using Baffle Advanced Data Protection with DMS migration, we have created an AWS CloudFormation template that satisfies the prerequisites listed preceding. This template includes the following:
- A MySQL 5.7 database on an Amazon EC2 instance that contains sample sensitive data
- A BaffleShield installation ready for customization
- An RDS for MySQL instance to serve as the migration target
- A DMS replication instance to perform the migration
You can locate the template on GitHub and download it by using the following command, replacing the highlighted information with your own:
Before we begin the migration process, we first log in to the source database to verify that it was properly set up. To identify the EC2 instance, we navigate to the Resources tab for the CloudFormation stack that we just created. We then click the link next to SourceDatabase, shown in the following screenshot.
Clicking the link takes us to the EC2 dashboard, which provides details about the new EC2 instance that CloudFormation created. We copy the public IP address for this instance and paste it as the host name for MySQL Workbench to connect to. The user name to connect with is root. When prompted, use BaffleDMSDemo as the password.
After connecting, execute a select * command against the demo.billing table to reveal columns that contain credit card transaction information required for billing customers. Needless to say, credit card transactions are generally considered sensitive information. To be compliant with the Payment Card Industry Data Security Standard (PCI DSS), most columns in this table must be encrypted.
We want to migrate this database to RDS and make sure that the sensitive data columns are encrypted. To do so, we need to make BaffleShield become a proxy for RDS so that BaffleShield can encrypt any data being migrated by DMS to RDS.
Making BaffleShield a proxy
The first step in this process is to get the host information for the RDS instance. Again, we can find the RDS instance created by CloudFormation in the Resources tab for the stack. Alternatively, we can go straight to the RDS console, where we can find a DB instance that matches the name that we specified when we created the stack. Using either approach, we can navigate to the connection information for the RDS instance and copy the host name.
We need to paste this RDS host name into the start_bs script where the MYSQL_SERVER_HOST variable is being defined. To do so, use a Secure Shell (SSH) client configured to use the key pair specified when creating the CloudFormation stack. Use the client to connect to the IP address of the source database instance. The user name should be centos. After connecting to the source database instance, navigate to the /home/centos/baffle/baffle-manager directory and edit the start_bs file to set the RDS host name.
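If you prefer to script this edit instead of using an editor, the following Python sketch shows one way to do it. It assumes that start_bs assigns the variable on a single line of the form MYSQL_SERVER_HOST=value (optionally prefixed with export); check the actual file before relying on it. The function name set_rds_host is our own and is not part of BaffleShield.

```python
import re


def set_rds_host(script_text: str, rds_host: str) -> str:
    """Return script_text with MYSQL_SERVER_HOST pointing at rds_host.

    Assumption: the script defines the variable on its own line as
    MYSQL_SERVER_HOST=<value>, optionally preceded by "export".
    """
    pattern = re.compile(
        r"^([ \t]*(?:export[ \t]+)?MYSQL_SERVER_HOST=).*$", re.MULTILINE
    )
    # A function replacement avoids any backslash handling in rds_host.
    new_text, count = pattern.subn(lambda m: m.group(1) + rds_host, script_text)
    if count == 0:
        raise ValueError("MYSQL_SERVER_HOST assignment not found")
    return new_text
```

To apply it, read /home/centos/baffle/baffle-manager/start_bs into a string, pass it through set_rds_host with the RDS host name you copied, and write the result back to the same file.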
After saving the modified start_bs script, we also need to define the columns to encrypt. Typically, we do so by modifying the BafflePrivacySchema file. For this demonstration, the BafflePrivacySchema file is already configured to encrypt all columns of the billing table, except for TxnID and OrderDate. All columns except for these contain data that is considered sensitive by one or more compliance requirements. We can view the configuration by opening the file with an editor.
Be aware that you must define all columns of a table in BafflePrivacySchema even if a column isn’t encrypted. Unencrypted columns are identified with the value -1 after the column name.
We can now launch BaffleShield by running the start_bs shell script.
With BaffleShield running, next we route the DMS migration process through BaffleShield so that the data is encrypted as it’s migrated to RDS. We can do this by making BaffleShield the destination endpoint for migration. Because BaffleShield is proxying for the RDS instance, any data being inserted into BaffleShield is sent by BaffleShield to the RDS instance, but in encrypted form.
To understand this, let’s go through the entire workflow. First, we set up the source endpoint in DMS as normal. Be sure to specify the replication instance that was created as part of the stack. The instance name should start with bafflemigrationdemodms.
Next, we set up the destination endpoint. However, instead of specifying the RDS server name and port, we specify the BaffleShield host name and port. If you recall, we deployed BaffleShield on the source DB host, so the server name is the same as for the source endpoint, but the port is now 8444. (Be sure that the firewall rules allow the DMS instance to connect on port 8444.) This port is the default for BaffleShield, but you can change it in the configuration file if needed. Also, the user name and password should be the ones you created for the RDS instance because BaffleShield is proxying for RDS.
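This post uses the DMS console, but the same destination endpoint could also be created programmatically. The following is a minimal sketch, assuming the boto3 SDK and its DMS create_endpoint call. The endpoint identifier and the helper names are hypothetical; the host, user name, and password are placeholders for your own values.

```python
def baffleshield_target_endpoint(shield_host, rds_user, rds_password, port=8444):
    """Build keyword arguments for boto3's DMS create_endpoint call."""
    return {
        "EndpointIdentifier": "baffleshield-target",  # hypothetical name
        "EndpointType": "target",
        "EngineName": "mysql",
        "ServerName": shield_host,  # the BaffleShield host, not the RDS host
        "Port": port,               # 8444 is the BaffleShield default in this post
        "Username": rds_user,       # RDS credentials: BaffleShield proxies for RDS
        "Password": rds_password,
    }


def create_endpoint(params, region_name):
    """Send the request to DMS. Requires boto3 and AWS credentials."""
    import boto3  # imported here so building the parameters needs no SDK

    client = boto3.client("dms", region_name=region_name)
    return client.create_endpoint(**params)
```

Keeping the parameters in a plain dictionary makes it easy to review that ServerName and Port point at BaffleShield before any request is sent.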
After you create both endpoints, you should see the following in the endpoints section of the DMS console.
After we specify the source and destination endpoints, we can create a DMS task to migrate the data. We do so in the same way that any other DMS migration task is created, as seen in the screenshot following.
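For readers who script their migrations, here is a hedged sketch of the equivalent task definition for boto3's DMS create_replication_task call. It assumes a full-load migration of only the demo.billing table; the task identifier, rule name, and helper names are hypothetical, and the ARNs are placeholders that you would take from your own endpoints and replication instance.

```python
import json


def billing_table_mappings(schema="demo", table="billing"):
    """Return a DMS table-mapping document that selects a single table."""
    rules = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-billing",  # hypothetical rule name
                "object-locator": {"schema-name": schema, "table-name": table},
                "rule-action": "include",
            }
        ]
    }
    return json.dumps(rules)


def migration_task_params(source_arn, target_arn, instance_arn):
    """Build keyword arguments for boto3's DMS create_replication_task call."""
    return {
        "ReplicationTaskIdentifier": "baffle-billing-migration",  # hypothetical
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,  # the BaffleShield endpoint
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": "full-load",  # assumption: one-time load, no ongoing replication
        "TableMappings": billing_table_mappings(),
    }
```

Nothing in the task refers to Baffle: the only Baffle-specific piece is the target endpoint that the task points to.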
After you create the task, DMS begins the migration process, as with any other migration.
After this process completes, we can connect to the RDS instance to verify that the data is, in fact, migrated and encrypted. In this case, we used MySQL Workbench to establish a connection to the RDS instance to dump the table. We can see that, other than the TxnID and the OrderDate columns, all columns are encrypted.
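The same check can be automated. The sketch below assumes you have already fetched the rows from the source database and directly from the RDS instance (not through BaffleShield) with a MySQL client of your choice, as lists of dictionaries in the same row order, for example ordered by TxnID. The function name check_encryption is our own, and the sketch simply treats any value that is unchanged in RDS as unencrypted.

```python
def check_encryption(source_rows, target_rows, clear_columns=("TxnID", "OrderDate")):
    """Compare source and migrated rows; return a list of problems found.

    Columns in clear_columns must match exactly. Every other column is
    expected to differ, because it should be stored in encrypted form.
    """
    problems = []
    if len(source_rows) != len(target_rows):
        problems.append("row counts differ")
    for index, (src, tgt) in enumerate(zip(source_rows, target_rows)):
        for column, src_value in src.items():
            tgt_value = tgt.get(column)
            if column in clear_columns:
                if tgt_value != src_value:
                    problems.append(f"row {index}: {column} should match the source")
            elif tgt_value == src_value:
                problems.append(f"row {index}: {column} appears unencrypted")
    return problems
```

An empty list means that TxnID and OrderDate came through in the clear and every other column differs from its source value.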
That’s it! As promised, there’s hardly any change to the standard DMS workflow. After we set up BaffleShield to proxy for the original target RDS instance and specify it as the destination endpoint for the DMS task, everything else just works as before. The only difference is that all of the sensitive data is encrypted in the RDS instance.
Baffle can help you secure your data as part of your cloud migration plans. Baffle can also help improve your security posture for your on-premises systems. To learn more about both these features, contact us at info@baffle.io.
About the Author
Min-Hank Ho leads engineering at Baffle, Inc. Prior to joining Baffle, he was responsible for the development of all cryptography-based features in the Oracle database including Transparent Data Encryption and has been awarded seven patents in the area of database security.