AWS DevOps Blog

Enhancing automated database continuous integration with AWS CodeBuild and Amazon RDS Database Snapshot

In major integration merges, it’s sometimes necessary to verify the changes with existing online data. To inspect the changes with a cloned database can give us confidence to deploy to the production database. This post demonstrates how to use AWS CodeBuild and Amazon RDS Database Snapshot to verify your code revisions in both the application layer and the underlying layer, ensuring that your existing data works seamlessly with your revised code.

Making code revisions using continuous integration requires running periodic verification to ensure that your new deliverable works functionally and reliably. It’s easy to focus attention solely on the surface level changes made to the application layer. However, it’s important to remember to inspect the changes made to the underlying data layer too.

From the application layer, users modify the data model for different reasons. Any data model definition change in the application layer maps to a schema change in the database. For those services backed with a relational database (RDBMS), a user might perform data definition language (DDL) operations directly toward a database schema or rely on an object-relational mapping (ORM) library to migrate the schema to fit the application revision. These schema changes (CREATE, DROP, ALTER, TRUNCATE, etc.) can be very critical, especially for those services serving real customers.

Performing proper verification and simulation for these changes mitigates the risk of bringing down services. After the changes are applied, fundamental operation testing (CRUD – CREATE, READ, UPDATE, DELETE) toward data models is mandatory; this leads to data control language (DCL) operations (INSERT, SELECT, UPDATE, DELETE, etc.). After all the necessary steps, a user can move on to the deployment stage.

About this page

  • Time to read:6 minutes
  • Time to complete:30 minutes
  • Cost to complete (estimated):Less than $1 for 1-GB database snapshot and restored instance
  • Learning level:Advanced (300)
  • Services used:AWS CodeBuild, IAM, RDS

Solution overview

This example uses a buildspec file in CodeBuild. Set up a build project that points to a source control repository containing that buildspec file. The CodeBuild runtime environment restores the database server from an RDS snapshot.We restore snapshot to an Amazon Aurora cluster as example through AWS Command Line Interface (AWS CLI). After the database is restored, the build process starts to run your integration process, which is in mock code in the buildspec definition. After the verification stage, CodeBuild drops the restored database.

 

Architecture diagram showing an overview of how we use CodeBuild to restore a database snapshot to verify and validate the new database schema change.

Prerequisites

The following components are required to implement this example:

Walkthrough

Follow these steps to execute the solution.

Prepare your build specification file

Before you begin, prepare your CodeBuild Build Specification file with following information:

  • db-cluster-identifier-prefix
  • db-snapshot-identifier
  • region-ID
  • account-ID
  • vpc-security-group-id

The db-cluster-identifier-prefix creates a temporary database followed by a timestamp. Make sure that this value does not overlap with any other databases. The db-snapshot-identifier points to the snapshot you are calling to run with your application. Region-ID and account-ID describe the account on which you are running. The vpc-security-group-id indicates the security group you use in the CodeBuild environment and temporary database.

YAML
Version: 0.2
phases:
  install:
    runtime-versions:
      python: 3.7
pre_build:
  commands:
    - pip3 install awscli --upgrade --user
    - export DATE=`date +%Y%m%d%H%M`
    - export DBIDENTIFIER=db-cluster-identifier-prefix-$DATE
    - echo $DBIDENTIFIER
    - aws rds restore-db-cluster-from-snapshot --snapshot-identifier arn:aws:rds:region-ID:account-ID:cluster-snapshot:db-snapshot-identifier –vpc-security-group-ids vpc-security-group-id --db-cluster-identifier $DBIDENTIFIER --engine aurora
    - while [ $(aws rds describe-db-cluster-endpoints --db-cluster-identifier $DBNAME | grep -c available) -eq 0 ]; do echo "sleep 60s"; sleep 60; done
    - echo "Temp db ready"
    - export ENDPOINT=$(aws rds describe-db-cluster-endpoints --db-cluster-identifier $DBIDENTIFIER| grep "\"Endpoint\"" | grep -v "\-ro\-" | awk -F '\"' '{print $4}')
    - echo $ENDPOINT
build:
  commands:
    - echo Build started on `date`
    - echo proceed db connection to $ENDPOINT
    - echo proceed db migrate update, DDL proceed here
    - echo proceed application test, CRUD test run here
post_build:
  commands:
    - echo Build completed on `date`
    - echo $DBNAME
    - aws rds delete-db-cluster --db-cluster-identifier $DBIDENTIFIER --skip-final-snapshot &

 

After you finish editing the file, name it buildspec.yml. Save it in the root directory with which you plan to build, then commit the file into your code repository.

  1. Open the CodeBuild console.
  2. Choose Create build project.
  3. In Project Configuration, enter the name and description for the build project.
  4. In Source, select the source provider for your code repository.
  5. In Environment image, choose Managed image, Ubuntu, and the latest runtime version.
  6. Choose the appropriate service role for your project.
  7. In the Additional configuration menu, select the VPC with your Amazon RDS database snapshots, as shown in the following screenshot, and then select Validate VPC Settings. For more information, see Use CodeBuild with Amazon Virtual Private Cloud.
  8. In Security Groups, select the security group needed for the CodeBuild environment to access your temporary database.
  9. In Build Specifications, select Use a buildspec file.

CodeBuild Project Additional Configuration - VPC

Grant permission for the build project

Follow these steps to grant permission.

  1. Navigate to the AWS Management Console Policies.
  2. Choose Create a policy and select the JSON tab.To give CodeBuild access to the Amazon RDS resource in the pre_build stage, you must grant RestoreDBClusterFromSnapshot and DeleteDBCluster. Follow the least privilege guideline and limit the DeleteDBCluster action point to “arn:aws:rds:*:*:cluster: db-cluster-identifier-*”.
  3. Copy the following code and paste it into your policy:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "VisualEditor0",
          "Effect": "Allow",
          "Action": "rds:RestoreDBClusterFromSnapshot",
          "Resource": "*"
        },
        {
          "Sid": "VisualEditor1",
          "Effect": "Allow",
          "Action": "rds:DeleteDBCluster*",
          "Resource": "arn:aws:rds:*:*:cluster:db-cluster-identifier-*"
        }
      ]
    }
  4. Choose Review Policy.
  5. Enter a Name and Description for this policy, then choose Create Policy.
  6. After the policy is ready, attach it to your CodeBuild service role, as shown in the following screenshot.

Attach created policy to IAM role

Use database snapshot restore to launch the build process

  1. Navigate back to CodeBuild and locate the project you just created.
  2. Give an appropriate timeout setting and make sure that you set it to the correct branch for your repository.
  3. Choose Start Build.
  4. Open the Build Log to view the database cluster from your snapshot in the pre_build stage, as shown in the following screenshot.CodeBuild ProjectBuild Log - pre_build stage
  5. In the build stage, use $ENDPOINT to point your application to this temporary database, as shown in the following screenshot.CodeBuild Project Build Log - build stage
  6. In the post_build, delete the cluster, as shown in the following screenshot.CodeBuild Project Build log - post build stage

Test your database schema change

After you set up this pipeline, you can begin to test your database schema change within your application code. This example defines several steps in the Build Specifications file to migrate the schema and run with the latest application code. In this example, you can verify that all the modifications fit from the application to the database.

YAML
build:
  commands:
    - echo Build started on `date`
    - echo proceed db connection to $ENDPOINT
    # run a script to apply your latest schema change
    - echo proceed db migrate update
    # start the latest code, and run your own testing
    - echo proceed application test

After validation

After we validated the database schema change in the above steps, a suitable strategy for deployment to production should be utilized that would align with the criteria to satisfy the business goals.

Cleaning up

To avoid incurring future charges, delete the resources as following steps:

  1. Open the CodeBuild console
  2. Click the project you created for this test.
  3. Click the delete build project and input delete to confirm deletion.

Conclusion

In this post, you created a mechanism to set up a temporary database and limit access into the build runtime. The temporary database stands alone and isolated. This mechanism can be applied to secure the permission control for the database snapshot, or not to break any existing environment. The database engine applies to all available RDS options, including Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and SQL Server. This provides options, without impacting any existing environments, for critical events triggered by major changes in the production database schema, or data format changes required by business decisions.

Author:

Jun-Tin Yeh (Bob)

Jun-Tin Yeh (Bob), a Geo Solutions Architect from Well-Architected Team, APAC, AWS. He enjoys working with customer to evolve their architecture in fast pace, also dive deep into any Well-Architected related subject for better user cloud experiences.