AWS Database Blog

Build purpose-built database AMIs using Amazon EC2 Image Builder

Managing virtual machine images that you standardize through configuration, consistent security patching, and hardening (also called “golden images”) is a time-consuming task. System administrators and database administrators responsible for these tasks have to define the characteristics of these images (such as which software to pre-install, which versions to use, and which security configurations to apply). Once the images are created, the ongoing challenge is maintaining these configuration settings efficiently.

Golden images are used to create various types of servers, including application servers, web servers, and database servers. Another common server type is the bastion host, also called a “jump box” or “bridge host”. This type of server is used by administrators to connect to and manage resources that, for security reasons, run in the private layers of their networks. A typical example of a bastion host is the database client, used by database administrators (DBAs) to manage their databases.

In recent years, many AWS customers have moved from a “one-size-fits-all” approach, where they would run their applications on a monolithic database, to a new approach in which highly distributed applications use a multitude of purpose-built databases. As a result, DBAs need database clients with all the necessary software to manage these different types of SQL and NoSQL databases.

In this post, we show you how to create and maintain a purpose-built database Amazon Machine Image (AMI) using Amazon EC2 Image Builder, and how to use this AMI to launch database clients for your purpose-built database architectures.

Purpose-built databases

Thirty years ago, database options were limited. The relational database rose to the top and was widely used in nearly all applications. This made it easier to choose the database in your application architecture, but it limited the types of applications you could build. A relational database is like a Swiss army knife: it can do many things, but isn’t perfectly suited to every particular task.

In the last three decades, we’ve seen a rapid shift in the database landscape. The introduction of internet-enabled applications has changed the demands we place on our databases. As a result, developers now reach for databases that are faster and more globally distributed than ever before. Additionally, the rise of cloud computing has changed what’s possible technically, because we can build more resilient, scalable applications in an economical way.

These changes have led to the rise of the purpose-built database. Developers don’t need to reach for the default relational database anymore. They can carefully consider the needs of their application and choose a database tailored to those needs. For mobile applications with heterogeneous data records, you can choose a document database that provides great scalability and performance. For tightly related data, you can use a graph database to explore hidden connections between records. For high-speed applications, you can reach for an in-memory cache database to get sub-millisecond response times.

To read more about what a purpose-built architecture looks like, see Build a Modern Application with Purpose-Built AWS Databases.

AMIs and Image Builder

An AMI (or simply an image) provides the information required to launch a completely configured Amazon Elastic Compute Cloud (Amazon EC2) instance, including the operating system, storage, launch permissions, and any specified software. You must specify an AMI when you launch an instance.

You can use pre-built images managed by AWS, images available in the AWS Marketplace, community AMIs made available by AWS community members, or your own private, customized AMIs.

Image Builder simplifies the creation, maintenance, validation, sharing, and deployment of Linux or Windows images and container images for use on AWS and on-premises.

Keeping images up to date can be time consuming, resource intensive, and error-prone. You either manually update and snapshot your virtual machines or have teams that build automation scripts to maintain images.

Image Builder significantly reduces the effort of keeping images up to date and secure by providing a simple graphical interface, built-in automation, and AWS-provided security settings. Image Builder automates the process of updating images without the need to build your own automation pipeline.

The following is a graphical representation that outlines the process.

The process includes the following steps:

  1. You run an Image Builder pipeline that provides an automation framework for the image creation process. The pipeline is associated with a recipe that includes the source image (CentOS was selected for this project), build components, and test components. The pipeline also includes an infrastructure configuration and distribution settings (for more details about the resources involved in the process, see Manage EC2 Image Builder resources).
  2. Build components configure the source image with several custom scripts that were defined for this project. Most of them install and configure database tools that DBAs can use to connect to and manage their purpose-built databases (more about these components in the next section).
  3. Image Builder, based on the OS of the source image, performs actions to secure the output image (such as making sure security patches are applied, enforcing strong passwords, or turning on full disk encryption).
  4. Optionally, test components are run against the output image.
  5. The output image, once configured and tested, is ready to be distributed based on the distribution settings specified.

You can use Image Builder via the AWS Management Console, AWS Command Line Interface (AWS CLI), or APIs. It’s provided at no cost to customers and is available in all commercial AWS Regions. You’re charged only for the underlying AWS resources that you use to create, store, and share the images.
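For example, a minimal AWS CLI sketch that lists your pipelines and inspects one of them (the ARN is a placeholder you replace with your own):

    # List the Image Builder pipelines in the current Region
    aws imagebuilder list-image-pipelines

    # Inspect a specific pipeline, including its recipe and settings
    aws imagebuilder get-image-pipeline \
        --image-pipeline-arn arn:aws:imagebuilder:us-west-2:123456789012:image-pipeline/purposebuiltdbami-pipeline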

Image Builder components

The Image Builder pipeline defined for this post is associated with a recipe composed of custom components. We use them to download, install, and configure the software needed to connect to and manage our purpose-built databases.

The following table lists the custom components defined for this project.

Component name | Description | To use for
folder-and-system-packages | Creation of some folders and installation of system tools/packages | The database software files
aws-cli-2 | Installation of the AWS CLI version 2 | Calling AWS APIs
aws-systems-manager-agent | Installation of AWS Systems Manager Agent | Connecting to the DB client using AWS Systems Manager
mysql-mariadb-client | Installation of the MySQL client | Amazon RDS for MySQL, Amazon RDS for MariaDB, Amazon Aurora MySQL
postgresql-client | Installation of the PostgreSQL client | Amazon RDS for PostgreSQL, Amazon Aurora PostgreSQL
SCT | Installation of the AWS Schema Conversion Tool | Database schema conversions
neptune-apache-tinkerpop-console | Installation of the Apache TinkerPop console (including the Gremlin console) | Amazon Neptune, self-managed Apache TinkerPop installations
mongodb-tools | Installation of the MongoDB shell and tools | Amazon DocumentDB (with MongoDB compatibility), self-managed MongoDB installations
dynamodblocal-nosqlworkbench | Installation of Amazon DynamoDB Local and NoSQL Workbench | Amazon DynamoDB, Amazon Keyspaces (for Apache Cassandra), self-managed Apache Cassandra installations
benchmark-tools | Installation of the Sysbench and Pgbench benchmark tools | Amazon RDS for MySQL, Amazon RDS for MariaDB, Amazon RDS for PostgreSQL, Amazon Aurora MySQL, Amazon Aurora PostgreSQL
qldb-client | Installation of the Amazon QLDB shell (qldbshell) | Amazon QLDB
keyspaces-apache-cassandra-client | Installation of Apache Cassandra (including cqlsh) | Amazon Keyspaces (for Apache Cassandra), self-managed Apache Cassandra installations

You can customize this list of components, for example by adding new software to connect to databases not listed in the table, or by updating the existing components when new versions are released.

Prerequisites

Before getting started, you must complete the following prerequisites (an AWS CLI sketch of these steps follows the list):

  1. Create an Amazon Simple Storage Service (Amazon S3) bucket, which you use to log the build process and store some files used within the EC2 instance.
  2. Create a directory named “stage” within the bucket.
  3. Download the following ZIP file, unzip it, and upload all the files to the “stage” directory. The ZIP contains some files used within the EC2 instance, including the splash screen users see after logging in.
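If you prefer to script these prerequisites, the following AWS CLI sketch shows equivalent commands (the bucket name, Region, and local path are placeholders):

    # Create the S3 bucket
    aws s3 mb s3://my-db-ami-bucket --region us-west-2

    # Upload the unzipped files under the "stage" prefix
    # (S3 has no real directories; uploading under the prefix creates it)
    aws s3 cp ./unzipped-files/ s3://my-db-ami-bucket/stage/ --recursive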

Provision resources with AWS CloudFormation

We have defined an AWS CloudFormation template that allows you to quickly provision the Image Builder infrastructure you need, including a pipeline that you run to create your new image.

The template creates the following resources:

  • An Image Builder infrastructure configuration
  • Image Builder distribution settings
  • 13 Image Builder custom components
  • An Image Builder recipe
  • An Image Builder image pipeline

Complete the following steps:

  1. Launch the CloudFormation stack with the following link (it points to us-west-2, but you can provision the solution in all commercial AWS Regions):
  2. Specify the values for the following input parameters:
     • paramIamRole: The AWS Identity and Access Management (IAM) role that the EC2 instance assumes when the pipeline runs.
     • paramKeyPair: The EC2 key pair name that the EC2 instance uses when the pipeline runs.
     • paramS3bucket: The S3 bucket used to log the entire process and stage some configuration files.
     • paramSNSTopic: The ARN of the Amazon Simple Notification Service (Amazon SNS) topic used to send notifications at the end of each run of the pipeline.
     • paramSecGrpId: The ID of the security group that the EC2 instance uses when the pipeline runs.
     • paramSubId: The ID of the subnet that the EC2 instance uses when the pipeline runs. The security group and subnet must be associated with the same VPC.
  3. Leave all the other values at their defaults and choose Create stack.

The CloudFormation stack creation should take less than a minute.
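If you prefer the AWS CLI over the console link, a sketch of the same step follows (the stack name, template URL, and all parameter values are placeholders to adapt to your environment; add --capabilities only if the template requires it):

    aws cloudformation create-stack \
        --stack-name PurposeBuiltDbAMI \
        --template-url https://my-templates-bucket.s3.amazonaws.com/purpose-built-db-ami.yaml \
        --parameters \
            ParameterKey=paramIamRole,ParameterValue=my-imagebuilder-role \
            ParameterKey=paramKeyPair,ParameterValue=my-key-pair \
            ParameterKey=paramS3bucket,ParameterValue=my-db-ami-bucket \
            ParameterKey=paramSNSTopic,ParameterValue=arn:aws:sns:us-west-2:123456789012:my-topic \
            ParameterKey=paramSecGrpId,ParameterValue=sg-0123456789abcdef0 \
            ParameterKey=paramSubId,ParameterValue=subnet-0123456789abcdef0

    # Check the stack status
    aws cloudformation describe-stacks --stack-name PurposeBuiltDbAMI \
        --query 'Stacks[0].StackStatus'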

Create the purpose-built database image

The Image Builder pipeline is now ready to run.

  1. On the Image Builder console, choose “Image pipelines”.
  2. Select the pipeline you just provisioned, called “PurposeBuiltDbAMI-pipeline”.
  3. On the Actions menu, choose Run pipeline.
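Alternatively, you can run the pipeline from the AWS CLI (a sketch; the account ID in the ARN is a placeholder):

    # Look up the pipeline ARN by name
    aws imagebuilder list-image-pipelines \
        --query 'imagePipelineList[?name==`PurposeBuiltDbAMI-pipeline`].arn'

    # Start a run of the pipeline
    aws imagebuilder start-image-pipeline-execution \
        --image-pipeline-arn arn:aws:imagebuilder:us-west-2:123456789012:image-pipeline/purposebuiltdbami-pipeline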

You can monitor the image creation via the Image Builder console or by looking at the logs stored in the S3 bucket you created. The process takes several minutes and involves the following high-level steps:

  • Create a new private image
  • Launch a new test instance from the new image
  • Validate the test instance
  • Stop the test instance

If the process completes successfully (it should take between 25 and 30 minutes), the result is a new purpose-built database image that you can use to launch EC2 instances when needed.
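To follow the run from the command line, you can query the images produced by the pipeline and their state (a sketch; replace the ARN with your pipeline’s):

    aws imagebuilder list-image-pipeline-images \
        --image-pipeline-arn arn:aws:imagebuilder:us-west-2:123456789012:image-pipeline/purposebuiltdbami-pipeline \
        --query 'imageSummaryList[].{name:name,status:state.status}'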

Create your purpose-built database client

You can choose the characteristics of the EC2 instance that you create as a database client based on your needs, in particular on how critical the environment that hosts the instance is. The following are some suggestions if you use your database client within a critical production environment:

  • Start with a burstable performance instance type, because a database client generally doesn’t require a high level of sustained CPU performance (its workload is mostly idle). For the same reason, start with a smaller size and evaluate later whether you need something more powerful.
  • Deploy it in an Amazon EC2 Auto Scaling “steady” group (with minimum, desired, and maximum capacity set to 1) spanning multiple Availability Zones (see the sketch after this list).
  • For cost optimization, you can also consider launching the instance as a Spot Instance.
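As an illustration of the second suggestion, the following sketch creates a steady Auto Scaling group; it assumes you have already created a launch template (here called db-client-lt, a placeholder) that references the new AMI:

    # A "steady" group: min = max = desired = 1, spanning two Availability Zones
    aws autoscaling create-auto-scaling-group \
        --auto-scaling-group-name db-client-steady \
        --launch-template LaunchTemplateName=db-client-lt \
        --min-size 1 --max-size 1 --desired-capacity 1 \
        --vpc-zone-identifier "subnet-0123456789abcdef0,subnet-0fedcba9876543210"

If the instance or its Availability Zone fails, the group automatically replaces the database client in one of the remaining zones.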

To create the database client, you need to launch a new EC2 instance by selecting the AMI just created by Image Builder:

  1. On the Amazon EC2 console, choose Images in the navigation pane.
  2. Filter the list of AMIs by the string “PurposeBuiltDbAMI-FromEC2ImgBuilder”.
  3. Select the AMI and choose Launch.
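The AWS CLI equivalent looks like the following (a sketch; the instance type, key pair, and network IDs are placeholders to adapt to your environment):

    # Find the AMI created by Image Builder
    aws ec2 describe-images --owners self \
        --filters "Name=name,Values=PurposeBuiltDbAMI-FromEC2ImgBuilder*" \
        --query 'Images[].{id:ImageId,name:Name}'

    # Launch the database client from it
    aws ec2 run-instances \
        --image-id ami-0123456789abcdef0 \
        --instance-type t3.small \
        --key-name my-key-pair \
        --subnet-id subnet-0123456789abcdef0 \
        --security-group-ids sg-0123456789abcdef0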

Access your database client

After you create the database client, you can access it over SSH using your EC2 key pair (the OS user is centos).
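For example (the key file and the instance’s public DNS name are placeholders):

    ssh -i /path/to/my-key-pair.pem centos@ec2-203-0-113-25.us-west-2.compute.amazonaws.com

The following is the splash screen you get after you log in.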

As the splash screen specifies, you can look at the ./docs/software_installed.txt file for the list of installed software, and at the ./logs directory for the related installation logs.

In the ./binaries directory, you can find some of the binaries of the software installed in the database client.

User interfaces

The database client has Xorg pre-installed, one of the most popular display servers among Linux users. If you run an X server on your local machine, such as XQuartz on macOS, you can connect with X forwarding and manage your databases with the software that requires a user interface, such as NoSQL Workbench and the AWS Schema Conversion Tool (AWS SCT).
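For example, with an X server running locally, you can enable X11 forwarding when you connect (a sketch; the host and key are placeholders, and on some setups you may need -Y instead of -X):

    ssh -X -i /path/to/my-key-pair.pem centos@ec2-203-0-113-25.us-west-2.compute.amazonaws.com

GUI applications started in that session then display on your local machine.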

Software ready-to-go

The following software doesn’t need any additional configuration; you can start using it simply by invoking it (see the examples after this list):

  • AWS CLI v2 (aws)
  • MariaDB/MySQL command line client (mysql)
  • PostgreSQL command line client (psql)
  • MongoDB Shell (mongo)
  • MongoDB tools (mongoexport, mongoimport, mongodump, mongorestore, mongostat, mongoperf, mongotop)
  • Sysbench (sysbench)
  • Pgbench (pgbench)
  • QLDB shell (qldbshell)
  • DynamoDB Local (./binaries/ddblocal/DynamoDBLocal.jar)
  • NoSQL Workbench
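For example, you can connect to your databases by passing the endpoint on the command line (all endpoints, users, and database names below are placeholders):

    # MySQL-compatible databases
    mysql -h mydb.cluster-xxxxxxxx.us-west-2.rds.amazonaws.com -u admin -p

    # PostgreSQL-compatible databases
    psql -h mydb.xxxxxxxx.us-west-2.rds.amazonaws.com -U postgres -d mydb

    # Amazon DocumentDB (TLS options depend on your cluster configuration)
    mongo --host mydocdb.cluster-xxxxxxxx.us-west-2.docdb.amazonaws.com:27017

    # DynamoDB Local, listening on the default port 8000
    java -jar ./binaries/ddblocal/DynamoDBLocal.jar -sharedDb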

You need to configure the Apache Cassandra client (cqlsh) and the Apache TinkerPop console to point to the correct databases. The AWS SCT doesn’t need additional configuration to run, but it needs database drivers to communicate with the source and target databases.
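As a sketch of that configuration (endpoints and credentials are placeholders): for Amazon Keyspaces, you first set up the service’s TLS certificate in your cqlshrc file as described in the Amazon Keyspaces documentation; for Neptune, the TinkerPop console connects through a remote configuration file that points at your cluster endpoint.

    # Amazon Keyspaces (requires the TLS certificate configured in cqlshrc)
    cqlsh cassandra.us-west-2.amazonaws.com 9142 -u my-user -p my-password --ssl

    # Apache TinkerPop console against Neptune: edit the remote YAML file
    # with your cluster endpoint, then connect from within the console
    gremlin> :remote connect tinkerpop.server conf/neptune-remote.yaml
    gremlin> :remote console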

AWS CLI v2 and NoSQL Workbench both allow you to run queries against your Amazon DynamoDB tables and indexes using PartiQL.
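For example, you can run a PartiQL statement against a table from the AWS CLI (the table name and values are placeholders):

    aws dynamodb execute-statement \
        --statement "SELECT * FROM \"Music\" WHERE Artist = 'Acme Band'"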

AWS Managed Services

Automating the entire maintenance process of the source images of your servers is an important task that helps you save time and increase standardization and security. This kind of best practice is just one of the many that AWS applies internally when managing its infrastructure. If you’re interested in increasing your operational flexibility and enhancing your security, compliance, and cost optimization, check out AWS Managed Services (AMS). AMS helps you operate your AWS infrastructure more efficiently and securely, using AWS services and a growing library of automations, configurations, and runbooks. It can also augment and optimize your operational capabilities in both new and existing AWS environments.

Conclusion

In this post, we showed how to use Image Builder to create a golden image for use as a database client, which DBAs can launch to connect to and manage purpose-built databases with all the software pre-installed and ready to use. You can also customize Image Builder components to adapt them to your needs.

We’d love to hear what you think!

If you have questions or suggestions, please leave a comment.


About the Authors

Paola Lorusso is a Specialist Database Solutions Architect based in Milan, Italy. She works with companies of all sizes to support their innovation initiatives in the database area. In her role she helps customers discover database services and design solutions on AWS, based on data access patterns and business requirements. She brings her technical experience directly to customers, supporting migration strategies and developing new solutions with relational and NoSQL databases.

Marco Tamassia is a technical instructor based in Milan, Italy. He delivers a wide range of technical trainings to AWS customers across EMEA. He also collaborates in the creation of new courses such as “Planning & Designing Databases on AWS” and “AWS Certified Database – Specialty”. Marco has a deep background as a Database Administrator (DBA) for companies of all sizes (including AWS), which allows him to bring his database knowledge into the classroom with real-world examples for his students.