Amazon Web Services (AWS) offers a growing number of purpose-built database options (currently more than 15) to support diverse data models. These include relational, key-value, document, in-memory, graph, time series, wide column, and ledger databases.
Choosing the right database, or multiple databases, requires you to make a series of decisions based on your organizational needs. This decision guide helps you ask the right questions, provides a clear path for implementation, and helps you migrate from your existing database.
This six-and-a-half-minute video from AWS developer advocate Ricardo Ferreira explains the basics of choosing an AWS database and provides a strong introduction to the concepts, criteria, and choices covered in the rest of this decision guide.
Databases are important backend systems used to store data for any type of app, whether it’s a small mobile app or an enterprise app with internet-scale and real-time requirements.
This decision guide is designed to help you understand the range of choices available, establish the criteria that make sense for your database choice, and provide detailed information on the unique properties of each database so you can dive deeper into the capabilities each one offers.
What kinds of apps do people build using databases?
- Internet-scale apps: Globally distributed and internet-scale apps that handle millions of requests per second over hundreds of terabytes of data. These databases automatically scale up and down to accommodate your spiky workloads.
- Real-time apps: Real-time apps such as caching, session stores, gaming leaderboards, ride-hailing, ad-targeting, and real-time analytics need microsecond latency and high throughput to support millions of requests per second.
- Open-source apps: Some customers prefer open-source databases for their low cost, community-backed development and support, and large ecosystems of tools and extensions.
- Enterprise apps: Enterprise apps manage core business processes, such as sales, billing, customer service, human resources, and line-of-business processes, such as a reservation system at a hotel chain or a risk-management system at an insurance company. These apps need databases that are fast, scalable, secure, available, and reliable.
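To make the real-time pattern above concrete, consider a gaming leaderboard: scores are kept sorted so top-N and rank lookups are cheap. In production this is typically backed by a low-latency store such as ElastiCache for Redis (sorted sets); the sketch below shows only the idea in plain Python and is an illustration, not a production implementation.

```python
import bisect

class Leaderboard:
    """Toy leaderboard: keeps (score, player) pairs sorted so that
    top-N and rank queries are cheap. A real-time app would back this
    with a low-latency store such as a Redis sorted set."""

    def __init__(self):
        self._entries = []   # sorted ascending by (score, player)
        self._scores = {}    # player -> current score

    def submit(self, player, score):
        # Remove the player's previous entry, if any, then re-insert sorted.
        old = self._scores.get(player)
        if old is not None:
            self._entries.remove((old, player))
        bisect.insort(self._entries, (score, player))
        self._scores[player] = score

    def top(self, n):
        # Highest scores first.
        return [(p, s) for s, p in reversed(self._entries[-n:])]

    def rank(self, player):
        # 1 = best. None if the player has never submitted a score.
        if player not in self._scores:
            return None
        idx = self._entries.index((self._scores[player], player))
        return len(self._entries) - idx

lb = Leaderboard()
lb.submit("ana", 1200)
lb.submit("bo", 900)
lb.submit("cy", 1500)
print(lb.top(2))      # [('cy', 1500), ('ana', 1200)]
print(lb.rank("bo"))  # 3
```

A Redis sorted set gives you the same operations (ZADD, ZREVRANGE, ZRANK) with microsecond latency, shared across every app server.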
Note: This guide focuses on databases suitable for Online Transaction Processing (OLTP) applications. If you primarily need to store and analyze massive amounts of data quickly and efficiently (typically met by an online analytical processing (OLAP) application), AWS offers Amazon Redshift, a fully-managed, cloud-based data warehousing service that is designed to handle large-scale analytics workloads.
There are two high-level categories of AWS OLTP databases: relational and non-relational.
- The AWS relational database family includes seven popular engines for Amazon RDS and Amazon Aurora — Amazon Aurora with MySQL compatibility, Amazon Aurora with PostgreSQL compatibility, MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server — and an option to deploy on-premises with Amazon RDS on AWS Outposts.
- The non-relational database options are designed for those who have a specific need for key-value, document, caching, in-memory, graph, time series, wide column, and ledger databases.
We'll explore all of these in detail in the Choose section of this guide.
Before deciding which database service you want to use to work with your data, you may want to spend a little time thinking about how you're going to migrate your existing database(s).
The best database migration strategy helps you take full advantage of the AWS Cloud. This involves migrating your applications to use purpose-built, cloud-centered databases. It also doesn't tie you to the same database that you've been using on premises. Consider modernizing your applications and choose the databases that best suit your applications’ workflow requirements.
The following resources can help you with your migration strategy:
- Getting started with AWS Database Migration Service
- A high-level overview of AWS Database Migration Service
- Using the AWS Schema Conversion Tool
- Selecting the right database and database migration plan for your workloads
In addition to having a migration strategy at the front end of your planning, you want ways to gain insight from your data. For this, you can use Amazon Redshift, a fast, fully managed, petabyte-scale data warehouse service that lets you efficiently analyze all your data using your existing business intelligence tools. It's optimized for datasets that range from a few hundred gigabytes to a petabyte or more.
You’re considering hosting a database on AWS. This might be to support a greenfield/pilot project as a first step in your cloud migration journey, or you might want to migrate an existing workload with as little disruption as possible. Or perhaps you would like to port your workload to managed AWS services or even refactor it to fully cloud-native.
Whatever your goal, considering the right questions will make your database decision easier. Here’s a summary of the key criteria to consider.
The first major consideration when choosing your database is your business objective. What is the strategic direction driving your organization to change? As suggested in the 7 Rs of AWS, consider whether you want to re-architect or refactor an existing workload, move to a new platform to shed commercial license commitments, rehost your existing databases and data in the cloud without making any changes, or make the move now to a managed database strategy to take advantage of cloud capabilities.
You can choose a rehosting strategy to deploy to the cloud faster, with fewer data migration headaches. Install your database engine software on one or more EC2 instances, migrate your data, and manage this database instance much as you do on-premises. Alternatively, you can choose a replatform strategy where you migrate your on-premises relational database to a fully-managed Amazon RDS instance.
Finally, you may consider this an opportunity to refactor your workload to be cloud-native, making use of purpose-built NoSQL databases such as Amazon DynamoDB and Amazon DocumentDB (with MongoDB compatibility). And if you want to move to a serverless footprint to eliminate the burden of infrastructure management and capacity planning, AWS offers serverless options for many of its databases, such as Amazon Aurora Serverless and Amazon Neptune Serverless (a graph database).
Do you need a database built for a specific purpose? As you might have read, the days of the one-size-fits-all monolithic database are behind us. It's now much more common to choose a purpose-built database that is optimized for a particular task or use case.
AWS offers a broad and deep portfolio of purpose-built databases that support diverse data models. With these databases, you can build data-driven, highly scalable, distributed applications. Selecting the right purpose-built database—optimized for what you need to do—will speed development and deployment.
The core of any database choice includes the characteristics of the data that you need to store, retrieve, analyze, and work with. This includes your data model (is it relational, structured or semi-structured, using a highly connected dataset, or time-series?), data access (how do you need to access your data?), the extent to which you need real-time data, and whether there is a particular data record size you have in mind.
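To make the data-model question concrete, the sketch below (all names are illustrative) shows the same customer order represented three ways: as relational rows, as a key-value item, and as a nested document. The shape your application naturally produces and queries is a strong hint about which database family fits.

```python
import json

# Relational: rigid columns, relationships expressed via foreign keys.
order_row = ("o-1001", "c-42", "2024-06-01", 59.90)             # orders table
order_items = [("o-1001", "sku-7", 2), ("o-1001", "sku-9", 1)]  # order_items table

# Key-value: one opaque value addressed by a key; fast point lookups.
kv_store = {"ORDER#o-1001": json.dumps({"customer": "c-42", "total": 59.90})}

# Document: the whole aggregate nested in one semi-structured record.
order_doc = {
    "_id": "o-1001",
    "customer": "c-42",
    "placed": "2024-06-01",
    "items": [
        {"sku": "sku-7", "qty": 2},
        {"sku": "sku-9", "qty": 1},
    ],
    "total": 59.90,
}

# A document query can reach into the nested structure directly,
# where the relational form would need a join across two tables.
skus = [item["sku"] for item in order_doc["items"]]
print(skus)  # ['sku-7', 'sku-9']
```

If your queries mostly join normalized tables, a relational engine fits; if they mostly fetch or update one aggregate by key, a key-value or document database is likely a better match.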
Your primary operational considerations are all about where your data is going to live and how it will be managed. The two key choices you need to make are:
- Whether it will be self-hosted or fully managed: The core question here is where is your team going to provide the most value to the business? If the database is self-hosted, your team is responsible both for the real differentiated value a database can deliver (through schema design, query construction, and query optimization) and for the day-to-day maintenance, monitoring, and patching of the database. Choosing a fully managed AWS database simplifies that work and lets your team focus on where it's likely to deliver unique value.
- Whether you need a serverless or provisioned database: Amazon Aurora provides a model for how to think about this choice. Amazon Aurora Serverless v2 is suitable for demanding, highly variable workloads. For example, your database usage might be heavy for a short period of time, followed by long periods of light activity or no activity at all. Some examples are retail, gaming, or sports websites with periodic promotional events, and databases that produce reports when needed. Aurora provisioned clusters are suitable for steady workloads. With provisioned clusters, you choose an Aurora instance class that has a predefined amount of memory, CPU power, and I/O bandwidth.
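To illustrate the serverless side of this choice: Aurora Serverless v2 capacity is expressed in Aurora capacity units (ACUs), with a minimum and maximum you set per cluster. The sketch below only builds the kind of parameter structure you might pass to the RDS CreateDBCluster API; the identifier and capacity values are illustrative assumptions, and in practice you would send the request with an AWS SDK such as boto3.

```python
# Illustrative parameters for an Aurora Serverless v2 cluster.
# With an SDK such as boto3 you would pass a dict like this to
# rds.create_db_cluster(...); here we only build and inspect it.
create_db_cluster_params = {
    "DBClusterIdentifier": "demo-cluster",   # hypothetical name
    "Engine": "aurora-postgresql",
    "ServerlessV2ScalingConfiguration": {
        "MinCapacity": 0.5,   # Aurora capacity units (ACUs)
        "MaxCapacity": 8.0,   # cluster scales between these bounds
    },
}

scaling = create_db_cluster_params["ServerlessV2ScalingConfiguration"]
print(f"Scales between {scaling['MinCapacity']} and {scaling['MaxCapacity']} ACUs")
```

Because capacity scales between the bounds automatically, you pay for what the spiky workload actually uses rather than for peak-sized provisioned instances.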
Database reliability is key for any business. Achieving and maintaining the reliability and resiliency of your database means paying attention to a number of key factors. These factors include capabilities for backup and restore, replication, failover, and point-in-time recovery (PITR).
In addition, support for a globally distributed application/dataset might be important for you, along with Recovery Time Objective (RTO) / Recovery Point Objective (RPO) requirements.
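As a back-of-the-envelope way to reason about RPO: with periodic backups alone, the worst-case data loss is roughly the backup interval, whereas point-in-time recovery (backed by continuous log capture) shrinks it to near the log-capture granularity. A small illustrative calculation (the 5-second default is an assumption for the sketch, not any service's SLA):

```python
def worst_case_rpo_minutes(backup_interval_minutes, pitr_enabled,
                           log_capture_seconds=5):
    """Rough worst-case data loss (RPO) for a periodic-backup scheme.

    With only periodic snapshots, a failure just before the next backup
    loses up to one full interval. With point-in-time recovery (PITR),
    loss is bounded by the continuous log-capture granularity instead.
    The 5-second default is an illustrative assumption, not a service SLA.
    """
    if pitr_enabled:
        return log_capture_seconds / 60
    return backup_interval_minutes

# Nightly snapshots only: up to a day of data at risk.
print(worst_case_rpo_minutes(24 * 60, pitr_enabled=False))  # 1440
# Same schedule with PITR: seconds of exposure, not hours.
print(worst_case_rpo_minutes(24 * 60, pitr_enabled=True))
```

Working through numbers like these for your own workload makes it much easier to compare a database's backup, replication, and PITR capabilities against your stated RTO/RPO targets.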
Consider whether your workload throughput might exceed the capacity of a single compute node. Then consider your potential need for the database to support a high concurrency of transactions (10,000 or more) and whether it needs to be deployed in multiple geographic regions.
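When throughput does exceed a single node, horizontally scalable databases spread items across partitions by hashing a partition key; this is, for example, the general idea behind how DynamoDB distributes data. The sketch below shows the routing concept in plain Python; the partition count and hash choice are illustrative, not any service's internals.

```python
import hashlib

NUM_PARTITIONS = 8  # illustrative; real services manage this for you

def partition_for(key: str) -> int:
    """Route an item to a partition by hashing its partition key.
    A well-distributed, high-cardinality key spreads load evenly."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

# Items with the same partition key always land on the same partition,
# which is why partition-key design matters so much at scale.
assert partition_for("user#42") == partition_for("user#42")

counts = [0] * NUM_PARTITIONS
for i in range(10_000):
    counts[partition_for(f"user#{i}")] += 1
print(counts)  # roughly uniform across the 8 partitions
```

A skewed key (say, one hot customer ID) would concentrate traffic on a single partition, which is the usual cause of throttling in otherwise well-provisioned clusters.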
Security is a shared responsibility between AWS and you. The AWS shared responsibility model describes this as security of the cloud and security in the cloud. Specific security considerations include data protection at every layer, authentication, compliance, storage of sensitive data, and support for auditing requirements.
Now that you know the criteria by which you will be evaluating your database options, you are ready to choose which AWS database is right for your organizational needs.
This table highlights which databases are optimized for which circumstances and type of data. Use it to help determine the database that is the best fit for your use case.
Now that you have learned about the shape of your data, how it fits in your environment and supports your use case, and what each database service is optimized for, you should be able to select the AWS database service(s) best suited to your organizational needs.
To help you learn more about each of the available AWS database services, we have provided a pathway to explore how each of them works. The following section provides links to in-depth documentation, hands-on tutorials, and resources to get you started.
Create a high-availability database
Learn how to configure an Amazon Aurora cluster to create a high-availability database. This database consists of compute nodes that are replicated across multiple Availability Zones to provide increased read scalability and failover protection.
Use Amazon Aurora global databases
We help you get started using Aurora global databases. This guide outlines the supported engines and AWS Region availability for Aurora global databases with Aurora MySQL and Aurora PostgreSQL.
Migrate from Amazon RDS for MySQL to Amazon Aurora MySQL
We show you how to migrate any application's database from Amazon RDS for MySQL to Amazon Aurora MySQL with minimal downtime. This tutorial is not within the free tier and will cost you less than $1.
Create a serverless message processing application
We show you how to create a serverless message processing application with Amazon Aurora Serverless (PostgreSQL-compatible edition), Data API for Aurora Serverless, AWS Lambda, and Amazon SNS.
Getting started with Amazon DocumentDB
We help you get started using Amazon DocumentDB in just seven steps. This guide uses AWS Cloud9 to connect and query your cluster using the MongoDB shell directly from the AWS Management Console.
Setting up a document database with Amazon DocumentDB
This tutorial helps you get started connecting to your Amazon DocumentDB cluster from your AWS Cloud9 environment with a MongoDB shell and run a few queries.
Best practices for working with Amazon DocumentDB
Learn best practices for working with Amazon DocumentDB (with MongoDB compatibility), along with the basic operational guidelines when working with it.
Migrate from MongoDB to Amazon DocumentDB
Learn how to migrate an existing self-managed MongoDB database to a fully managed database on Amazon DocumentDB (with MongoDB compatibility).
Assessing MongoDB compatibility
Use the Amazon DocumentDB compatibility tool to help you assess the compatibility of a MongoDB application by using the application’s source code or MongoDB server profile logs.
Getting started with Amazon DynamoDB
We help you get started and learn more about Amazon DynamoDB. This guide includes hands-on tutorials and basic concepts.
Getting started with DynamoDB and the AWS SDKs
We help you get started with Amazon DynamoDB and the AWS SDKs. This guide includes hands-on tutorials that show you how to run code examples in DynamoDB.
Create and Query a NoSQL Table with Amazon DynamoDB
Learn how to create a simple table, add data, scan and query the data, delete data, and delete the table using the Amazon DynamoDB console.
Create an Amazon DynamoDB table
We show you how to create a DynamoDB table and use the table to store and retrieve data. This tutorial uses an online bookstore application as a guiding example.
Documentation for Amazon ElastiCache
Explore the full set of Amazon ElastiCache documentation, including user guides for ElastiCache for Redis and ElastiCache for Memcached, as well as specific AWS CLI and API references.
Getting started with Amazon ElastiCache for Redis
Learn how to create, grant access to, connect to, and delete a Redis (cluster mode disabled) cluster using the Amazon ElastiCache console.
Build a fast session store for an online application
Learn how to use Amazon ElastiCache for Redis as a distributed cache for session management. You will also learn the best practices for configuring your ElastiCache nodes and how to handle the sessions from your application.
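The session-store pattern in that tutorial boils down to: store session data under a key with a time-to-live, refresh the TTL on each access, and treat expiry as logout. A minimal in-process sketch of the pattern follows; ElastiCache for Redis provides the same semantics (SETEX/EXPIRE) as a shared store across your whole fleet.

```python
import time

class SessionStore:
    """Toy TTL session store illustrating the cache-for-sessions pattern.
    A clock can be injected for deterministic testing; Redis gives you
    the same semantics (SETEX / EXPIRE) shared across many app servers."""

    def __init__(self, ttl_seconds=1800, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}  # session_id -> (expires_at, payload)

    def put(self, session_id, payload):
        self._data[session_id] = (self._clock() + self._ttl, payload)

    def get(self, session_id):
        entry = self._data.get(session_id)
        if entry is None:
            return None
        expires_at, payload = entry
        if self._clock() >= expires_at:      # expired: treat as logged out
            del self._data[session_id]
            return None
        # Sliding expiration: refresh the TTL on each access.
        self._data[session_id] = (self._clock() + self._ttl, payload)
        return payload

# Deterministic demo with a fake clock.
now = [0.0]
store = SessionStore(ttl_seconds=10, clock=lambda: now[0])
store.put("sess-1", {"user": "ana"})
now[0] = 9.0
print(store.get("sess-1"))   # {'user': 'ana'}  (TTL refreshed to t=19)
now[0] = 25.0
print(store.get("sess-1"))   # None (expired)
```

Keeping sessions in a shared cache rather than in app-server memory is what lets any server in the fleet handle any request.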
Setting up a Redis Cluster for scalability and high availability
Learn how to create and configure a Redis Cluster with ElastiCache for Redis version 7.0 with TLS-encryption enabled. With cluster mode enabled, your Redis Cluster gains enhanced scalability and high availability.
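With cluster mode enabled, Redis shards the keyspace into 16,384 hash slots and assigns each key to a slot using CRC16 (the XMODEM variant) of the key, modulo 16384; each node in the cluster owns a range of slots. A plain-Python sketch of that mapping:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC-16/XMODEM (poly 0x1021, init 0x0000), the checksum Redis
    Cluster uses to map keys to hash slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Redis Cluster assigns each key to one of 16,384 hash slots."""
    return crc16_xmodem(key.encode()) % 16384

# Standard CRC-16/XMODEM check value for the string "123456789".
print(hex(crc16_xmodem(b"123456789")))  # 0x31c3
print(key_slot("user:1000"))
```

Real Redis additionally honors "{hash tags}" inside key names so related keys can be forced into the same slot; that detail is omitted here for brevity.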
Getting started with Amazon Keyspaces (for Apache Cassandra)
This guide is for those who are new to Apache Cassandra and Amazon Keyspaces (for Apache Cassandra). It walks you through installing all the programs and drivers that you need to successfully use Amazon Keyspaces.
Run Apache Cassandra workloads with Amazon Keyspaces
Learn how to run your existing Apache Cassandra workloads on Amazon Keyspaces, a scalable, serverless, Apache Cassandra-compatible database service, without having to manage Cassandra infrastructure yourself.
Beginner course on using Amazon Keyspaces
Learn the benefits, typical use cases, and technical concepts of Amazon Keyspaces. You can try the service through the sample code provided or the interactive tool in the AWS Management Console.
Getting started with Amazon MemoryDB
We guide you through the steps to create, grant access to, connect to, and delete a MemoryDB cluster using the MemoryDB Management Console.
Getting started using Amazon MemoryDB
Learn how to simplify your architecture and use MemoryDB as a single, primary database instead of using a low-latency cache in front of a durable database.
Integrating Amazon MemoryDB for Redis with Java-based AWS Lambda
We discuss some of the common use cases for Amazon MemoryDB for Redis, a data store built to provide durability along with fast reads and writes.
Getting started with Amazon Neptune
We help you get started using Amazon Neptune, a fully managed graph database service. This guide shows you how to create a Neptune database.
Build a fraud detection service using Amazon Neptune
We walk you through the steps to create a Neptune database, design your data model, and use the database in your application.
Build a recommendation engine with Amazon Neptune
We show you how to build a friend recommendation engine for a multiplayer game application using Amazon Neptune.
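The core of a friend-recommendation query is a two-hop graph traversal: friends of my friends who aren't already my friends, ranked by how many mutual friends reach them. In Neptune you would express this in Gremlin or openCypher; the sketch below shows the same traversal over a toy in-memory adjacency list (all names are made up).

```python
from collections import Counter

# Toy undirected friendship graph (adjacency list).
friends = {
    "ana": {"bo", "cy"},
    "bo":  {"ana", "dee"},
    "cy":  {"ana", "dee", "eve"},
    "dee": {"bo", "cy"},
    "eve": {"cy"},
}

def recommend(user: str, graph: dict) -> list:
    """Rank friends-of-friends by the number of mutual friends.
    This two-hop traversal is roughly what a Gremlin query expresses
    natively in a graph database, without application-side joins."""
    direct = graph[user]
    candidates = Counter()
    for friend in direct:
        for fof in graph[friend]:
            if fof != user and fof not in direct:
                candidates[fof] += 1  # one more mutual friend
    return [name for name, _ in candidates.most_common()]

print(recommend("ana", friends))  # ['dee', 'eve'] -- dee via both bo and cy
```

A graph database keeps this fast as the network grows, because traversals follow stored edges instead of recomputing joins over millions of rows.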
Getting started with Amazon QLDB
In Amazon Quantum Ledger Database (Amazon QLDB), the journal is the core of the database. This guide provides a high-level overview of Amazon QLDB service components and how they interact.
Creating your first Amazon QLDB ledger
We guide you through the steps to create your first Amazon QLDB sample ledger and populate it with tables and sample data.
Using an Amazon QLDB driver with an AWS SDK
Learn how to use the Amazon QLDB driver with an AWS SDK to create a QLDB ledger and populate it with sample data. The driver lets your application interact with QLDB using the transactional data API.
Getting started with Amazon RDS
We explain how to create and connect to a DB instance using Amazon RDS. You learn to create a DB instance that uses MariaDB, MySQL, Microsoft SQL Server, Oracle, or PostgreSQL.
Getting started creating a MySQL DB instance
We show you how to create an Amazon RDS MySQL database instance using the AWS Management Console and use standard MySQL utilities such as MySQL Workbench to connect to a database on the DB instance.
Create a web server and an Amazon RDS DB instance
Learn how to install an Apache web server with PHP and create a MySQL database. The web server runs on an Amazon EC2 instance using Amazon Linux, and the MySQL database is a MySQL DB instance.
Create and Connect to a MySQL Database
Learn how to create an environment to run your MySQL database, connect to the database, and delete the DB instance. We do this using Amazon RDS, and everything in this tutorial is Free Tier eligible.
Getting started with Amazon Timestream
We help you get started with Amazon Timestream. This guide provides instructions for setting up a fully functional sample application.
Best practices with Amazon Timestream
We explore best practices, including those relating to data modeling, security, configuration, data ingestion, queries, client applications and supported integrations.