AWS Insights

Let us manage your relational database!

Earlier this year I ran a poll as part of an effort to better understand why some AWS customers choose to host their own databases in the cloud. Here are the results:

While you will always have the ability to self-manage your database on top of one or more Amazon Elastic Compute Cloud (Amazon EC2) instances, my colleagues and I believe that Amazon Relational Database Service (RDS) and Amazon Aurora are often better choices. I would like to use this post to address the results above and to see if I can get you to reconsider!

Due to space considerations, I have focused on relational databases in this post. However, I can make a similar case for non-relational databases. And with that, let’s get to it…

A quick RDS / Aurora recap
First launched in 2009 (Introducing Amazon RDS – The Amazon Relational Database Service) with support for MySQL, Amazon RDS now lets you choose between eight database engines, with the ability to select past, current, and even (via the preview environment) upcoming beta, release candidate, and early production versions of each engine. When I wrote that 2009 blog post to launch RDS, I said:

We are always looking for ways to make it faster, simpler, and more fun to develop applications of all types. Every hour that you don’t spend fiddling with hardware, tracing cables, installing operating systems or managing databases, is an hour that you can spend on the unique and value-added aspects of your application.

This is as true today as it was 15 years ago. You get cost-effective scaling of compute power and storage, easier capacity planning, and pay-as-you-go pricing, along with several reserved instance options. You get plenty of tool support, including the power to set up database instances using AWS CloudFormation templates, API calls (CreateDBInstance), and command-line scripts. You also get to take advantage of our long track record of operational excellence: we take care of availability, storage durability, and disaster recovery.

You can launch any of the eight database engines in any one of the 33 current AWS Regions, choosing between hundreds of supported database engine versions in each Region, all on several dozen database instance types. You can choose between multiple types of SSD storage, with options for automatic storage scaling and the ability to dial in the desired level of I/O performance.

Launched in 2015 (Now Available – Amazon Aurora), Amazon Aurora is a fully managed relational database that was built from the ground up to deliver high performance and availability at a global scale, with full MySQL and PostgreSQL compatibility.

Data security and privacy
Keeping your data safe and secure is of paramount importance to us. This starts with full control over where and how your data is secured: your data is stored in a particular geographic Region, and moves from one Region to another only with your consent. You use AWS Identity and Access Management (IAM) to enable and control programmatic access to the database instance, security groups to enable network access, and the native permission model of each database engine to control access to tables, stored procedures, and other database entities. You have multiple options for protecting data at rest and in transit, and can use your own keys or AWS-managed keys. You can also take advantage of specialized security features offered by the database engine.

As I noted earlier, you also have access to multiple versions of each database engine. Should a security issue arise in the version that you are using, you can easily and efficiently test your code against a newer version before upgrading, without having to procure or set up hardware that you will need for just a short time.

RDS also makes it easy for you to create, manage, and restore backups. You can create them on demand or use AWS Backup to establish a backup schedule and manage the lifecycle of each backup. You can arrange to transition older backups from warm storage to cold storage, keeping them available while minimizing storage costs. Last but not least, you can use AWS Backup’s restore testing features to evaluate the viability of your backups and to monitor restoration time.

I have listed just a few of the most relevant security and privacy features; to learn more be sure to consult these resources:

Customization and control
In addition to letting you choose the AWS Region, database engine, and engine version, and configure the desired amount of storage, RDS lets you use parameter groups to systematically and repeatably set and manage parameters that are specific to each database engine.

You can define parameter groups for your organization or for specific database instances and use them instead of hand-editing configuration files. The number of parameters that you can set varies from engine to engine and version to version, but there are generally hundreds. For example, there are currently 553 parameters for MySQL 8.0:

You can also use option groups to enable and configure add-ins and features that are specific to each database engine.

Using parameter groups and option groups across your fleet of database instances will help you to ensure that all of the instances are configured as expected, regardless of the size or complexity of the fleet. This can be helpful in situations where you create distinct database instances for development, testing, staging, and production. It can also be helpful when you want to systematically vary a parameter or two in order to do performance testing. You can create parameter groups and option groups using CloudFormation templates, and then apply changes uniformly across your fleet.
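To make the idea of per-environment parameter groups a little more concrete, here is a minimal Python sketch. It is only an illustration under several assumptions: the baseline values, the environment names, and the helper names (parameters_for, batches, apply_to_group) are hypothetical, the parameter group is assumed to exist already, and the batch size of 20 reflects my understanding of the documented per-request limit for ModifyDBParameterGroup, which you should confirm against the current API reference. The client argument is expected to be a boto3 RDS client, but nothing here contacts AWS until you call apply_to_group yourself.

```python
from typing import Dict, Iterator, List

# Hypothetical baseline shared by every environment. The names are MySQL
# parameters; the values are placeholders, not tuning recommendations.
BASELINE: Dict[str, str] = {
    "max_connections": "200",
    "slow_query_log": "1",
    "long_query_time": "2",
}

# Hypothetical per-environment overrides layered on top of the baseline.
OVERRIDES: Dict[str, Dict[str, str]] = {
    "dev": {"max_connections": "50"},
    "prod": {"max_connections": "1000", "long_query_time": "1"},
}


def parameters_for(env: str) -> List[dict]:
    """Merge baseline and overrides into the list shape the RDS API expects."""
    merged = {**BASELINE, **OVERRIDES.get(env, {})}
    return [
        {
            "ParameterName": name,
            "ParameterValue": value,
            # "pending-reboot" is accepted for both static and dynamic parameters.
            "ApplyMethod": "pending-reboot",
        }
        for name, value in sorted(merged.items())
    ]


def batches(params: List[dict], size: int = 20) -> Iterator[List[dict]]:
    """Yield slices of at most `size` parameters (assumed per-request limit)."""
    for start in range(0, len(params), size):
        yield params[start:start + size]


def apply_to_group(rds_client, group_name: str, env: str) -> None:
    """Send the merged parameters to an existing parameter group, batch by batch."""
    for batch in batches(parameters_for(env)):
        rds_client.modify_db_parameter_group(
            DBParameterGroupName=group_name,
            Parameters=batch,
        )
```

With boto3 installed and credentials configured, a call such as apply_to_group(boto3.client("rds"), "mysql80-prod", "prod") would push the production settings, where "mysql80-prod" is a made-up group name. Keeping the baseline and overrides in one place is what makes it easy to vary a single parameter for a performance test.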

If you need even more customization options at the operating system and/or database level, be sure to take a look at Amazon RDS Custom. You bring your own media, choose between bringing your own license or using the license-included option, and pay for the database instance, storage, and data transfer directly. You get the benefits of Amazon RDS, including the automated administration of tasks and operations described in this post.

Higher costs with managed databases
When I think about the ways to compare the costs of a managed database to a self-managed one, a lot of factors come to mind. For example:

Reliability and availability – Many of the reliability and availability features that I mentioned above would take a considerable amount of time and testing to get right on your own, including “testing in production” every time a new failure scenario arises. What is the cost of downtime in your business, and wouldn’t it be better to simply use a database that was built with reliability and availability as primary objectives?

RDS lets you create Multi-AZ database instances with synchronous replication and failover support, all by checking a box or two when you create the instance. To me, this feature alone shows why managed databases are better. Imagine scripting your own failover support, and then trying to first envision and then fully simulate each possible failure scenario to make sure that you fail over as planned. We’ve already done that heavy lifting for you and for every other RDS customer, and we know that it will get the job done. In my experience, getting this right on your own can take years, and the process is painful.

Replication is similarly tricky to configure and test, and don’t forget about the combination of replication and failover!
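For readers who prefer the API to the console, the "box or two" for Multi-AZ corresponds to a single flag on the CreateDBInstance call. The Python sketch below builds such a request as a plain dictionary so that it can be inspected before anything is created. The function names, the "admin" user name, and the default sizes are illustrative assumptions; the keys are CreateDBInstance parameters as I understand them, and create_instance assumes boto3 is installed and credentials are configured.

```python
from typing import Any, Dict


def multi_az_instance_request(
    identifier: str,
    instance_class: str = "db.t4g.micro",
    storage_gib: int = 20,
) -> Dict[str, Any]:
    """Build keyword arguments for a Multi-AZ MySQL instance (illustrative values)."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",
        "DBInstanceClass": instance_class,
        "AllocatedStorage": storage_gib,   # GiB
        "MasterUsername": "admin",         # placeholder user name
        "ManageMasterUserPassword": True,  # let RDS keep the password in Secrets Manager
        "MultiAZ": True,                   # standby in a second AZ with automatic failover
        "StorageEncrypted": True,
    }


def create_instance(request: Dict[str, Any]) -> Dict[str, Any]:
    """Submit the request. Imported lazily so the builder works without boto3."""
    import boto3

    return boto3.client("rds").create_db_instance(**request)
```

Calling create_instance(multi_az_instance_request("orders-db")) would start provisioning, with "orders-db" standing in for your own identifier. Everything that a self-managed setup would need to script around failover is represented here by one boolean.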

Scale – There are many ways to think about this. To me, many of them revolve around flexibility and the power to be able to adapt to changing business requirements at lightning speed. You can start with a 20 GB MySQL database hosted on an inexpensive (or free), yet capable db.t4g.micro instance running in a single Availability Zone, then smoothly and almost effortlessly scale all the way up to a Multi-AZ 64 TiB database running on a db.x2iedn.32xlarge instance with 128 vCPUs, 4 TiB of RAM, and a plethora of read replicas. Along the way you can use (and pay for) only the amount of instance compute power and storage that you need.
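The scale-up path described above maps onto the ModifyDBInstance API call. As a hedged sketch, the helper below builds the request for moving an instance to a larger class and, optionally, more storage. The helper name and its guard are my own additions for illustration; the guard reflects the fact that RDS allocated storage can be increased but not decreased. The result is meant to be passed to a boto3 RDS client's modify_db_instance method.

```python
from typing import Any, Dict, Optional


def scale_up_request(
    identifier: str,
    *,
    current_storage_gib: int,
    new_class: Optional[str] = None,
    new_storage_gib: Optional[int] = None,
    apply_immediately: bool = False,
) -> Dict[str, Any]:
    """Build keyword arguments for modify_db_instance (hypothetical helper)."""
    request: Dict[str, Any] = {
        "DBInstanceIdentifier": identifier,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }
    if new_class is not None:
        request["DBInstanceClass"] = new_class
    if new_storage_gib is not None:
        # Allocated storage can grow but cannot be reduced afterwards.
        if new_storage_gib <= current_storage_gib:
            raise ValueError("new_storage_gib must exceed current_storage_gib")
        request["AllocatedStorage"] = new_storage_gib
    return request


# From the small starting point to the large end of the range in the text:
# 64 TiB is 65536 GiB.
largest = scale_up_request(
    "orders-db",  # placeholder identifier
    current_storage_gib=20,
    new_class="db.x2iedn.32xlarge",
    new_storage_gib=64 * 1024,
)
```

In practice you would more likely grow in several smaller steps as demand increases, which is the point of paying only for what you need along the way.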

Opportunity cost – How much time and money do you want to spend reinventing the wheel, when you could be building a flying car instead? Seriously, there are many different ways to partition your development budget. Investing money in rebuilding database features that already exist as part of a managed database means that you will have less money to spend on differentiating features. If you want to attract and retain talented developers, let them innovate instead of asking them to build low-level database monitoring and failover features.

Need multi-cloud portability
Chosen by 12% of respondents in the poll, this one is still worth addressing. You can certainly choose to avoid using the managed databases offered by AWS and other cloud providers, and take a purely DIY approach. However, in most cases you can use managed databases while still creating your own native backups and finding ways to share queries and management tools across multiple clouds. As is the case with this entire post, I would be interested in your thoughts on this topic.

That’s all for today
Finally, let’s not forget about generative AI. While the future is still wide open and we are still in the early days of this amazing new technology, it seems to me that a lot of the most interesting use cases are going to be data-intensive, and will require a considerable amount of flexibility and power behind the scenes. Using a managed database service will help you when you are ready to start putting generative AI to use in your organization.

I hope that this post has been helpful to you, and that it might get you to consider using our managed database services rather than hosting your own. Please feel free to let me know what you think!

— Jeff;