What is Database Storage?

A database stores data for later retrieval and analysis. Database storage options differ in how much control and flexibility they give you. This guide examines various storage models on AWS and offers guidance on selecting the most suitable one for your specific use case.

All digital data resides on a physical device somewhere: in registers, cache, or RAM; on solid-state drives (SSDs) or hard disk drives (HDDs); on network-attached storage (NAS) or storage area networks (SANs); or on other types of physical storage. Cloud data storage also relies on underlying physical devices, with virtualization presenting logical storage on top of them.

For most enterprise data to be useful, you must store it in databases. A database management system is a software layer that sits on top of stored data. This software layer enables you to perform operations such as creating databases, querying and analyzing data, and updating and deleting data. The database software can be stored physically separately from the data itself.

How does relational database storage work?

Relational databases store data in tables made up of rows and columns, where each row represents a record and each column holds one attribute of that record. A table stores common, repeated, and related records, such as customer details or purchase records. Tables can be related to one another: for example, a purchase record can reference the customer who made the purchase.

The relational database software that runs on top of these tables manages the associations through primary and foreign keys. It allows users to perform create, read, update, and delete (CRUD) operations, and to query the data within the tables using Structured Query Language (SQL).
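
The primary and foreign key relationship described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not how any particular AWS service is implemented. SQLite is used only because it needs no server, and the table names, column names, and sample rows are assumptions made up for the example.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE purchases (
    purchase_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    item        TEXT NOT NULL,
    amount      REAL NOT NULL
);
""")

# Create: one customer and two purchases that reference that customer.
conn.execute("INSERT INTO customers (customer_id, name) VALUES (1, 'Ana')")
conn.executemany(
    "INSERT INTO purchases (customer_id, item, amount) VALUES (?, ?, ?)",
    [(1, "keyboard", 49.0), (1, "monitor", 180.0)],
)

# Read: join each purchase to its customer through the foreign key.
rows = conn.execute(
    """
    SELECT c.name, p.item, p.amount
    FROM purchases AS p
    JOIN customers AS c ON c.customer_id = p.customer_id
    ORDER BY p.purchase_id
    """
).fetchall()

# A purchase that points at a customer that does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO purchases (customer_id, item, amount) VALUES (99, 'mouse', 20.0)"
    )
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The join returns one row per purchase with the customer name attached, and the final insert fails because customer 99 is not present in the customers table.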

How a relational database and its underlying structured data are stored depends on which relational database management system (RDBMS) was used to create the database, because different software products approach data storage differently. Examples include SQL Server, MySQL, PostgreSQL, Oracle, and MariaDB.

Managed relational database storage

AWS offers managed services that handle the storage and operating systems for various relational database management systems. The benefits of using a managed service include less time spent on infrastructure management and maintenance, and increased security.

Amazon Relational Database Service

Amazon Relational Database Service (RDS) is a managed service for systems such as PostgreSQL, MySQL, MariaDB, SQL Server, Oracle, and Db2. Amazon RDS handles database management tasks, such as provisioning, patching, backup, recovery, failure detection, and repair, and is straightforward to set up and deploy.

For storage, Amazon RDS offers a choice of three different underlying Amazon Elastic Block Store (Amazon EBS) volume types:

General-purpose SSD-backed storage for most database workloads
High-performance provisioned IOPS SSD-backed storage
Magnetic data storage for backwards compatibility
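
As a rough sketch of how the storage choice might be expressed in code, the helper below builds the keyword arguments for the boto3 RDS `create_db_instance` call without sending anything to AWS. The helper name `rds_storage_params`, the instance class, the username, and the identifiers are hypothetical values chosen for illustration. The `StorageType` values shown (`gp2`, `gp3`, `io1`, `io2`, `standard`) correspond to general-purpose SSD, provisioned IOPS SSD, and magnetic storage respectively.

```python
# Hypothetical helper: only builds a parameter dictionary, makes no AWS call.
GENERAL_PURPOSE = {"gp2", "gp3"}
PROVISIONED_IOPS = {"io1", "io2"}
MAGNETIC = {"standard"}


def rds_storage_params(identifier, engine, size_gib, storage_type="gp3", iops=None):
    """Return keyword arguments for an RDS create_db_instance request."""
    allowed = GENERAL_PURPOSE | PROVISIONED_IOPS | MAGNETIC
    if storage_type not in allowed:
        raise ValueError(f"unknown storage type: {storage_type}")
    if storage_type in PROVISIONED_IOPS and iops is None:
        raise ValueError("provisioned IOPS storage needs an explicit iops value")

    params = {
        "DBInstanceIdentifier": identifier,
        "Engine": engine,
        "DBInstanceClass": "db.t3.micro",   # assumed instance class
        "AllocatedStorage": size_gib,        # size in GiB
        "StorageType": storage_type,
        "MasterUsername": "admin_user",      # assumed username
        "ManageMasterUserPassword": True,    # let AWS generate and store the password
    }
    if iops is not None:
        params["Iops"] = iops
    return params


general = rds_storage_params("orders-db", "postgres", 100)
high_io = rds_storage_params("ledger-db", "mysql", 400, "io1", iops=3000)

# To send a request (requires the third-party boto3 package and AWS credentials):
#   import boto3
#   boto3.client("rds").create_db_instance(**general)
```

Keeping the parameter construction separate from the API call makes the storage decision easy to review and test before any resources are created.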

Amazon Aurora

Amazon Aurora is a cloud-native managed service for PostgreSQL, MySQL, and DSQL relational databases. Aurora is designed to take full advantage of cloud capabilities, including clustering and distribution, and provides higher performance, availability, and fault tolerance than traditional cloud-based RDBMS services.

For storage, Amazon Aurora keeps data in a cluster volume: a single, custom virtual volume backed by SSDs. The data is replicated across three Availability Zones within an AWS Region for data integrity and redundancy. Amazon Aurora DSQL offers multi-Region redundancy to preserve access to data when a Regional endpoint is unavailable. Because Aurora storage is proprietary, it scales automatically and is fully managed by AWS, so you do not need to customize storage yourself.

Self-managed relational database storage

Self-managing an RDBMS and its storage on AWS involves traditional system administration and database management tasks. Instead of performing these tasks on your own physical infrastructure, you perform them on cloud infrastructure.

Amazon EC2 allows you to set up and configure an instance for any type of relational database management system. Configuring and running EC2 instances requires tasks such as security management, performance configuration, monitoring, and maintenance.

For the underlying storage, you can choose from Amazon EBS, Amazon Elastic File System (EFS) for fully elastic storage, and temporary instance stores. You can size the storage volumes according to your database needs.

How does non-relational database storage work?

Nonrelational databases, also known as NoSQL databases, store, access, and model data differently from relational databases. Different nonrelational databases support different use cases, and each is built around its own data structure.

Nonrelational databases include key-value store databases, document databases, wide-column databases, graph databases, in-memory databases, and search databases.
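
To make the differences between these data models concrete, the sketch below represents four of them with plain Python dictionaries. It is only an analogy for how each model organizes data, not how any database engine stores it, and all keys, field names, and values are invented for the example.

```python
from collections import deque

# Key-value: an opaque value looked up by a single key.
kv_store = {"session:42": "user=ana;cart=3"}

# Document: a nested, self-describing record stored under an ID.
documents = {
    "order-1001": {
        "customer": {"name": "Ana"},
        "items": [{"sku": "kb-1", "qty": 1}, {"sku": "mon-2", "qty": 2}],
    }
}

# Wide-column: row key -> column family -> sparse set of columns.
wide_column = {
    "sensor-7": {
        "readings": {"2024-01-01T00:00": 21.5, "2024-01-01T00:05": 21.7},
    }
}

# Graph: nodes with named edges to other nodes (adjacency lists).
graph = {
    "ana": {"follows": ["ben"]},
    "ben": {"follows": ["cy"]},
    "cy": {"follows": []},
}


def reachable(graph, start, edge="follows"):
    """Breadth-first traversal: every node reachable from start along edge."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, {}).get(edge, []):
            if neighbor not in seen and neighbor != start:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


session = kv_store["session:42"]
total_qty = sum(item["qty"] for item in documents["order-1001"]["items"])
latest_reading = wide_column["sensor-7"]["readings"]["2024-01-01T00:05"]
followed = reachable(graph, "ana")
```

Each access pattern is different: a direct key lookup, a walk through nested fields, a lookup by row key and column name, and a traversal along relationships.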

Managed nonrelational databases

AWS offers a range of managed services for each type of nonrelational database.

Amazon DynamoDB is a managed key-value store and document database service that utilizes custom, distributed, SSD-based storage under the hood.
Amazon DocumentDB (with MongoDB compatibility) is a native JSON document database managed service that uses custom, distributed, SSD-based storage.
Amazon Keyspaces (for Apache Cassandra) is an Apache Cassandra–compatible wide-column database managed service that uses custom, distributed, SSD-based storage.
Amazon Neptune is a graph database managed service with custom, distributed, SSD-based storage.
Amazon MemoryDB is a Valkey- and Redis OSS-compatible in-memory database service with custom, distributed, SSD-based storage.
Amazon ElastiCache is an in-memory caching service compatible with Valkey, Redis OSS, and Memcached, backed by RAM and EBS data storage.

Amazon DynamoDB, Amazon DocumentDB, Amazon Keyspaces, Amazon Neptune, and Amazon MemoryDB all use custom, proprietary SSD-backed storage types.

Although Amazon ElastiCache uses EBS storage, it does not offer a choice of storage options or user access to file-level storage. ElastiCache is a cache-type nonrelational database.

Self-managed nonrelational databases

Configuring and storing nonrelational databases on AWS follows a similar infrastructure pattern to that used for relational databases.

You can use EC2 instances to run any type of NoSQL database, including MongoDB, Redis, and HBase. Depending on your use case, you can store the underlying data on Amazon EBS, on Amazon Elastic File System (EFS) for fully elastic storage, or on temporary instance stores.

What are other types of database storage?

Not all enterprise data fits neatly into relational or nonrelational database formats, and modern analytics can often accommodate other semi-structured and unstructured data types.

For example, you can store semi-structured data in Apache Avro data files on Amazon S3 and analyze the data as-is, rather than restructuring the data to fit into a database. You can use S3 as a storage solution for any type of data.
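
The idea of analyzing semi-structured data as-is can be sketched as follows. Reading Avro requires a third-party library, so this example swaps in JSON Lines and an in-memory buffer as stand-ins for Avro files on Amazon S3. The event records and the `summarize` function are assumptions made for illustration.

```python
import io
import json
from collections import Counter

# Records with different shapes, as they might arrive from an application.
records = [
    {"event": "click", "page": "/home"},
    {"event": "purchase", "amount": 49.0, "items": ["kb-1"]},
    {"event": "click", "page": "/cart", "referrer": "/home"},
]

# Stand-in for a data file in object storage: one JSON document per line.
raw = io.StringIO("\n".join(json.dumps(r) for r in records))


def summarize(lines):
    """Aggregate records without forcing them into a fixed table schema."""
    event_counts = Counter()
    fields_seen = set()
    revenue = 0.0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        event_counts[record.get("event", "unknown")] += 1
        fields_seen.update(record)            # adds the record's keys
        revenue += record.get("amount", 0.0)  # missing fields are tolerated
    return event_counts, fields_seen, revenue


event_counts, fields_seen, revenue = summarize(raw)
```

The records never share a single schema, yet the analysis still works because each field is treated as optional when it is read.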

How do you choose between database storage types?

Whether you choose a managed database service or a self-managed one determines the choices you have in data storage.

Full environment control

Organizations seeking full control over their database environment must opt for self-managed database solutions on AWS. You can use self-managed databases and storage for both relational and nonrelational databases. Using an EC2 self-managed solution, you can directly access the underlying data in your file system storage, whether it is stored in EBS, EFS, or instance stores.

Reduced overheads

Managed services offer little to no customization in terms of where and how data is stored, but the trade-off is a reduction in the overheads associated with infrastructure management. Organizations typically migrate to the cloud to gain access to managed services, reducing the need for infrastructure management and maintenance.

However, there are use cases where organizations need file-level access to the underlying data of databases. For instance, an existing application might access data directly from a file, air-gapped systems might need this configuration, or compliance data integrity obligations might demand file-level access.
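
File-level access means a program can open the database's files directly, without going through the database engine. The sketch below shows this with SQLite, chosen only because its whole database lives in one local file. The file name and table are invented, and a self-managed RDBMS on EC2 would use its own engine-specific file layout.

```python
import os
import sqlite3
import tempfile

# Create a small database in a file we control.
db_path = os.path.join(tempfile.mkdtemp(), "inventory.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE items (sku TEXT PRIMARY KEY, qty INTEGER)")
conn.execute("INSERT INTO items VALUES ('kb-1', 5)")
conn.commit()
conn.close()

# File-level access: read the raw bytes, bypassing the database engine.
with open(db_path, "rb") as f:
    header = f.read(16)

file_size = os.path.getsize(db_path)
is_sqlite_file = header == b"SQLite format 3\x00"
```

With a managed service, there is no equivalent of `db_path` to open, which is why workloads that depend on this kind of access point toward a self-managed configuration.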

The choice between a managed database service and a self-managed database configuration depends on the unique use case of each database. Carefully considering each database within your organization, including its existing configuration and requirements, helps guide your decision-making process.

Your solution must include a backup system that meets data redundancy requirements in the event of system failures.
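
As a small-scale illustration of taking and restoring from a backup, the sketch below uses the `backup` method on Python's built-in sqlite3 connections. It shows only the general idea of keeping an independent copy. The table and data are invented, and a production backup system on AWS would involve snapshots, retention policies, and copies held in separate locations.

```python
import sqlite3

# Primary database with some data.
primary = sqlite3.connect(":memory:")
primary.execute("CREATE TABLE ledger (entry_id INTEGER PRIMARY KEY, amount REAL)")
primary.executemany("INSERT INTO ledger (amount) VALUES (?)", [(10.0,), (25.5,)])
primary.commit()

# Copy every page of the primary database into a separate database.
backup_copy = sqlite3.connect(":memory:")
primary.backup(backup_copy)

# Simulate losing the primary.
primary.close()

# The backup still holds the data and can serve as the restore source.
restored = backup_copy.execute(
    "SELECT entry_id, amount FROM ledger ORDER BY entry_id"
).fetchall()
```

The copy is independent of the original connection, so closing or losing the primary does not affect the backed-up data.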

How can AWS support your database storage needs?

Database storage on AWS is more straightforward if you choose managed database services. Each managed service takes care of the storage for you, handling data efficiently without extra configuration from your administrators. Using managed services means that AWS is your no-touch storage manager.

If you take a self-managed approach to databases on AWS, you control how your data is stored. A self-managed approach lets you access and retrieve data directly from the underlying storage.

Whether you’re performing a MySQL migration or creating a new key-value store, explore your database options on AWS.

Get started with building a modern database infrastructure that fits your needs by creating a free account on AWS today.
