Amazon Aurora features

Why Amazon Aurora?

Amazon Aurora is a relational database service that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. Aurora is fully compatible with MySQL and PostgreSQL, allowing existing applications and tools to run without requiring modification.

High performance and scalability

Testing on standard benchmarks such as SysBench has shown an increase in throughput of up to 5x over stock MySQL and 3x over stock PostgreSQL on similar hardware. Aurora uses a variety of software and hardware techniques to ensure the database engine can fully use available compute, memory, and networking. I/O operations use distributed systems techniques, such as quorums, to improve performance consistency.

Amazon Aurora Serverless is an on-demand, auto-scaling configuration for Aurora where the database automatically starts up, shuts down, and scales capacity up or down based on your application's needs. With Amazon Aurora Serverless, you can run your database in the cloud without managing any database instances. You can also use Aurora Serverless v2 instances along with provisioned instances in your existing or new database clusters.

You can use the Amazon Relational Database Service (Amazon RDS) APIs or the AWS Management Console to scale provisioned instances powering your deployment up or down. Compute scaling operations typically complete in a few minutes.

Aurora automatically scales I/O to match the needs of your most demanding applications. It also increases the size of your database volume as your storage needs grow. Your volume expands in increments of 10 GB up to a maximum of 128 TiB. You don't need to provision excess storage for your database to handle future growth. When using the Amazon Aurora I/O-Optimized configuration for your database clusters, Aurora also provides up to 40% cost savings when I/O spend exceeds 25% of your Aurora database spend. To learn more, visit Aurora storage and reliability.

You can increase read throughput to support high-volume application requests by creating up to 15 Amazon Aurora Replicas. Aurora Replicas share the same underlying storage as the source instance, lowering costs and avoiding the need to perform writes at the replica nodes. This frees up more processing power to serve read requests and reduces the replica lag time—often down to single-digit milliseconds. Aurora provides a reader endpoint so the application can connect without having to keep track of replicas as they are added and removed. It also supports auto scaling, automatically adding and removing replicas in response to changes in performance metrics that you specify. To learn more, visit Using Amazon Aurora Auto Scaling with Aurora Replicas.

Aurora supports cross-Region read replicas. Cross-Region replicas provide fast local reads to your users, and each region can have an additional 15 Aurora Replicas to further scale local reads. See Amazon Aurora Global Database for details.

Custom endpoints allow you to distribute and load balance workloads across different sets of database instances. For example, you can provision a set of Aurora Replicas to use an instance type with higher memory capacity in order to run an analytics workload. A custom endpoint can then help you route the workload to these appropriately configured instances while keeping other instances isolated from it.

Amazon Aurora Optimized Reads is a new price-performance capability that delivers up to 8x improved query latency and up to 30% cost savings compared to instances without it. It is ideal for applications with large datasets that exceed the memory capacity of a database instance.

Optimized Reads instances use local NVMe-based SSD block-level storage, available on Graviton-based r6gd and Intel-based r6id instances, to improve query latency of applications with data sets exceeding the memory capacity of a database instance. Optimized Reads include performance enhancements such as tiered caching and temporary objects to enable you to make the most of your database instances.

With up to 8x improved query latency, you can effectively run read-heavy, I/O-intensive workloads such as operational dashboards, anomaly detection, and similarity searches with pgvector. Amazon Aurora PostgreSQL Optimized Reads with pgvector increases queries per second for vector search by up to 9x in workloads that exceed available instance memory. Optimized Reads is available for Aurora with PostgreSQL compatibility.

Amazon Aurora Parallel Query provides faster analytical queries over your current data. It can speed up queries by up to two orders of magnitude while maintaining high throughput for your core transaction workload. By pushing query processing down to the Aurora storage layer, it gains a large amount of computing power while reducing network traffic. Use Parallel Query to run transactional and analytical workloads alongside each other in the same Aurora database. Parallel Query is available for Aurora with MySQL compatibility.

Amazon DevOps Guru is a cloud operations service powered by machine learning (ML) that helps improve application availability. With Amazon DevOps Guru for RDS, you can use ML-powered insights to detect and diagnose performance-related relational database issues. It is designed to help you resolve them in minutes rather than days. Developers and DevOps engineers can use DevOps Guru for RDS to automatically identify the root cause of performance issues and get intelligent recommendations to help address the issue, without needing help from database experts.

To get started, simply go to the Amazon RDS Management Console and enable Amazon RDS Performance Insights. Once Performance Insights is on, go to the Amazon DevOps Guru Console and enable it for your Amazon Aurora resources, other supported resources, or your entire account.

High availability and durability

Amazon RDS continuously monitors the health of your Aurora database and underlying Amazon Elastic Compute Cloud (Amazon EC2) instance. In the event of database failure, Amazon RDS will automatically restart the database and associated processes. Aurora does not require crash recovery replay of database redo logs, which greatly reduces restart times. It also isolates the database buffer cache from database processes, which allows the cache to survive a database restart.

On instance failure, Aurora uses Amazon RDS Multi-AZ technology to automate failover to one of up to 15 Aurora Replicas you have created in any of three Availability Zones. If no Aurora Replicas have been provisioned, Amazon RDS will automatically attempt to create a new Aurora DB instance for you when a failure occurs. To minimize failover time, replace community MySQL and PostgreSQL drivers with the open-source, drop-in compatible AWS JDBC Driver for MySQL and AWS JDBC Driver for PostgreSQL. You can also use RDS Proxy to reduce failover times and improve availability. When failovers occur, Amazon RDS Proxy routes requests directly to the new database instance, reducing failover times by up to 66% while preserving application connections.

For globally distributed applications, you can use an Aurora Global Database, where a single Aurora database can span multiple AWS Regions to enable fast local reads and quick disaster recovery. An Aurora Global Database uses storage-based replication to replicate a database across multiple Regions, with typical latency of less than one second. You can use a secondary Region as a backup option in case you need to quickly recover from a regional degradation or outage. A database in a secondary Region can be promoted to full read/write capabilities in less than 1 minute. To learn more, visit Using Amazon Aurora global databases.

Aurora's database storage volume is segmented in 10 GiB chunks and replicated across three Availability Zones, with each Availability Zone persisting 2 copies of each write. Aurora storage is fault-tolerant, transparently handling the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability. Aurora storage is also self-healing; data blocks and disks are continuously scanned for errors and replaced automatically.
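The fault-tolerance rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated in the text (six copies, writes surviving the loss of two, reads surviving the loss of three); the function and constant names are hypothetical and say nothing about how Aurora implements its storage layer.

```python
COPIES_PER_AZ = 2
AVAILABILITY_ZONES = 3
TOTAL_COPIES = COPIES_PER_AZ * AVAILABILITY_ZONES  # six copies of each write

# Taken from the text: writes tolerate two lost copies, reads tolerate three.
MAX_LOST_FOR_WRITES = 2
MAX_LOST_FOR_READS = 3


def storage_availability(copies_lost: int) -> dict:
    """Report whether reads and writes stay available after losing copies."""
    if not 0 <= copies_lost <= TOTAL_COPIES:
        raise ValueError(f"copies_lost must be between 0 and {TOTAL_COPIES}")
    return {
        "healthy_copies": TOTAL_COPIES - copies_lost,
        "writes_available": copies_lost <= MAX_LOST_FOR_WRITES,
        "reads_available": copies_lost <= MAX_LOST_FOR_READS,
    }


# Losing one whole Availability Zone removes two copies.
print(storage_availability(COPIES_PER_AZ))
# Losing one Availability Zone plus one more copy removes three.
print(storage_availability(COPIES_PER_AZ + 1))
```

Read this way, the loss of an entire Availability Zone leaves both reads and writes available, and the loss of one further copy still leaves reads available.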

The backup capability of Aurora enables point-in-time recovery for your instance. This allows you to restore your database to any second during your retention period, up to the last 5 minutes. Your automatic backup retention period can be configured up to 35 days. Automated backups are stored in Amazon Simple Storage Service (Amazon S3), which is designed for 99.999999999% durability. Aurora backups are automatic, incremental, and continuous and have no impact on database performance.
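As a rough sketch of the restore window described above, the following Python computes the earliest and latest restorable times for a given retention period. It assumes, conservatively, that the latest restorable time is exactly five minutes before the current time, and that the minimum retention period is one day; both are simplifications for illustration, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35                     # upper limit stated in the text
LATEST_RESTORABLE_LAG = timedelta(minutes=5)  # "up to the last 5 minutes"


def restore_window(now: datetime, retention_days: int):
    """Return (earliest, latest) restorable times for a retention period."""
    # The lower bound of one day is an assumption made for this sketch.
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError(f"retention_days must be between 1 and {MAX_RETENTION_DAYS}")
    earliest = now - timedelta(days=retention_days)
    latest = now - LATEST_RESTORABLE_LAG
    return earliest, latest


def can_restore_to(target: datetime, now: datetime, retention_days: int) -> bool:
    """Check whether a target time falls inside the restore window."""
    earliest, latest = restore_window(now, retention_days)
    return earliest <= target <= latest


now = datetime.now(timezone.utc)
print(can_restore_to(now - timedelta(hours=3), now, retention_days=7))
```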

DB snapshots are user-initiated backups of your instance stored in Amazon S3 that will be kept until you explicitly delete them. They leverage the automated incremental snapshots to reduce the time and storage required. You can create a new instance from a DB snapshot whenever you desire.

Backtrack lets you quickly move a database to a prior point in time without needing to restore data from a backup. This lets you quickly recover from user errors, such as dropping the wrong table or deleting the wrong row. When you enable Backtrack, Aurora will retain data records for the specified Backtrack duration. For example, you could set up Backtrack to allow you to move your database up to 72 hours back. Backtrack completes in seconds, even for large databases, because no data records need to be copied. You can go backwards and forwards to find the point just before the error occurred.

Backtrack is also useful for development and test, particularly in situations where your test deletes or otherwise invalidates the data. Simply backtrack to the original database state, and you're ready for another test run. You can create a script that calls Backtrack through an API and then runs the test, for simple integration into your test framework. Backtrack is available for Aurora with MySQL compatibility.
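A test-framework hook of the kind described above might look like the following Python sketch, which uses the boto3 `backtrack_db_cluster` call. The cluster identifier, the 72-hour window (borrowed from the example above), and the `run_tests` callable are assumptions for illustration. The sketch also omits waiting for the cluster to become available again after the backtrack, which a real script would need.

```python
from datetime import datetime, timedelta, timezone

# Assumed to match the Backtrack window configured on the cluster.
BACKTRACK_WINDOW = timedelta(hours=72)


def build_backtrack_request(cluster_id: str, backtrack_to: datetime, now: datetime) -> dict:
    """Validate the target time and build arguments for backtrack_db_cluster."""
    if backtrack_to.tzinfo is None or now.tzinfo is None:
        raise ValueError("use timezone-aware datetimes")
    if backtrack_to > now:
        raise ValueError("cannot backtrack to a time in the future")
    if now - backtrack_to > BACKTRACK_WINDOW:
        raise ValueError("target time is outside the Backtrack window")
    return {"DBClusterIdentifier": cluster_id, "BacktrackTo": backtrack_to}


def backtrack_then_test(cluster_id: str, backtrack_to: datetime, run_tests):
    """Rewind the cluster, then run the caller's tests (hypothetical hook)."""
    import boto3  # imported lazily so the helper above has no AWS dependency

    rds = boto3.client("rds")
    request = build_backtrack_request(cluster_id, backtrack_to, datetime.now(timezone.utc))
    rds.backtrack_db_cluster(**request)
    # A real script would wait here until the cluster is available again.
    return run_tests()
```

Keeping the validation separate from the AWS call makes the window check easy to unit test without credentials.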

Highly secure

Aurora runs in Amazon Virtual Private Cloud (Amazon VPC), which helps you isolate your database in your own virtual network and connect to your on-premises IT infrastructure using industry-standard encrypted IPsec VPNs. To learn more about Amazon Relational Database Service (RDS) in Amazon VPC, refer to the Amazon RDS User Guide. Also, when using Amazon RDS, you can configure firewall settings and control network access to your DB instances.

Aurora is integrated with AWS Identity and Access Management (IAM) and provides you the ability to control the actions that your IAM users and groups can take on specific Aurora resources (for example, DB instances, DB snapshots, DB parameter groups, DB event subscriptions, DB option groups). Also, you can tag your Aurora resources and control the actions that your IAM users and groups can take on groups of resources that have the same tag (and tag value). For more information about IAM integration, see the IAM database authentication documentation.

Aurora helps you encrypt your databases using keys you create and control through AWS Key Management Service (KMS). On a database instance running with Aurora encryption, data stored at rest in the underlying storage is encrypted, as are the automated backups, snapshots, and replicas in the same cluster. Aurora uses SSL (AES-256) to secure data in transit.

Aurora helps you log database events with minimal impact on database performance. Logs can later be analyzed for database management, security, governance, regulatory compliance, and other purposes. You can also monitor activity by sending audit logs to Amazon CloudWatch.

Amazon GuardDuty offers threat detection for Aurora to help you identify potential threats to data stored in Aurora databases. GuardDuty RDS Protection profiles and monitors login activity to existing and new databases in your account and uses tailored ML models to accurately detect suspicious logins to Aurora databases. If a potential threat is detected, GuardDuty generates a security finding that includes database details and rich contextual information on the suspicious activity. Aurora integration with GuardDuty gives direct access to database event logs without requiring you to modify your databases and is designed not to have an impact on database performance.

Cost-effective

There is no upfront commitment with Aurora. You pay an hourly charge for each instance that you launch, and when you’re finished with an Aurora DB instance, you can delete it. You do not need to overprovision storage as a safety margin, and you only pay for the storage you actually consume. To see more details, visit the Aurora pricing page.

Aurora offers the flexibility to optimize your database spend by choosing between two configuration options based on your price-performance and price-predictability needs, regardless of the I/O consumption of your application. The two configuration options are Aurora I/O-Optimized and Aurora Standard. Neither option requires upfront I/O or storage provisioning and both can scale I/O to support your most demanding applications.

Aurora I/O-Optimized is a database cluster configuration. It delivers improved price performance for customers with I/O-intensive workloads such as payment processing systems, ecommerce systems, and financial applications. If your I/O spend exceeds 25% of your total Aurora database spend, you can save up to 40% on costs for I/O-intensive workloads with Aurora I/O-Optimized. With Aurora I/O-Optimized you pay for database instances and storage. There are no charges for read and write I/O operations, providing price predictability for all applications regardless of I/O variability.

Aurora Standard is a database cluster configuration that offers cost-effective pricing for the vast majority of applications with low to moderate I/O usage. With Aurora Standard you pay for database instances, storage, and pay-per-request I/O.
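The 25% guideline for choosing between the two configurations can be expressed as a small Python check. This is a rule-of-thumb sketch based only on the threshold quoted above, not a pricing calculator; the spend figures are hypothetical inputs that you would take from an existing Aurora Standard bill.

```python
IO_SHARE_THRESHOLD = 0.25  # "I/O spend exceeds 25% of your total Aurora database spend"


def io_share(instance_spend: float, storage_spend: float, io_spend: float) -> float:
    """Return I/O spend as a fraction of total Aurora database spend."""
    total = instance_spend + storage_spend + io_spend
    if total <= 0:
        raise ValueError("total spend must be positive")
    return io_spend / total


def suggested_configuration(instance_spend: float, storage_spend: float, io_spend: float) -> str:
    """Suggest a cluster configuration using the 25% guideline."""
    if io_share(instance_spend, storage_spend, io_spend) > IO_SHARE_THRESHOLD:
        return "Aurora I/O-Optimized"
    return "Aurora Standard"


# Hypothetical monthly figures in any single currency.
print(suggested_configuration(instance_spend=600, storage_spend=100, io_spend=300))
```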

For a heavily analytical application, I/O costs are typically the largest contributor to the database cost. I/O operations are performed by the Aurora database engine against its SSD-based virtualized storage layer. Every database page read operation counts as one I/O. The Aurora database engine issues reads against the storage layer to fetch database pages not present in the buffer cache. Each database page is 8 KB in Aurora with PostgreSQL compatibility and 16 KB in Aurora with MySQL compatibility.
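To make the read accounting concrete, the sketch below estimates how many read I/Os a scan of uncached data would consume for each engine. It assumes that "KB" means 1,024 bytes and that every uncached page is fetched exactly once; the function name is hypothetical.

```python
# Page sizes stated in the text, assuming 1 KB = 1,024 bytes.
PAGE_SIZE_BYTES = {
    "postgresql": 8 * 1024,
    "mysql": 16 * 1024,
}


def read_ios_for_uncached_bytes(engine: str, uncached_bytes: int) -> int:
    """Estimate read I/Os: one per database page not found in the buffer cache."""
    if uncached_bytes < 0:
        raise ValueError("uncached_bytes must not be negative")
    page_size = PAGE_SIZE_BYTES[engine]
    # Round up, because a partially needed page is still read as a whole page.
    return -(-uncached_bytes // page_size)


one_mib = 1024 * 1024
print(read_ios_for_uncached_bytes("postgresql", one_mib))
print(read_ios_for_uncached_bytes("mysql", one_mib))
```

Under these assumptions, the same volume of uncached data costs twice as many read I/Os on Aurora PostgreSQL as on Aurora MySQL, because its pages are half the size.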

Aurora was designed to eliminate unnecessary I/O operations to reduce costs and ensure resources are available for serving read/write traffic. Write I/O operations are only consumed when pushing transaction log records to the storage layer for the purpose of making writes durable. Write I/O operations are counted in 4 KB units. For example, a transaction log record that is 1,024 bytes counts as one I/O operation. However, concurrent write operations whose transaction log is less than 4 KB can be batched together by the Aurora database engine to optimize I/O consumption. Unlike traditional database engines, Aurora never pushes modified database pages to the storage layer, resulting in further I/O consumption savings.
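The 4 KB counting rule can be illustrated with a short Python sketch. It assumes that 4 KB means 4,096 bytes. The batched figure is a best case in which all concurrent records smaller than 4 KB are packed together; the engine's real batching behavior is internal and may differ, and the function names are hypothetical.

```python
WRITE_IO_UNIT_BYTES = 4 * 1024  # assumes "4 KB" means 4,096 bytes


def _units(total_bytes: int) -> int:
    """Round a byte count up to whole 4 KB units."""
    return -(-total_bytes // WRITE_IO_UNIT_BYTES)


def write_ios_unbatched(record_sizes) -> int:
    """Count write I/Os when every log record is pushed on its own."""
    return sum(_units(size) for size in record_sizes)


def write_ios_best_case_batched(record_sizes) -> int:
    """Count write I/Os if all records under 4 KB are packed together."""
    small = [size for size in record_sizes if size < WRITE_IO_UNIT_BYTES]
    large = [size for size in record_sizes if size >= WRITE_IO_UNIT_BYTES]
    return _units(sum(small)) + sum(_units(size) for size in large)


records = [1024, 1024, 1024, 1024]  # four concurrent 1,024-byte log records
print(write_ios_unbatched(records), write_ios_best_case_batched(records))
```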

You can see how many I/O operations your Aurora instance is consuming by going to the AWS Management Console. To find your I/O consumption, go to the RDS section of the console, look at your list of instances, select your Aurora instances, then look for the “Billed read operations” and “Billed write operations” metrics in the monitoring section.

You are charged for read and write I/O operations when you configure your database clusters to the Aurora Standard configuration. You are not charged for read and write I/O operations when you configure your database clusters to Aurora I/O-Optimized. For more information on the pricing of I/O operations, visit the Amazon Aurora Pricing page.

Aurora Optimized Reads for Aurora PostgreSQL offers customers with latency-sensitive applications and large working sets a compelling price-performance alternative to meet their business SLAs. Customers also have more flexibility to grow their datasets without the need to frequently upsize their database instances to obtain larger memory capacity. Optimized Reads includes performance enhancements such as tiered caching and temporary objects.

Tiered caching delivers up to 8x improved query latency and up to 30% cost savings for read-heavy, I/O-intensive applications such as operational dashboards, anomaly detection, and vector-based similarity searches. These benefits are realized by automatically caching data that is evicted from the in-memory database buffer cache on local storage, which speeds up subsequent accesses of that data. Tiered caching is only available for Aurora PostgreSQL with the Aurora I/O-Optimized configuration.

Temporary objects achieve faster query processing by placing temporary tables generated by Aurora PostgreSQL on local storage, improving the performance of queries involving sorts, hash aggregations, high-load joins, and other data-intensive operations.

Fully managed

Getting started with Aurora is easy. Just launch a new Aurora DB instance using the Amazon RDS Management Console, a single API call, or the CLI. Aurora DB instances are preconfigured with parameters and settings appropriate for the DB instance class you have selected. You can launch a DB instance and connect your application within minutes without additional configuration. DB parameter groups provide granular control and fine-tuning of your database.

Aurora provides Amazon CloudWatch metrics for your DB instances at no additional charge. You can use the AWS Management Console to view over 20 key operational metrics for your database instances, including compute, memory, storage, query throughput, cache hit ratio, and active connections. In addition, you can use Enhanced Monitoring to gather metrics from the operating system instance that your database runs on. You can use Amazon RDS Performance Insights, a database monitoring tool with an easy-to-understand dashboard that visualizes database load, to detect database performance problems and take corrective action. Finally, you can also use Amazon DevOps Guru for RDS to detect performance issues, automatically identify their root cause, and get intelligent recommendations to help address them without needing help from database experts.

Amazon RDS Blue/Green Deployments allow you to make safer, simpler, and faster database updates with zero data loss on Aurora MySQL-Compatible Edition and Aurora PostgreSQL-Compatible Edition. In a few steps, Blue/Green Deployments creates a staging environment that mirrors the production environment and keeps the two environments in sync using logical replication. You can make changes—such as major/minor version upgrades, schema modifications, and parameter setting changes—without impacting your production workload.

When promoting your staging environment, Blue/Green Deployments blocks writes to both the blue and green environments until switchover is complete. Blue/Green Deployments uses built-in switchover guardrails that time out promotion if it exceeds your maximum tolerable downtime, detect replication errors, check instance health, and more.

Aurora will keep your database up-to-date with the latest patches. You can control if and when your instance is patched through DB Engine Version Management. Aurora uses zero-downtime patching when possible: if a suitable time window appears, the instance is updated in place, application sessions are preserved and the database engine restarts while the patch is in progress, leading to only a transient (five-second or so) drop in throughput.

Aurora can notify you by email or SMS of important database events such as an automated failover. You can use the AWS Management Console or the Amazon RDS APIs to subscribe to over 40 different DB events associated with your Aurora databases.

Aurora supports quick, efficient cloning operations, where entire multi-terabyte database clusters can be cloned in minutes. Cloning is useful for a number of purposes including application development, testing, database updates, and running analytical queries. Immediate availability of data can significantly accelerate your software development and upgrade projects, and make analytics more accurate.

You can clone an Aurora database in only a few steps, and you don't incur any storage charges, except if you use additional space to store data changes.

You can manually stop and start an Aurora database in only a few steps. This makes it easy and affordable to use Aurora for development and test purposes, where the database is not required to be running all of the time. Stopping your database doesn't delete your data. See the start/stop documentation for more details.

Zero-ETL integrations

Amazon Aurora zero-ETL integration with Amazon Redshift enables near real-time analytics and ML using Amazon Redshift on petabytes of transactional data from Aurora by removing the need for you to build and maintain complex data pipelines that perform extract, transform, and load (ETL) operations. Transactional data is automatically and continuously replicated within seconds of being written in Aurora and is seamlessly made available in Amazon Redshift.

Once data is available in Amazon Redshift, you can start analyzing it immediately and apply advanced features like data sharing, materialized views, and Amazon Redshift ML to get holistic and predictive insights. You can consolidate multiple tables from various Aurora database clusters and replicate your data into one Amazon Redshift data warehouse to run unified analytics across multiple applications and data sources. When using both Aurora Serverless and Amazon Redshift Serverless, you can generate near real-time analytics on transactional data without having to manage any infrastructure for data pipelines. Read our documentation on working with Aurora zero-ETL integrations with Amazon Redshift.

Generative AI

Aurora offers capabilities to enable machine learning (ML) and generative artificial intelligence (AI) models to work with data stored in Aurora in real-time and without moving the data. With Amazon Aurora PostgreSQL-Compatible Edition, you can access vector database capabilities to store, search, index, and query ML embeddings with the pgvector extension.

A vector embedding is a numerical representation of the semantic meaning of content such as text, images, and video. Generative AI and other AI/ML systems use embeddings to capture the semantic meaning of content that is input into a large language model (LLM). You can store embeddings from ML and AI models, such as those from Amazon Bedrock and Amazon SageMaker, in your Aurora databases. Read our documentation on extensions versions for Aurora PostgreSQL.
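To illustrate what searching embeddings means, here is a minimal Python sketch of cosine similarity, one common way to compare vectors. The three-dimensional vectors and item names are made up for illustration; real embeddings come from a model and typically have hundreds or thousands of dimensions. In Aurora PostgreSQL, the pgvector extension would perform this kind of comparison inside the database rather than in application code.

```python
import math


def cosine_similarity(a, b) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def most_similar(query, items: dict) -> str:
    """Return the name of the stored embedding closest to the query."""
    return max(items, key=lambda name: cosine_similarity(query, items[name]))


# Toy embeddings, invented for this example.
stored = {
    "invoice": [0.9, 0.1, 0.0],
    "receipt": [0.8, 0.3, 0.1],
    "holiday photo": [0.0, 0.2, 0.9],
}
print(most_similar([0.1, 0.1, 0.8], stored))
```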

Amazon Aurora PostgreSQL is available as a Knowledge Base for Amazon Bedrock to connect your organization’s private data sources to foundation models (FM) and enable automated Retrieval-Augmented Generation (RAG) workflows on them. This makes your FMs more knowledgeable about your specific domain and organization. Read our documentation on how to use Aurora PostgreSQL as a Knowledge Base for Amazon Bedrock.

Aurora machine learning (Aurora ML) also simplifies adding generative AI model predictions to your Aurora database. Aurora ML exposes ML models as SQL functions, allowing you to use standard SQL to call ML models, pass data to them, and return predictions, text summaries, or sentiment as query results. With Aurora ML, you can add new embeddings to your Aurora PostgreSQL database with the pgvector extension in real time through periodic calls to a SageMaker or Amazon Bedrock model, which returns up-to-date embeddings.

Migration support

Standard MySQL import and export tools work with Aurora. You can also easily create a new Aurora database from an Amazon RDS for MySQL DB snapshot. Migration operations based on DB snapshots typically complete in under an hour, but will vary based on the amount and format of data being migrated.

Alternatively, AWS Database Migration Service (AWS DMS) offers built-in native tooling from within the DMS Console for a seamless migration. With no replication instances to provision or scale, you can initiate a database migration with a few simple clicks, and only pay on an hourly basis for the time used.

You can also set up binlog-based replication between an Aurora MySQL-Compatible Edition database and an external MySQL database running inside or outside of AWS.

Standard PostgreSQL import and export tools work with Aurora, including pg_dump and pg_restore. Aurora also supports snapshot import from Amazon RDS for PostgreSQL, and replication with AWS Database Migration Service (AWS DMS).

Aurora provides an ideal environment for moving database workloads off of commercial databases. Aurora has functional capabilities which are a close match to those of commercial database engines, and delivers the enterprise-grade performance, durability, and high availability required by most enterprise database workloads. AWS Database Migration Service (AWS DMS) can help accelerate database migrations to Aurora with managed features like DMS Schema Conversion and DMS Serverless. DMS Schema Conversion will automatically assess and convert schemas and source objects to be compatible with the target Aurora cluster. Meanwhile, DMS Serverless automates provisioning, monitoring, and scaling of migration resources.

Babelfish for Aurora PostgreSQL is a new capability for Aurora PostgreSQL-Compatible Edition that enables Aurora to understand commands from applications written for Microsoft SQL Server. With Babelfish, Aurora PostgreSQL now understands T-SQL, Microsoft SQL Server's proprietary SQL dialect, and supports the same communications protocol, so your apps that were originally written for SQL Server can now work with Aurora with fewer code changes. As a result, the effort required to modify and move applications running on SQL Server 2005 or newer to Aurora is reduced, leading to faster, lower-risk, and more cost-effective migrations. Babelfish is a built-in capability of Aurora, and does not have an additional cost. You can enable Babelfish on your Aurora cluster in only a few steps in the RDS console.

Developer productivity

Trusted Language Extensions (TLE) for PostgreSQL is a development kit and open-source project that allows you to quickly build high performance extensions and safely run them on Amazon Aurora without needing AWS to certify code. Developers can use popular trusted languages—like JavaScript, PL/pgSQL, Perl, and SQL—to safely write extensions. TLE is designed to prevent access to unsafe resources and limits extension defects to a single database connection. DBAs have fine-grained, online control over who can install extensions and can create a permissions model for running them. TLE is available to Aurora customers at no additional cost.

Aurora offers machine learning capabilities directly from the database, enabling you to add ML-based predictions to your applications through the familiar SQL programming language. With a simple, optimized, and secure integration between Aurora and AWS machine learning services, you have access to a wide selection of ML algorithms without having to build custom integrations or move data around. Learn more about Aurora machine learning.

Aurora works in conjunction with Amazon RDS Proxy, a fully managed, highly available database proxy that makes applications more scalable, more resilient to database failures, and more secure. RDS Proxy allows applications to pool and share connections established with the database, improving database efficiency and application scalability. It reduces failover times by automatically connecting to a new database instance while preserving application connections. It enhances security through integrations with AWS IAM and AWS Secrets Manager.

Data API is an easy-to-use, secure HTTPS API for executing SQL queries against Aurora databases that accelerates modern application development. Data API eliminates the network and application configuration tasks needed to securely connect to an Aurora database, which makes accessing Aurora as simple as making an API call. Data API eliminates the use of database drivers and client-side connection pooling software. It also improves application scalability by automatically pooling and sharing database connections. Data API enhances security through integrations with AWS IAM and AWS Secrets Manager.

Developers can call Data API via applications built with an AWS SDK. Data API also provides access to Aurora databases for AWS AppSync GraphQL APIs.