AWS Database Blog
Optimize cost and boost performance of RDS for MySQL using Amazon ElastiCache for Redis
Customers often face the challenge of optimizing the cost of their database environments while improving application performance and response times as their data volumes and user base grow. Internet-scale applications with large data volumes and high throughput need underlying data architectures that can support microsecond latencies. Faster response times let our customers better serve their external and internal users, and in-memory caching of query results boosts application performance while letting customers grow their business and expand their market footprint cost-effectively.
Adding a distributed result-set cache to a database is a common way to accelerate application performance and reduce costs. Grab, Wiz, and DBS Bank are among the many customers that use Amazon ElastiCache for Redis alongside their primary databases to cost-effectively support their real-time application performance needs. ElastiCache is a fully managed, Redis-compatible service delivering real-time, cost-optimized performance for modern applications. It scales to hundreds of millions of operations per second with microsecond response times, and offers enterprise-grade security and reliability. Customers use ElastiCache to accelerate application and database performance, or as a primary datastore for use cases that don't require data durability, such as session stores, gaming leaderboards, streaming, data analytics, and feature stores for serving ML inference. In this post, we explain how you can optimize your relational database costs with in-memory caching using Amazon ElastiCache. The data presented here is based on benchmarking tests we ran on Amazon Relational Database Service (Amazon RDS) for MySQL version 8.0.28 on instance type db.r6g.xlarge, with the query results cached in Amazon ElastiCache.
Amazon ElastiCache for Redis
Amazon ElastiCache is the native purpose-built caching service offered by AWS. ElastiCache complements primary databases, optimizes overall performance at a fraction of the cost, and allows for fast scaling.
ElastiCache is a fully managed AWS service that is:
- Extremely fast – as an in-memory cache with sub-millisecond response time.
- Scalable both vertically and horizontally, without disruption to meet workload needs.
- Fully managed, with AWS handling hardware and software maintenance, including minor engine version upgrades. It is highly available with Multi-AZ support, including automatic failover and automatic instance recovery in the event of a service disruption, offers auto scaling for variable workloads, and supports data tiering and reserved nodes.
- Compatible with open source Redis.
Cost optimization of Amazon RDS for MySQL with Amazon ElastiCache
When you implement ElastiCache as a caching layer in front of a relational database such as Oracle, SQL Server, or Amazon RDS for MySQL, you can improve the performance of your applications and reduce costs. You can save up to 55% in cost and gain up to 80x faster read performance using ElastiCache with RDS for MySQL, compared to RDS for MySQL alone.
ElastiCache can cache your result sets and offload database IOPS, lowering costs and improving the performance of both the database and the application. Adding a caching layer on top of your primary database is more cost-effective than scaling up database instance capacity; a single ElastiCache node can process over 250,000 requests per second. Read-heavy workloads in which common database queries return the same result sets benefit greatly from caching query results. Not all database workloads benefit from adding a caching service: write-heavy databases, where most transactions are inserts or updates, are not good candidates, and neither are applications that depend on database-level processing (such as stored procedures and cascading updates with triggers). ElastiCache benefits applications that:
- Need to handle massive volumes of throughput,
- Are spiky (increased peak traffic in short intervals),
- Process and consolidate large volumes of data in memory and in real time before the database updates, and
- Need to support instantaneous user response times.
ElastiCache + primary data sources
You can use ElastiCache with relational databases and self-managed engines such as MySQL, Oracle, PostgreSQL, SQL Server; NoSQL databases such as Amazon DynamoDB or Amazon DocumentDB (with MongoDB compatibility); with Amazon Simple Storage Service (Amazon S3); or no database tier at all, which is common for distributed computing applications.
Reduce read replica footprint and save costs with ElastiCache
In the following illustration, we replace the RDS for MySQL read replicas and provide the read capacity from an ElastiCache cluster. Adding a fully distributed ElastiCache cluster is less expensive than adding read replicas, and it provides comparable read capacity with better performance. The cache provides dedicated memory, network, and CPU resources that deliver significantly lower latency and much higher throughput. Keep in mind that we're not replicating 100% of the database data in the cache; only the results of queries need to be cached, which eliminates the need to fully replicate the database.
Caching – application implementation strategies
You can implement the following strategies in your application to cache the data.
Lazy load caching
The lazy loading strategy, also called lazy population or cache-aside, is the most prevalent caching strategy used by customers. The basic idea is to populate the cache only when an object is actually requested by the application; a minimal code sketch follows the steps below. The overall application flow goes like this:
- Your app receives a query for data, for example the top 10 most recent news stories.
- Your app checks the cache to see if the object is in cache. If so (a cache hit), the cached object is returned, and the call flow ends.
- If not (a cache miss), then the database is queried for the object.
- The cache is populated, and the object is returned.
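The following is a minimal sketch of lazy loading in Python using the redis-py and PyMySQL client libraries. The endpoints, credentials, table schema, key naming scheme, and TTL value are illustrative assumptions, not values from the benchmark described later in this post.

```python
import json

import pymysql
import redis

# Illustrative endpoints and credentials; replace with your own.
cache = redis.Redis(host="my-cache.xxxxxx.use1.cache.amazonaws.com",
                    port=6379, decode_responses=True)
db = pymysql.connect(host="my-db.xxxxxx.us-east-1.rds.amazonaws.com",
                     user="app_user", password="app_password", database="newsdb",
                     cursorclass=pymysql.cursors.DictCursor)

CACHE_TTL_SECONDS = 300  # assumed TTL; tune to your staleness tolerance

def get_top_stories(limit=10):
    """Return the most recent stories, serving from the cache when possible."""
    key = f"top-stories:{limit}"          # hypothetical key naming scheme
    cached = cache.get(key)
    if cached is not None:                # cache hit: no round trip to MySQL
        return json.loads(cached)

    # Cache miss: query the database, then populate the cache for later reads.
    with db.cursor() as cur:
        cur.execute("SELECT id, title, published_at FROM stories "
                    "ORDER BY published_at DESC LIMIT %s", (limit,))
        rows = cur.fetchall()
    cache.setex(key, CACHE_TTL_SECONDS, json.dumps(rows, default=str))
    return rows
```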
Write-through caching
For workloads that need consistency, the write-through strategy can be used to refresh the cache when the data in the source datastore is mutated. Write-through either requires a mechanism to detect changes and update the cache on source datastore updates, or needs a dual-write process that writes to both the cache and the source when data changes. Client applications implementing write-through update the database and, on a successful write, read the data back from the database and synchronously store the new query results in the cache. Customers commonly combine write-through with a Time to Live (TTL) expiration in their client application to keep their cached data fresh.
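Continuing with the hypothetical clients and schema from the previous sketch, a minimal write-through path updates MySQL first and, only after the write succeeds, re-reads the affected result set and stores it in the cache synchronously, again with a TTL as a safety net.

```python
def publish_story(title, body, limit=10):
    """Write-through: update the database, then refresh the cached result set."""
    with db.cursor() as cur:
        cur.execute("INSERT INTO stories (title, body, published_at) "
                    "VALUES (%s, %s, NOW())", (title, body))
    db.commit()

    # On a successful write, read the new result set back and cache it synchronously.
    with db.cursor() as cur:
        cur.execute("SELECT id, title, published_at FROM stories "
                    "ORDER BY published_at DESC LIMIT %s", (limit,))
        rows = cur.fetchall()
    cache.set(f"top-stories:{limit}", json.dumps(rows, default=str),
              ex=CACHE_TTL_SECONDS)
```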
Applications invoke the ElastiCache caching layer via a Redis client API. The lazy loading strategy loads data into the cache on demand, but its primary purpose is to read data from the cache when it exists, avoiding trips to the primary data source. The write-through strategy keeps the cached data in sync with the database. Customers generally implement both: lazy loading to read from the cache when the data exists, and write-through to keep the cached data refreshed. For a detailed explanation, refer to the caching strategies documentation for ElastiCache for Redis.
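Putting the two hypothetical helpers above together, the read path stays on lazy loading while the write path uses write-through, with the TTL acting as a backstop if a cache refresh is ever missed:

```python
def handle_request(action, payload):
    # Read path: lazy loading serves repeated queries from ElastiCache.
    if action == "list_top_stories":
        return get_top_stories(limit=10)
    # Write path: write-through keeps the cached result set in sync with MySQL.
    if action == "publish_story":
        publish_story(payload["title"], payload["body"])
        return {"status": "ok"}
    raise ValueError(f"unknown action: {action}")
```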
RDS for MySQL + ElastiCache – better together example
80:20 read:write ratio, with only 80% of reads served from the cache
Now that we have looked at the performance advantages, cost benefits, and application implementation strategies for caching, let's review an example of cost savings and performance improvement. Say you need to achieve 30,000 queries per second (QPS) of read throughput. One approach is RDS for MySQL with read replicas; another is RDS plus ElastiCache. Using RDS alone without ElastiCache, we need four RDS read replicas to support the throughput requirement, which costs $1,740/month (on db.r6g.xlarge instances). Using ElastiCache with RDS, you can eliminate the read replicas and instead use one ElastiCache node plus one cache read replica node to achieve the same throughput; eliminating the RDS read replicas and implementing ElastiCache costs $780/month. ElastiCache with RDS therefore reduces the cost by 55% compared to RDS alone with read replicas. The following table gives the details of the nodes used and their cost.
RDS for MySQL configuration and database size:
- RDS data set: approximately 80 GB, with 300 GB of allocated storage
- RDS node configuration: MySQL version 8.0.28 on instance type db.r6g.xlarge
Example select statement used in the test:
| Metrics | RDS primary instance only | RDS with one read replica | RDS with 4 read replicas | ElastiCache with 1 read replica |
| --- | --- | --- | --- | --- |
| Avg response time | 200 ms | 80 ms | 80 ms | 1 ms |
| Avg read QPS | 8,000 QPS | 16,000 QPS | 30,000 QPS | 32,000 QPS |
| Pricing ($/month) | $348/month | $696/month | $1,740/month | $780/month |
| Nodes used | 1 read/write primary db.r6g.xlarge | 1 writer + 1 reader = 2x db.r6g.xlarge | 1 writer + 4 readers = 5x db.r6g.xlarge | 1x db.r6g.xlarge + 2x cache.m6g.xlarge |
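As a quick check against the table: eliminating the read replicas saves $1,740 − $780 = $960 per month, and $960 / $1,740 ≈ 55%, which is where the 55% savings figure comes from.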
Higher savings for high throughput – 100% cached data leads to more savings
When you implement caching, the higher the throughput requirements for your application, the higher the cost savings. In the following example, when the cache is fully warmed up (meaning all of the data that is read is served by ElastiCache), the read capacity increases to 250,000 QPS. Supporting this throughput with RDS read replicas alone costs roughly six times as much per month, so caching reduces the cost by more than 80%. By implementing caching for high-throughput, read-heavy applications, you can achieve significant cost savings.
| Metrics | 1 RDS + 9 read replicas | 1 ElastiCache + 1 EC read replica (on 1 RDS + 1 read replica) |
| --- | --- | --- |
| Avg response time | 80 ms | 9 ms |
| Avg QPS | 250,000 | 250,000 |
| Pricing | $7,840/month | $784 (RDS) + $432 (EC)/month |
| Nodes used | 10x db.r6g.xlarge | 2x db.r6g.xlarge + 2x cache.m6g.xlarge |
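From the table, the cached architecture costs $784 + $432 = $1,216 per month versus $7,840 per month for RDS with nine read replicas, a saving of $6,624 per month, or roughly 84%.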
Conclusion
Cost savings achieved by implementing ElastiCache alongside a primary relational database such as Amazon RDS are directly proportional to the read throughput your application needs: the higher the read throughput, the higher the cost savings. This is because as throughput needs increase, scaling a relational database becomes more expensive, whereas each ElastiCache node can support up to 400,000 queries per second. By adding ElastiCache to your application architecture, you can gain performance and reduce costs.
To get started with Amazon ElastiCache, refer to our Getting Started Guide, the Getting Started self-paced learning course, or our on-demand learning path. You can also find additional resources and learn more by visiting the Amazon ElastiCache product page. For more prescriptive guidance, reach out to your AWS account team to schedule a deep-dive session with an Amazon ElastiCache specialist.
You can also download sample code that implements the caching strategies explained in this post from our GitHub repository.
We have adapted the concepts from this post into a deployable solution, now available as Guidance for Optimizing Cost of Amazon RDS for MySQL in the AWS Solutions Library. To get started, review the architecture diagrams and the corresponding AWS Well-Architected framework, then deploy the sample code to implement the Guidance into your workloads.
About the authors
Sashi Varanasi is a Global Leader for Specialist Solutions Architecture, In-Memory and Blockchain Data Services. She has 25+ years of IT industry experience and has been with AWS since 2019. Prior to AWS, she worked in Product Engineering and Enterprise Architecture leadership roles at various companies, including Sabre Corp, Kemper Insurance, and Motorola.
Steven Hancz is an AWS Sr. Solutions Architect specializing in in-memory NoSQL databases, based in Tampa, FL. He has 25+ years of IT experience working with large clients in regulated industries, with a focus on high availability and performance tuning.
Roberto Luna Rojas is an AWS Sr. Solutions Architect specializing in in-memory NoSQL databases, based in NY. He works with customers from around the globe to get the most out of in-memory databases, one bit at a time. When not in front of the computer, he loves spending time with his family, listening to music, and watching movies, TV shows, and sports.