Using the Heimdall Proxy to Split Reads and Writes for Amazon Aurora and Amazon RDS
By Jatin Singh, Partner Solutions Architect – AWS
By Erik Brandsberg, CTO – Heimdall Data
Horizontally scaling your SQL database involves separating the write-master from read-only servers. This allows the write server to perform dedicated write operations rather than processing redundant read queries.
While replication is simple to configure on the database, application changes are required to route queries to the optimal database instance. Writing to one instance and reading from another can also result in inconsistent data due to synchronization delays.
Heimdall Data offers a database proxy to help developers, database administrators, and architects solve these challenges for Amazon Relational Database Service (Amazon RDS) and Amazon Aurora without any application changes. Customers will save months of development and maintenance at the data access layer.
In this post, we will cover how to automate the routing of queries to read and write instances (aka read/write split) and how to cache queries while maintaining strong data consistency.
The Heimdall Proxy is a transparent data access platform that intelligently routes queries to the most optimal data source, resulting in SQL offload for improved scale.
Amazon ElastiCache for Redis is used to cache SQL results and track SQL queries so they are routed to the appropriate database node for fresh data. This joint solution is particularly useful for Postgres, MySQL, and SQL Server users.
Without application code changes, the Heimdall Proxy routes queries to write servers and read replicas to maximize the scale and performance of Amazon RDS.
Figure 1 – Read/write split deployment.
Let’s go through the benefits of using Heimdall’s read/write split features:
Strong Data Consistency
One of the challenges of splitting queries is the lag time between when data was written and when the read replicas are updated. There are solutions that claim read/write splitting, but they are not replication lag aware. They may track lag but are not intelligent enough to determine when it’s safe to route a particular query to the reader. Hence, they do not support ACID (atomicity, consistency, isolation, durability) compliance.
The Heimdall Proxy calculates the time it took for writes to update on the reader, at a table level. This allows the proxy to intelligently determine if the replica(s) has the most up-to-date data.
Efficient Load Balancing
When applications create many connections to the database at connection pool initialization, the domain name system (DNS) cache’s “stickiness” resolves to the same replica, leaving the other readers unused. This results in inefficient RDS scale and wasted expenses.
The Heimdall Proxy does not rely on DNS. It evenly distributes the load across readers and performs automated failover, optimizing the use of your RDS instances.
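To illustrate the idea of spreading reads evenly and skipping a failed reader, here is a minimal Python sketch. It assumes simple round-robin selection; the post does not state which balancing algorithm the proxy uses, and the `ReaderPool` class and the reader names are hypothetical.

```python
from itertools import count


class ReaderPool:
    """Hypothetical round-robin pool of read replicas with health tracking."""

    def __init__(self, readers):
        self.readers = list(readers)
        self.healthy = set(self.readers)
        self._counter = count()

    def mark_down(self, reader):
        # Stop sending traffic to a reader that failed a health check.
        self.healthy.discard(reader)

    def mark_up(self, reader):
        # Resume traffic once a known reader is healthy again.
        if reader in self.readers:
            self.healthy.add(reader)

    def next_reader(self):
        # Rotate over the healthy readers; None means "fall back to the writer".
        candidates = [r for r in self.readers if r in self.healthy]
        if not candidates:
            return None
        return candidates[next(self._counter) % len(candidates)]


pool = ReaderPool(["reader-1", "reader-2", "reader-3"])
print([pool.next_reader() for _ in range(6)])
pool.mark_down("reader-2")
print([pool.next_reader() for _ in range(2)])
```

Because the selection happens per request inside the pool rather than once at DNS resolution time, every healthy reader receives traffic regardless of how many connections the application opens at startup.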
Intelligent Routing for Low Latency
The optimal RDS instance is selected based on the lowest latency to create local-like response times. This is particularly beneficial in a global database deployment. Check out Heimdall’s AWS blog post on the Amazon Aurora Global Database.
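As a rough sketch of latency-based selection, the Python function below picks the instance with the lowest median round-trip time from recent probes. The use of a median, the function name `pick_lowest_latency`, and the instance names are assumptions made for illustration; the post does not describe how the proxy measures or smooths latency.

```python
from statistics import median


def pick_lowest_latency(samples_ms):
    """Return the instance with the lowest median probe time.

    samples_ms maps an instance name to a list of recent round-trip
    times in milliseconds (hypothetical probe data).
    """
    scored = {name: median(vals) for name, vals in samples_ms.items() if vals}
    if not scored:
        return None
    return min(scored, key=scored.get)


probes = {
    "us-east-1-reader": [1.2, 1.5, 40.0],   # one slow outlier
    "eu-west-1-reader": [80.1, 79.5, 82.0],
}
print(pick_lowest_latency(probes))
```

Using the median rather than the latest sample keeps a single slow probe from moving traffic to a more distant region.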
Automated Database Reconfiguration
By monitoring the status of Amazon RDS instances via the RDS API, queries are routed to the appropriate instance even during cluster changes. Scenarios include:
- Database instances are promoted to a primary writer.
- Read replicas are added or removed from the cluster.
This is done automatically, without application changes.
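The sketch below shows one way a client could derive writer and reader roles from the RDS API for an Aurora cluster. It parses a dictionary shaped like the response of the `DescribeDBClusters` call, using its `DBClusterMembers`, `DBInstanceIdentifier`, and `IsClusterWriter` fields. Whether the Heimdall Proxy uses this particular call is an assumption, and the instance names are made up.

```python
def split_cluster_members(response):
    """Return (writer_id, sorted reader ids) from a DescribeDBClusters-shaped dict."""
    writer = None
    readers = []
    for cluster in response.get("DBClusters", []):
        for member in cluster.get("DBClusterMembers", []):
            name = member["DBInstanceIdentifier"]
            if member.get("IsClusterWriter"):
                writer = name
            else:
                readers.append(name)
    return writer, sorted(readers)


# In a live setup the dict would come from boto3, for example:
#   boto3.client("rds").describe_db_clusters(DBClusterIdentifier="my-cluster")
# The sample data below is hypothetical.
before_failover = {"DBClusters": [{"DBClusterMembers": [
    {"DBInstanceIdentifier": "demo-1", "IsClusterWriter": True},
    {"DBInstanceIdentifier": "demo-2", "IsClusterWriter": False},
    {"DBInstanceIdentifier": "demo-3", "IsClusterWriter": False},
]}]}
after_failover = {"DBClusters": [{"DBClusterMembers": [
    {"DBInstanceIdentifier": "demo-1", "IsClusterWriter": False},
    {"DBInstanceIdentifier": "demo-2", "IsClusterWriter": True},
]}]}

print(split_cluster_members(before_failover))
print(split_cluster_members(after_failover))
```

Polling this view periodically and comparing it with the previous result is enough to detect both a promotion and the addition or removal of a read replica.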
AWS Installation and Setup
Figure 2 – AWS configuration wizard.
On the Heimdall Central Console, the AWS Configuration Wizard takes you step-by-step to connect the proxy to your AWS services (application, RDS, ElastiCache) and configure features (read/write split, caching, load balancing).
Read/Write Splitting Configuration
After completing the wizard, the configurations will pre-populate. The Data Sources tab will appear like below, with one read/write instance configured along with at least one read-only instance.
Figure 3 – Data Sources tab on the central console.
The proxy-calculated replication lag was 2ms (Figure 4 below). In Figure 3, the user configured an additional lag time of 10000ms to ensure data consistency; this setting is optional. Hence, the total effective replication lag was 2ms + 10000ms = 10002ms.
If a table was last written to within the total effective replication lag window (here, the last 10002ms), the primary writer instance would be used for reads of that table. Otherwise, the proxy would access the read-only replica, as it would be safe to access.
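The routing decision described above can be sketched in a few lines of Python. This is an illustration of the logic only, not Heimdall's implementation: the `LagAwareRouter` class, its method names, and the table names are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class LagAwareRouter:
    """Tracks the last write time per table and picks a node for reads."""
    measured_lag_ms: float            # lag calculated by the proxy (2 ms in Figure 4)
    extra_lag_ms: float = 0.0         # optional user padding (10000 ms in Figure 3)
    last_write_ms: dict = field(default_factory=dict)

    @property
    def effective_lag_ms(self):
        return self.measured_lag_ms + self.extra_lag_ms

    def record_write(self, table, now_ms):
        self.last_write_ms[table] = now_ms

    def node_for_read(self, tables, now_ms):
        # Use the writer if any referenced table was written to too recently
        # for the replica to be trusted; otherwise the reader is safe.
        for table in tables:
            last = self.last_write_ms.get(table)
            if last is not None and now_ms - last < self.effective_lag_ms:
                return "writer"
        return "reader"


router = LagAwareRouter(measured_lag_ms=2, extra_lag_ms=10000)
router.record_write("orders", now_ms=0)
print(router.effective_lag_ms)                            # 10002
print(router.node_for_read(["orders"], now_ms=5000))      # writer
print(router.node_for_read(["orders"], now_ms=10002))     # reader
print(router.node_for_read(["customers"], now_ms=5000))   # reader
```

Tracking writes per table means that a write to one table does not force reads of unrelated tables back to the writer.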
Figure 4 – Status tab on the Heimdall Central Console.
Finally, in order to select what queries should be eligible for reading from a read replica, ensure a reader eligible rule is configured in the Rules tab, as shown in Figure 5.
Figure 5 – Rules tab on the central console.
In cases where the replication lag is not a concern, users can create a read/write split rule for particular queries that should unconditionally be read from the read replica using the “lagIgnore” parameter. Cases where this may be useful are for reporting users, so they can generate reports without impacting the primary write node.
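To make the interaction between a reader eligible rule and the “lagIgnore” parameter concrete, here is a small Python sketch. The `Rule` structure, the regular expressions, and the `sales_report` table are hypothetical and do not reflect Heimdall's actual rule syntax; only the behavior (eligible queries go to the reader when it is fresh, or always when lag is ignored) follows the description above.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: str                   # regex matched against the SQL text
    reader_eligible: bool = False
    lag_ignore: bool = False       # mirrors the "lagIgnore" idea from the text


def route(sql, rules, replica_is_fresh):
    """Return 'reader' or 'writer' for a query; the first matching rule wins."""
    for rule in rules:
        if re.search(rule.pattern, sql, re.IGNORECASE):
            if rule.reader_eligible and (rule.lag_ignore or replica_is_fresh):
                return "reader"
            return "writer"
    return "writer"  # anything unmatched (for example, DML) stays on the writer


rules = [
    # Reporting queries: always read from the replica, even if it lags.
    Rule(r"^\s*SELECT\b.*\bFROM\s+sales_report\b", reader_eligible=True, lag_ignore=True),
    # All other SELECTs: read from the replica only when it is up to date.
    Rule(r"^\s*SELECT\b", reader_eligible=True),
]

print(route("SELECT * FROM sales_report", rules, replica_is_fresh=False))  # reader
print(route("SELECT * FROM orders", rules, replica_is_fresh=False))        # writer
print(route("SELECT * FROM orders", rules, replica_is_fresh=True))         # reader
print(route("UPDATE orders SET qty = 1", rules, replica_is_fresh=True))    # writer
```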
Real-Time Analytics Charts
On the Analytics tab of the Heimdall Central Console, the dashboard below shows that read/write splitting is now occurring, re-routing read traffic from the write master to the read replicas. The majority of traffic is now routed to the read replicas. It works!
Figure 6 – Dashboard tab on the central console.
Additional validation that read/write splitting is working can be seen in the log tab in Figure 7, where the actual source of each query can be found. In this case, the reads are directed to the node “Magento-Demo-source-Reader,” while the update is performed on “Magento-Demo-source-Master.”
Figure 7 – Log tab on the Heimdall Central Console.
Automated Query Caching
The Heimdall Proxy provides the query caching logic to improve database scale by offloading SQL traffic. In this scenario, Amazon ElastiCache is a look-aside, SQL results cache.
The proxy intelligently determines which queries to cache and automatically invalidates when there’s an update to the database. Alternatively, users can include or exclude which queries to cache by simply creating policies.
Figure 8 – Heimdall Data auto caching for Amazon ElastiCache architecture.
The Heimdall query caching solution allows the user to choose cache store: 1) Local heap, or 2) Data grid (for example, Amazon ElastiCache for Redis). Just point the proxy to the designated cache store, and the proxy will cache and invalidate SQL results for optimal SQL offload.
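The look-aside pattern with automatic invalidation can be sketched as follows. This is a simplified illustration under stated assumptions: an in-memory dictionary stands in for the cache store (local heap or Amazon ElastiCache for Redis), the caller supplies the tables each statement touches instead of the proxy parsing the SQL, and the `LookAsideCache` class is hypothetical.

```python
class LookAsideCache:
    """Caches query results and drops them when a related table is written."""

    def __init__(self, run_query):
        self._run_query = run_query   # callable that executes SQL on the database
        self._results = {}            # SQL text -> cached rows
        self._by_table = {}           # table name -> set of cached SQL texts

    def read(self, sql, tables):
        if sql in self._results:      # cache hit: the database is not touched
            return self._results[sql]
        rows = self._run_query(sql)   # cache miss: query, then store the result
        self._results[sql] = rows
        for table in tables:
            self._by_table.setdefault(table, set()).add(sql)
        return rows

    def write(self, sql, tables):
        self._run_query(sql)
        # Invalidate every cached result that depends on a written table.
        for table in tables:
            for cached_sql in self._by_table.pop(table, set()):
                self._results.pop(cached_sql, None)


calls = []

def fake_db(sql):
    # Stand-in for a real database connection; records each executed statement.
    calls.append(sql)
    return [("row", len(calls))]

cache = LookAsideCache(fake_db)
cache.read("SELECT * FROM items", ["items"])
cache.read("SELECT * FROM items", ["items"])          # served from cache
cache.write("UPDATE items SET price = 1", ["items"])  # invalidates the entry
cache.read("SELECT * FROM items", ["items"])          # goes to the database again
print(len(calls))  # 3
```

The second read never reaches the database, which is the SQL offload the text refers to, while the write ensures the next read returns fresh data.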
The best part of this proxy caching solution is its transparency, not requiring any code changes. Caching and invalidation are automated.
For more information on how to configure Heimdall for ElastiCache caching, check out the AWS blog post on Automated Query Caching for Amazon RDS, Aurora, and Redshift.
To take full advantage of the performance and scalability of Amazon RDS, application owners must properly interface with the database. This often requires code changes. The Heimdall Database Proxy supports read/write splitting, query caching, and advanced connection pooling so you get the best out of RDS.
Check out a customer success story on the AWS blog about how Heimdall’s Database Proxy Improves Website Response Times with No Code Changes.
Deployment of Heimdall requires zero application changes, saving months of deployment and maintenance. To get started, you can download a free trial on AWS Marketplace, or contact Heimdall Data at email@example.com.
- Amazon Aurora Global Database for Low Latency
- Advanced Connection Pooling with the Heimdall Proxy
- Automated Query Caching for Amazon RDS, Aurora, and Redshift
- Heimdall Proxy for Amazon Redshift Overview
- Heimdall Proxy Technical Documentation
Heimdall Data – AWS Partner Spotlight
Heimdall Data is an AWS Partner offering a database proxy for Amazon RDS and Amazon Redshift. It is transparently deployed to improve backend performance and scale. No application changes are required.