Using the Heimdall Proxy to Split Reads and Writes for Amazon Aurora and Amazon RDS
By Erik Brandsberg, CTO at Heimdall Data
By Jatin Singh, Partner Solutions Architect at AWS
Horizontally scaling your SQL database involves separating the write-master from read-only servers.
This allows the write server to perform dedicated write operations rather than processing redundant read queries. However, writing to one node and reading from another can result in inconsistent data due to synchronization delays.
Heimdall Data offers a database proxy to help developers, database administrators, and architects achieve optimal scale from their Amazon Relational Database Service (Amazon RDS) and Amazon Aurora environment without any application changes.
With the Heimdall proxy, customers will save months of development and maintenance at the data access layer.
In this post, we will cover how to route queries to read and write instances (aka read/write split), and how to automate query caching. Heimdall Data is an AWS Partner Network (APN) Advanced Technology Partner with the AWS Data & Analytics Competency.
The Heimdall proxy is a transparent data access layer that intelligently routes queries to the most optimal data source, resulting in SQL offload and improved response times.
The Heimdall proxy leverages Amazon ElastiCache for Redis to cache SQL results and track SQL queries so they are routed to the appropriate database node for fresh data.
The proxy provides the routing and caching logic, and Amazon ElastiCache provides the storage medium. This joint solution is particularly useful for Postgres, MySQL, and SQL Server users.
Let’s walk through the read/write split and query caching features.
Without any application code changes, Heimdall routes queries to write servers and appropriate read replicas to maximize the scale and performance of Amazon RDS. This is often called read/write split, or splitting.
One of the challenges of splitting queries, however, is data synchronization. There’s a time lag between when data was written and when the read replicas are updated.
Figure 1 – Heimdall distributed deployment.
There are a variety of tools on the market that claim to support read/write splitting, but they are not replication-lag aware. Other solutions may track lag but are not intelligent enough to determine when it’s safe to route a particular query to a replica. Hence, they do not offer a complete solution.
The Heimdall proxy tracks the last time each table was written to. It also calculates how long it takes for a write to propagate to the read instances. With both pieces of information, the proxy decides which node is “safe” to read from for a given query: the read-only nodes or the read/write node.
AWS Installation and Setup
Download the Heimdall proxy from AWS Marketplace. The installation will include both the proxy and central console.
Figure 2 – AWS configuration wizard.
On the central console, the AWS configuration wizard takes you step-by-step to successfully connect the Heimdall proxy to your AWS services (application, database, cache) and configure features (read/write splits, caching, load balancing).
Read/Write Splitting Configuration
After completing the wizard, the configurations should be pre-populated. The Data Sources tab should look like Figure 3, with one read/write instance configured along with at least one read-only instance.
Figure 3 – Data Sources tab on the central console.
In Figure 3, the replication lag window is set to 1000ms. This value is added to the replication lag detected from the cluster (2ms, as shown in Figure 4 below) to determine the total effective replication lag, or 1002ms in this case.
If this combined value is higher than the time since a table was last written to, the primary write node is used for reads of that table. Otherwise, it is deemed safe to read from a read-only replica.
Figure 4 – Status tab on the central console.
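The decision rule described above can be sketched in a few lines of Python. This is only an illustrative model, not Heimdall's implementation: the class name `LagAwareRouter`, its method names, and the "reader"/"writer" labels are assumptions made for the example, and the timestamps are passed in by hand so the arithmetic is easy to follow.

```python
import time
from dataclasses import dataclass, field


def _now(now_ms):
    # Use the caller's timestamp if given, else a monotonic clock in ms.
    return time.monotonic() * 1000.0 if now_ms is None else now_ms


@dataclass
class LagAwareRouter:
    """Toy model of lag-aware read routing (hypothetical, for illustration)."""
    lag_window_ms: float = 1000.0    # configured replication lag window
    detected_lag_ms: float = 2.0     # lag detected from the cluster
    last_write_ms: dict = field(default_factory=dict)  # table -> last write time

    def record_write(self, table, now_ms=None):
        # Remember when this table was last written to.
        self.last_write_ms[table] = _now(now_ms)

    def effective_lag_ms(self):
        # Total effective lag = configured window + detected lag.
        return self.lag_window_ms + self.detected_lag_ms

    def node_for_read(self, table, now_ms=None):
        last = self.last_write_ms.get(table)
        if last is None:
            return "reader"          # no write seen for this table
        since_write = _now(now_ms) - last
        # If the effective lag exceeds the time since the last write,
        # the replica may still be stale, so read from the write node.
        return "writer" if self.effective_lag_ms() > since_write else "reader"


router = LagAwareRouter()
router.record_write("orders", now_ms=0)
print(router.effective_lag_ms())                     # 1002.0
print(router.node_for_read("orders", now_ms=500))    # writer
print(router.node_for_read("orders", now_ms=1500))   # reader
```

With the values from Figures 3 and 4, a read of a table written 500ms ago goes to the write node, while a read 1500ms after the write is considered safe for a replica.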
Finally, to select which queries are eligible to be read from a read replica, ensure a reader eligible rule is configured in the Rules tab, as shown in Figure 5.
Figure 5 – Rules tab on the central console.
In cases where replication lag is not a concern, users can create a read/write split rule with the “lagIgnore” parameter so that particular queries are unconditionally read from the read replica. This can be useful for reporting users, who can then generate reports without impacting the primary write node.
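To make the effect of such a rule concrete, here is a minimal Python sketch of rule-based routing with a lag override. Only the name “lagIgnore” comes from the text above. The `ReaderRule` type, the regex-based matching, the `sales_report` table, and the `replica_is_fresh` flag are hypothetical and do not reflect Heimdall's actual rule syntax.

```python
import re
from dataclasses import dataclass


@dataclass
class ReaderRule:
    pattern: str              # hypothetical: regex matched against the SQL text
    lag_ignore: bool = False  # mirrors the "lagIgnore" parameter


def route_read(sql, rules, replica_is_fresh):
    """Return "reader" or "writer" for a query, using the first matching rule."""
    for rule in rules:
        if re.search(rule.pattern, sql, re.IGNORECASE):
            # lag_ignore skips the freshness check entirely.
            if rule.lag_ignore or replica_is_fresh:
                return "reader"
            return "writer"
    return "writer"  # no reader-eligible rule matched


rules = [
    # Reporting queries: always read from the replica, even if it lags.
    ReaderRule(r"^\s*select\b.*\bfrom\s+sales_report\b", lag_ignore=True),
    # All other SELECTs: replica only when it is known to be fresh.
    ReaderRule(r"^\s*select\b"),
]

print(route_read("SELECT * FROM sales_report", rules, replica_is_fresh=False))
print(route_read("SELECT * FROM orders", rules, replica_is_fresh=False))
```

In this sketch the reporting query is sent to the replica regardless of lag, while the ordinary query falls back to the write node whenever the replica might be stale.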
Real-Time Analytics Charts
On the Analytics tab of the central console, the dashboard below shows that read/write splitting is now occurring, re-routing read traffic from the write master to the read replicas.
The majority of traffic is now routed to the read replicas. It works!
Figure 6 – Dashboard tab on the central console.
Additional validation that read/write splitting is working can be seen in the log tab in Figure 7, where the actual source of each query can be found. In this case, the reads are directed to the node “Magento-Demo-source-Reader,” while the update is performed on “Magento-Demo-source-Master.”
Figure 7 – Log tab on the central console.
Automated Query Caching
The Heimdall proxy provides the query caching logic to improve database scale by offloading SQL traffic. In this scenario, Amazon ElastiCache serves as a look-aside cache for SQL results.
The proxy intelligently determines which queries to cache and automatically invalidates cached results when the database is updated. Alternatively, users can create policies to include or exclude specific queries from caching.
Figure 8 – Heimdall Data auto caching for Amazon ElastiCache architecture.
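The look-aside pattern with automatic invalidation can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the proxy's actual logic: a plain in-process dictionary stands in for Amazon ElastiCache, table names are extracted with a naive regular expression rather than a real SQL parser, and the `QueryCache` class and `run_query` callable are hypothetical names.

```python
import re


class QueryCache:
    """Look-aside SQL result cache with table-level invalidation (toy sketch)."""

    def __init__(self, run_query):
        self.run_query = run_query  # callable that executes SQL on the database
        self.results = {}           # SQL text -> cached rows
        self.by_table = {}          # table name -> set of cached SQL texts

    @staticmethod
    def tables_in(sql):
        # Naive table extraction; a real proxy would parse the SQL properly.
        names = re.findall(
            r"\b(?:from|join|into|update)\s+([A-Za-z_][A-Za-z0-9_]*)",
            sql, re.IGNORECASE)
        return {name.lower() for name in names}

    def execute(self, sql):
        if sql.lstrip().lower().startswith("select"):
            if sql in self.results:
                return self.results[sql]      # cache hit: database not touched
            rows = self.run_query(sql)        # cache miss: query the database
            self.results[sql] = rows
            for table in self.tables_in(sql):
                self.by_table.setdefault(table, set()).add(sql)
            return rows
        # Write: run it, then drop every cached result that reads these tables.
        rows = self.run_query(sql)
        for table in self.tables_in(sql):
            for cached_sql in self.by_table.pop(table, set()):
                self.results.pop(cached_sql, None)
        return rows
```

A repeated SELECT is served from the cache until a write touches one of its tables, at which point the next SELECT goes back to the database. Invalidating per table is coarse but keeps the sketch simple.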
The Heimdall query caching solution is flexible in:
- Deployment: Caching logic is deployed either at the client side (installed on each application instance) or as a separate Amazon EC2 proxy tier between the application and database.
- Your choice of cache store: Users can choose either the local heap, data grid (e.g. Amazon ElastiCache), or a combination of both. Through the central console, point the proxy to the cache store, and the proxy will cache and invalidate SQL results for optimal SQL offload.
The best part of this proxy caching solution is its transparency, not requiring any code changes. Caching and invalidation are automated.
For more information on how to configure Heimdall for ElastiCache caching, check out this post by Heimdall on the AWS Database Blog: Automated Query Caching for Amazon RDS, Aurora, and Redshift.
To take full advantage of the performance and scalability of Amazon RDS and Aurora, application owners must properly interface with these databases. This often requires code changes. The Heimdall database proxy supports read/write splitting and query caching, allowing you to get the best out of Amazon RDS and Aurora.
Questis, a financial services company, was experiencing database scaling issues. To meet production timelines, modifying their application was not an option. They chose the Heimdall proxy to intelligently offload their SQL traffic, deploying both read/write splitting and query caching. You can read about their customer success story in this APN Blog post.
Deployment of Heimdall requires zero application changes, saving months of deployment and maintenance. To get started, you can download a free trial on AWS Marketplace, or contact Heimdall Data at firstname.lastname@example.org.
You can also check out these additional resources to learn more about Heimdall:
- (Blog) Advanced Connection Pooling with the Heimdall Proxy
- (Blog) Automated Query Caching for Amazon RDS, Aurora, and Redshift
- (Blog) Heimdall Proxy for Amazon Redshift Overview
- Heimdall Data website
The content and opinions in this blog are those of the third party author and AWS is not responsible for the content or accuracy of this post.
Heimdall Data – AWS Partner Spotlight
Heimdall Data is an AWS Competency Partner and SQL database proxy for Amazon Redshift, Amazon RDS, and Amazon Aurora. It is transparently deployed to improve your read/write queries without any code changes.