AWS for Industries

Amazon Aurora for Core Banking Systems

A Core Banking system is, as the name suggests, the very core of a bank. It is the back-end system that processes banking transactions, posts updates to accounts and other financial records and often holds critical customer information. It typically includes deposit, loan and credit processing capabilities, with interfaces to general ledger and reporting systems. Because banks have the responsibility of safeguarding their customers’ assets, core banking systems must be engineered with utmost accuracy, reliability and data integrity.

In the past, these goals have been achieved with combinations of complex infrastructure and software, leading to a high cost of ownership for banks. More recently, banks and other financial institutions have been looking to reduce these costs wherever possible without compromising on availability, resilience and performance. One of the key contributors to the cost of core banking systems has been the commercial database engines that many of them depend on, because Open Source alternatives were not considered to have the capabilities required to replace them.

However, in the 10 years since Amazon Aurora was released, we’ve seen a fundamental shift with AWS customers in Financial Services, with customers publicly sharing their experiences running critical systems on Amazon Aurora. This includes some of the most heavily regulated entities like FINRA, Dow Jones, Capital One, Standard Chartered Bank, Fannie Mae and Goldman Sachs. Amazon Aurora is also the database engine of choice for many core banking software vendors, including 10x Banking and Thought Machine.

Financial services customers using Amazon Aurora are finding the most benefit from the combination of the PostgreSQL community alongside Amazon’s own Aurora enhancements. PostgreSQL is a feature-rich open source relational database backed by more than 35 years of development, with origins in the POSTGRES project that began at UC Berkeley in the mid-1980s. This has led to a mature and structured development model for the core database engine, with predictable release cycles, an essential feature for enterprise customers. Additionally, AWS’s contributions help align Aurora’s development with the broader PostgreSQL community’s best practices and standards, supporting compatibility and stability.

In this blog post, we will dive into the key reasons why these large financial services customers and vendors choose Amazon Aurora for their most critical workloads:

1. Resiliency and Availability
2. Performance and Scalability
3. Operational Benefits

By the end of this article, you’ll understand how regulated customers in the financial services industry are benefiting from Amazon Aurora’s combination of the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source.

1. Resiliency and Availability

Distributed storage optimizes resiliency

Any unplanned service interruption of a bank’s core banking systems is catastrophic. Even brief periods of unavailability can have severe consequences, disrupting the bank’s ability to serve customers. Prolonged outages can erode customer trust and incur financial penalties from regulators.

Aurora greatly improves database resilience thanks to its optimized storage architecture. Unlike traditional databases, Aurora utilizes log-structured distributed storage. Conceptually, the storage engine is a distributed Storage Area Network that spans multiple AWS Availability Zones (AZs) within an AWS Region.

Figure 1: Amazon Aurora’s distributed storage architecture

Data is stored in 10 GB logical blocks called “protection groups”. The data in each protection group is written simultaneously to six storage nodes across three Availability Zones (AZs), two in each AZ. A write is considered successful once at least four of the six storage nodes acknowledge receipt. This makes the storage highly fault-tolerant: Aurora transparently handles the loss of up to two copies of data without affecting write availability, and up to three copies without affecting read availability.
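The quorum arithmetic above can be illustrated with a minimal sketch. It is a simplified model rather than Aurora's actual implementation: the node layout and function name are hypothetical, and the read quorum of three is inferred from the statement that up to three copies can be lost without affecting reads.

```python
COPIES = 6        # copies of each protection group (from the text)
WRITE_QUORUM = 4  # acknowledgements needed for a successful write (from the text)
READ_QUORUM = 3   # assumption: implied by "up to three copies can be lost" for reads

# Hypothetical layout: two storage nodes in each of three Availability Zones.
NODES = [(az, n) for az in ("az1", "az2", "az3") for n in (1, 2)]


def availability(failed_nodes):
    """Return (writes_available, reads_available) for a set of failed nodes."""
    healthy = COPIES - len(set(failed_nodes))
    return healthy >= WRITE_QUORUM, healthy >= READ_QUORUM


if __name__ == "__main__":
    whole_az = [("az1", 1), ("az1", 2)]
    az_plus_one = whole_az + [("az2", 1)]
    print("no failures:       ", availability([]))
    print("one AZ lost:       ", availability(whole_az))
    print("one AZ + one node: ", availability(az_plus_one))
```

In this model, losing an entire AZ (two copies) leaves writes available, while losing an AZ plus one further node still leaves enough copies to serve reads.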

Aurora storage implements self-healing mechanisms to protect data integrity and availability. The storage layer continuously monitors and repairs nodes and disks, proactively identifying and resolving errors and performance hotspots. The 10 GB protection groups are the fundamental unit for repairs and rebalancing. This process is facilitated by a peer-to-peer gossip protocol, which allows storage nodes to coordinate repairs efficiently, drawing from other copies as needed. Additionally, Aurora’s storage nodes perform continuous backups to Amazon S3, providing an extra layer of data protection. Together, these self-healing features provide a resilient and reliable storage infrastructure, which is essential for maintaining the integrity and availability of core banking data.

The built-in redundancy and self-healing mechanisms provide exceptional resilience, making Aurora a compelling choice for running mission-critical core banking applications.

Amazon Aurora’s decoupled storage and compute improves recovery times

Aurora’s shared storage layer ensures a Recovery Point Objective (RPO) of zero across the Availability Zones of an AWS Region, but the decoupling of storage and compute has additional benefits. Database instances are available quickly after a restart or failover because they do not have to go through the traditional recovery process of replaying the transactions contained in the redo log. Aurora delegates recovery to the Aurora storage layer, which only has to determine the point at which the storage volume is transactionally consistent.

However, in an Aurora cluster with a single instance, time is still needed to provision a new instance on failure, resulting in a longer Recovery Time Objective (RTO). To reduce this to seconds, customers can deploy Aurora replicas in additional AWS Availability Zones; replicas are read-only instances that share the same underlying storage as the primary instance. If an issue is detected in the primary instance, Aurora automatically promotes a healthy replica to be the new primary. The failover process, including the CNAME record change, typically completes within 30 seconds. Additionally, Amazon RDS Proxy can be used to preserve application connections, making failovers transparent to applications and reducing failover time by up to 66%.

Historically, database performance has been degraded after a restart while data is loaded from slow disks into fast in-memory caches. With the Aurora “cluster cache management” feature, financial services customers can ensure that full database performance is available after a failover. This feature keeps the buffer cache synchronized between the primary instance and designated failover replicas, avoiding the performance degradation caused by cold caches.

Amazon Aurora Global Database enables cross-region Disaster Recovery

Due to the critical importance of core banking workloads, financial services customers, especially large and systemically important ones, often aspire to architect a disaster recovery solution spanning multiple AWS Regions. To do so, the underlying database must be replicated between Regions quickly and reliably. Historically, customers have faced both the financial cost imposed by commercial database engines, through additional licences for instances and replication features, and the operational cost of implementing and managing replication.

Amazon Aurora Global Database closes this gap by offering customers a managed service which simplifies the creation and management of cross-region disaster recovery for their databases. Global replication can be added on creation of new database clusters, or simply added to existing clusters, with a maximum of 5 secondary regions currently supported.

Figure 2: Amazon Aurora Global Database architecture

Aurora Global Database replicates data to secondary Regions with no performance impact on the database writer instance, because cross-Region replication is delegated to the Aurora storage layer. It enables fast local reads and supports disaster recovery from Region-wide outages. If your primary Region suffers a performance degradation or outage, you can promote one of the secondary Regions to take over read/write responsibilities. An Aurora cluster can fail over and become available in the DR Region in minutes, even in the event of a complete Regional outage, because the read replicas and storage are already in place. This gives your application an effective recovery point objective (RPO) of seconds and a recovery time objective (RTO) of minutes, a strong foundation for a global business continuity plan.
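As an illustration of how an operations team might track the effective RPO described above, here is a minimal sketch that compares observed cross-Region replication lag with an RPO target. It assumes lag samples in milliseconds (for example, taken from a CloudWatch metric such as AuroraGlobalDBReplicationLag); the function name, sample values and target are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LagCheck:
    worst_lag_seconds: float
    within_rpo: bool


def check_replication_lag(lag_samples_ms, rpo_target_seconds):
    """Compare the worst observed cross-Region lag with an RPO target."""
    if not lag_samples_ms:
        raise ValueError("at least one lag sample is required")
    worst = max(lag_samples_ms) / 1000.0  # milliseconds to seconds
    return LagCheck(worst_lag_seconds=worst, within_rpo=worst <= rpo_target_seconds)


if __name__ == "__main__":
    samples_ms = [180, 240, 950]  # hypothetical lag samples
    print(check_replication_lag(samples_ms, rpo_target_seconds=5))
```

A check like this only reports on replication lag; it says nothing about the time needed to promote a secondary Region, which is governed by the RTO.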

You can also choose different configuration options for your database clusters in your primary and secondary Regions to suit your availability and cost requirements. For example, the secondary Region can be configured with fewer, or smaller, replica instances than the primary and resized as part of your recovery procedures. To reduce cost even further, it is possible to run with no read replica instances in the secondary Region, or with an Amazon Aurora Serverless instance at minimum capacity. These choices may incur a higher RTO during a failover, as the database instances in the secondary Region will need to be provisioned or scaled up as workloads are transferred.

Standard Chartered Bank leverages Aurora Global Database for the hosting of its core banking system

These features make Aurora Global Database a key enabler of cross-Region disaster recovery strategies for banks. Standard Chartered Bank is one example.

In 2020, the bank started to re-architect its main core banking system using Aurora. Given their cross-region DR requirements of RPO of 15 minutes and RTO of up to 24 hours, Standard Chartered Bank decided a “pilot-light” strategy was the best approach. In a pilot-light model, the database is continuously replicated to the secondary region, but the rest of the services are down until required. When a disaster occurs, the remaining part of the infrastructure is deployed, optimising costs in the normal state. With the native replication features of Aurora, Standard Chartered’s Core Banking System is able to have a cross-region RPO of seconds whilst adding less than 10% additional infrastructure cost.

Figure 3: Standard Chartered’s Pilot Light cross-region architecture

After migrating its first 7 markets, Standard Chartered saw improved resilience and a throughput of 4,000 transactions per second (TPS), 10 times its previous throughput, combined with a significant cost reduction. As of today, Standard Chartered has migrated its core banking system across 26 global markets to AWS with Amazon Aurora PostgreSQL.

2. Performance and Scalability

With the shift towards a more digital economy, the adoption of real-time payment rails and the growth of digital payments such as cards, wallets and peer-to-peer payments, core banking systems increasingly need to handle a high volume of transactions with minimal latency, during both peak and regular hours. Aurora delivers this thanks to its innovative storage architecture and its scalability.

Optimized IO operations

Beyond resilience, Aurora’s storage architecture also delivers significant performance improvements for Core Banking systems. Its architecture decouples the database buffer cache from the storage volume, both managed by the Aurora service. This separation allows the database engine to send the log records of the changes to the storage service asynchronously, without having to write the full page to storage. This log-structured storage approach dramatically reduces I/O overhead compared to traditional databases, which write entire database pages.
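To give a rough sense of the difference, the following back-of-the-envelope sketch compares the bytes written when full pages reach storage with the bytes sent when only log records are shipped. It is a deliberately simplified model and not a benchmark: it assumes the default 8 KiB PostgreSQL page size, one flush per modified page for the full-page case, and a hypothetical 200-byte log record per change.

```python
PAGE_SIZE_BYTES = 8 * 1024  # default PostgreSQL page size
STORAGE_COPIES = 6          # copies per protection group (from the text)


def full_page_write_bytes(pages_modified, log_record_bytes):
    """Bytes written when both the log record and the full page reach storage."""
    return pages_modified * (log_record_bytes + PAGE_SIZE_BYTES)


def log_only_write_bytes(pages_modified, log_record_bytes, copies=STORAGE_COPIES):
    """Bytes sent when only log records are shipped, once per storage copy."""
    return pages_modified * log_record_bytes * copies


if __name__ == "__main__":
    pages, record = 1_000, 200  # hypothetical workload
    print("full pages:", full_page_write_bytes(pages, record), "bytes")
    print("log only:  ", log_only_write_bytes(pages, record), "bytes")
```

Under these assumptions, shipping log records to all six copies still moves fewer bytes than a single, unreplicated full-page write path. Real workloads will differ, because engines batch page flushes and log record sizes vary.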

As we explained earlier in this blog, Aurora uses a quorum model for writes, where a write operation is considered successful when at least four out of the six storage nodes acknowledge. This approach allows Aurora to maintain write performance whilst delivering higher data availability than traditional point-to-point replication, as Aurora does not need to wait for all acknowledgments to commit a transaction.

As a result, Aurora can achieve higher transaction throughput with low latency for both read and write operations, a key requirement for Core Banking systems.

Trust Bank achieves higher performance

Trust Bank, backed by Standard Chartered Bank and the FairPrice Group, was launched in September 2022 and became the world’s fastest-growing digital bank by market share in year one, onboarding more than 600,000 customers (12% of the Singapore market).

The Bank migrated its Thought Machine Core Banking System from RDS PostgreSQL to Aurora PostgreSQL in June 2023 in order to leverage both the resilience and performance features available in Amazon Aurora. After the migration, the Core Banking database delivered 10 times higher maximum IOPS. This I/O performance improvement and the quorum write model directly benefited latency-sensitive operations: for example, the maximum p99 latency for payment processing was reduced by a factor of 3.

Vertical scalability and read-replicas

Scaling your Aurora database vertically is straightforward thanks to the wide choice of supported instance families and sizes, from the smallest (2 vCPU, 4 GiB memory) to the largest (128 vCPU, 1024 GiB memory). You can minimize downtime during vertical scaling by creating an Aurora read replica with the required size and promoting it to be the new primary instance.

For read operations, horizontal scaling can be achieved using Aurora Replicas. Aurora spreads the load for read-only connections across as many Aurora Replicas as you have in the cluster. An Aurora DB cluster can contain up to 15 Aurora Replicas, further increasing read throughput and reducing read latencies.
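One common way for an application to benefit from this is to send read-only work to the cluster’s reader endpoint and everything else to the writer (cluster) endpoint. The sketch below shows only that routing decision; the class, function and hostnames are hypothetical placeholders, and connection handling is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterEndpoints:
    writer: str  # cluster endpoint, always points at the primary instance
    reader: str  # reader endpoint, spreads connections across Aurora Replicas


def endpoint_for(endpoints, read_only):
    """Pick the endpoint for a unit of work based on whether it only reads."""
    return endpoints.reader if read_only else endpoints.writer


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    endpoints = ClusterEndpoints(
        writer="corebank.cluster-example.eu-west-1.rds.amazonaws.com",
        reader="corebank.cluster-ro-example.eu-west-1.rds.amazonaws.com",
    )
    print("balance enquiry ->", endpoint_for(endpoints, read_only=True))
    print("post payment    ->", endpoint_for(endpoints, read_only=False))
```

Reads served by replicas can lag slightly behind the primary, so work that must observe its own most recent writes is usually kept on the writer endpoint.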

3. Operational Benefits – Built on Open Source with Enterprise Support

AWS is a major contributor to the PostgreSQL project

AWS has been part of the PostgreSQL Open Source community since version 14, and by version 16 AWS was involved in 19% of the project changes, making it the second-largest contributor to that release. AWS collaborates with the other contributors across many areas, including logical replication, performance improvements, and detecting and fixing bugs. This collaboration allows AWS to implement bug fixes, security patches and new features more rapidly and effectively, so that Aurora users benefit from the latest advancements in PostgreSQL.

Minimizing planned downtime and upgrade risks

Core Banking systems databases must stay up to date to minimize security risks. However, each upgrade can be an operational challenge and minimizing downtime of the database engine is critical.

Aurora offers zero-downtime patching (ZDP) for updates, which aims to keep the downtime and disruption caused by patches to a minimum through two key approaches: the patching process attempts to preserve application connection state to the database through the required restart, and it waits for a quiet period of activity before starting. Combined with an application that has robust retry logic, this can result in minor version upgrades and database engine patches completing in seconds, with only a temporary drop in throughput during quiet periods.
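What “robust retry logic” looks like depends on the application and its database driver. As a minimal, driver-agnostic sketch, the helper below retries an operation on transient connection errors using exponential backoff with jitter. The function name, default values and the choice of retryable exceptions are assumptions for illustration; a real application would map its driver’s own exception types.

```python
import random
import time


def call_with_retry(operation, retryable=(ConnectionError, TimeoutError),
                    max_attempts=5, base_delay=0.2, max_delay=5.0,
                    sleep=time.sleep):
    """Run `operation`, retrying transient connection errors with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff with full jitter, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_query():
        # Hypothetical operation that fails twice, as it might during a restart.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ConnectionError("connection dropped")
        return "ok"

    print(call_with_retry(flaky_query), "after", attempts["count"], "attempts")
```

Retries like this are only safe for idempotent operations or for transactions that are known not to have committed, which matters particularly for payment and posting logic.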

For major version upgrades, Aurora supports automatic Blue/Green Deployments, where an entirely new database environment (green) is created, synchronized with the production (blue) environment using logical replication, then updated before switching traffic from the old environment. This approach minimizes risk and downtime, as the new environment can be thoroughly tested before the final cutover is started.

Trust Bank is now using blue/green deployments to achieve near-zero-downtime major upgrades. Previously, the same upgrades required around 45 to 60 minutes of downtime.

PostgreSQL compatibility ensures portability

Financial Services customers try to minimize vendor lock-in, as it can be costly and limiting. They also often have regulatory requirements to maintain a long-term exit plan for moving their critical workloads out of their hosting environment. Aurora PostgreSQL maintains 100% application-level compatibility with open source PostgreSQL, which means that customers can leverage the optimized and fully managed Aurora database in AWS while being sure they can also make use of other PostgreSQL distributions, whether on-premises or with other Cloud providers. As a result of this compatibility, customers can architect combinations of PostgreSQL and Aurora that best suit their needs, opening up hybrid scenarios to meet a variety of deployment models and solve regulatory data residency challenges.

Significant cost reduction compared to commercial databases

Aurora eliminates the need for costly commercial database licenses, reducing overall database expenses by a factor of up to 10. Its pay-as-you-go pricing model also means that you only pay for the resources you actually use, avoiding the upfront capital expenditure and commitment associated with commercial databases. This model also allows for easy scaling, with a wide variety of instance types and sizes or by using Aurora Serverless. You can adjust capacity based on demand without incurring penalties or requiring long-term commitments.

Enterprise Support for critical workloads

For Financial Services Industry (FSI) customers, having Enterprise Support for their database is essential due to the critical nature of their operations, stringent regulatory requirements, and the need for robust security and reliability. AWS Enterprise Support provides 24/7 access to expert assistance, with 15-minute response times for critical incidents and proactive guidance on best practices. This level of support is vital for Core Banking systems, where data breaches and system failures can have severe financial and reputational consequences.

Conclusion

As financial institutions continue to modernize their mission-critical core systems, Amazon Aurora stands out as a uniquely capable database platform, combining the advantages of an ACID-compliant relational database with the cloud-native resilience, performance and cost effectiveness needed to power the very heart of modern banking operations. AWS customers including Goldman Sachs, Standard Chartered Bank and Trust Bank have all benefited from Aurora’s ability to deliver high throughput and responsiveness for their mission-critical workloads, establishing the foundations of the core banking systems of the future.

The unique resilience and availability features of Aurora make it an ideal fit for the stringent uptime and data protection requirements of core banking applications. Beyond resilience, Aurora’s superior performance characteristics ensure core banking systems can handle the intense transaction volumes and low-latency profile of modern banking.

Furthermore, Aurora’s PostgreSQL compatibility and integration with the open-source community provide customers the flexibility to leverage established technologies while benefiting from the operational advantages of a fully managed database service. The significant cost savings compared to traditional commercial database engines are an added bonus, allowing banks to reduce the high infrastructure costs historically associated with running core banking systems.

To continue learning, visit the Amazon Aurora Resources pages for documentation, tutorials and guides to getting started with Aurora. You can also find out how services like AWS Database Migration Service, with support for schema conversion and data migration from over 20 database engines, can simplify and automate the migration process.

If you have any more questions, please contact an AWS Representative to find out how we can help accelerate your Aurora journey.

Xavier Loup

Xavier Loup is a Principal Solutions Architect at AWS with over 20 years of experience in the Financial Services Industry. In recent years, Xavier has worked on several successful Core Banking Systems implementations on AWS, including both migrations from on-premises environments and new digital banks.

James Craig

James Craig is a Senior Partner Solutions Architect at AWS working with Financial Services Technology Partners. James works with these software providers, their customers and specialized systems integrators to enable them to make best use of AWS and innovate their solutions on the cloud. Prior to joining AWS, James held a range of software development roles for over 20 years across major capital markets institutions including Morgan Stanley and HSBC.