AWS Architecture Blog
Category: Customer Solutions
How CommBank made their CommSec trading platform highly available and operationally resilient
In this post, we explore how CommSec, Australia’s leading online broker, transitioned from a multicloud environment to AWS as their sole cloud provider while implementing Amazon Application Recovery Controller (ARC) zonal shift to maintain high availability and operational resilience. The consolidation resulted in significant benefits including 25% base capacity reduction, two times faster deployments, and improved failover capabilities through ARC zonal shift, enabling CommSec to continue serving millions of customers while meeting strict regulatory requirements.
How Karrot built a feature platform on AWS, Part 1: Motivation and feature serving
This two-part series shows how Karrot developed a new feature platform, which consists of three main components: feature serving, a stream ingestion pipeline, and a batch ingestion pipeline. This post starts by presenting our motivation, our requirements, and the solution architecture, focusing on feature serving.
How Karrot built a feature platform on AWS, Part 2: Feature ingestion
This two-part series shows how Karrot developed a new feature platform, which consists of three main components: feature serving, a stream ingestion pipeline, and a batch ingestion pipeline. This post covers the process of collecting features in real-time and batch ingestion into an online store, and the technical approaches for stable operation.
How Zapier runs isolated tasks on AWS Lambda and upgrades functions at scale
In this post, you’ll learn how Zapier has built their serverless architecture focusing on three key aspects: using Lambda functions to build isolated Zaps, operating over a hundred thousand Lambda functions through Zapier’s control plane infrastructure, and enhancing security posture while reducing maintenance efforts by introducing automated function upgrades and cleanup workflows into their platform architecture.
How HashiCorp made cross-Region switchover seamless with Amazon Application Recovery Controller
In this post, we discuss HashiCorp’s journey from manual, stress-inducing failover procedures to a streamlined, confident approach that fundamentally changed how they deliver on their enterprise-grade resilience promises.
How Scale to Win uses AWS WAF to block DDoS events
In this post, you’ll learn how Scale to Win configured their network topology and AWS WAF to protect against DDoS events that reached peaks of over 2 million requests per second during the 2024 US presidential election campaign season. The post details how they implemented comprehensive DDoS protection by segmenting human and machine traffic, using tiered rate limits with CAPTCHA, and preventing CAPTCHA token reuse through AWS WAF Bot Control.
Build a multi-Region AWS PrivateLink backed service with seamless failover
This post demonstrates how the Issuer Solutions business of Global Payments, as a service provider, implemented cross-Region failover for an AWS PrivateLink backed service exposed to their customers. Their solution enables failover to a secondary Region without customer coordination, reducing Recovery Time Objective (RTO).
How Stellantis streamlines floating license management with serverless orchestration on AWS
In this post, we explore a unique scenario where an ISV, unable to provide a floating license option for cloud usage, worked with Stellantis to develop an alternative solution. This approach, implemented with the ISV’s permission, treats named user licenses as if they were floating, automatically assigning and removing them based on the state of user workbench instances.
How Launchpad from Pega enables secure SaaS extensibility with AWS Lambda
In this post, we share how Pegasystems (Pega) built Launchpad, its new SaaS development platform, to solve a core challenge in multi-tenant environments: enabling secure customer customization. By running tenant code in isolated environments with AWS Lambda, Launchpad offers its customers a secure, scalable foundation, eliminating the need for bespoke code customizations.
Improving platform resilience at Cash App
Cash App, a leading peer-to-peer payments and digital wallet service from Block, Inc., has implemented resilience improvements across the entire technology stack. In this post, we discuss how Cash App improved the resilience of its compute platform built on Amazon Elastic Kubernetes Service (Amazon EKS) by implementing a dual-cluster topology to reduce single points of failure. We also discuss how Cash App used AWS Fault Injection Service (AWS FIS) to conduct an Availability Zone power interruption scenario in non-production environments, preparing the platform team for real-world failures and ongoing