AWS Database Blog

Category: Customer Solutions

How Alight Solutions achieved 60% cost savings with Amazon ElastiCache for Valkey

Alight Solutions is a leading cloud-based human capital technology and services provider that has focused its operations on integrated benefits administration, healthcare navigation, and employee experience solutions. In this post, we share how Alight Solutions transformed their caching infrastructure using ElastiCache while maintaining strict performance requirements, achieving over 60% cost reduction, 70-80% reduction in operational overhead, migration of gigabytes of data with sub-0.5 millisecond performance for millions of users, and a 99.99% reduction in incident rate.

How Omnissa saved millions by migrating to Amazon RDS and Amazon EC2

Omnissa is a digital workspace technology leader that delivers smart, seamless, and secure digital work experiences for organizations worldwide. It serves 26,000 customers, including the top seven of the Fortune 500 companies. In this post, we walk through Omnissa’s journey of migrating its mission-critical UEM platform and self-managed SQL Server workloads from VMware Cloud on AWS (VMC-A) to Amazon RDS for SQL Server, and its application servers to Amazon EC2.

How Zepto scales to millions of orders per day using Amazon DynamoDB

In this post, we describe how Zepto transformed its data infrastructure from a centralized relational database to a distributed system for select use cases. We discuss the challenges Zepto’s original architecture faced in supporting the business at scale, the shift towards key-value storage for cases where eventual consistency was acceptable, and Zepto’s adoption of Amazon DynamoDB.

GroundTruth reduces costs by 45% and improves reliability migrating from Aerospike to Amazon ElastiCache for Valkey

GroundTruth, an advertising platform leading the way in location- and behavior-based marketing, empowers brands to connect with consumers through real-world behavioral data to drive real business results. As our advertising platform scaled to process an increased volume of ad requests and third-party segment ingestion, maintaining our Aerospike-based caching infrastructure introduced significant operational complexity and rising costs, while also compromising performance and limiting our ability to scale efficiently. To meet our requirements, we implemented Amazon ElastiCache for Valkey, which streamlined our operations, improved reliability, and reduced costs. In this post, we walk through our migration journey, covering the migration strategy we adopted, the optimizations we made to reduce cost by 45%, reliability improvements including reducing write failures by 20x, and operational gains from managed service capabilities.

Building a 10-billion wallet crypto-intelligence platform: Elliptic’s journey with Amazon DynamoDB

In this post, we explore how Elliptic uses Amazon DynamoDB to build a crypto-intelligence platform that scales to over 10 billion wallets globally and supports real-time risk detection across the fast-evolving digital asset ecosystem. We discuss the data model design, indexing strategies, and operational setup that Elliptic uses to power real-time risk analysis and complex investigations at scale.

How Smartsheet enhances recommendations using Amazon Neptune and Knowledge Graphs

Smartsheet is a leading SaaS-based collaborative work management platform trusted by enterprises worldwide to manage projects, automate workflows, and drive collaboration at scale. In this post, we describe the Smartsheet Knowledge Graph, built in partnership between Smartsheet and AWS. The Smartsheet Knowledge Graph is a unified data model connecting people, content, and work in Smartsheet, representing how users interact with assets, content, and their collaborators.

How Global Payments Inc. improved their tail latency using request hedging with Amazon DynamoDB

Amazon DynamoDB delivers consistent single-digit millisecond performance at any scale, making it ideal for mission-critical workloads. However, as with any distributed system, a small percentage of requests may experience significantly longer response times than the average. This phenomenon, known as tail latency, refers to these slower outliers that can be seen by looking at metrics such as the 99th or 99.9th percentile of response times. In this post, we explore how Global Payments Inc. (GPN) reduced their tail latency by 30% using request hedging. We review the technical details and challenges they faced, providing insights into how you can optimize your own latency-sensitive applications. In a follow-up post, we’ll share detailed implementation examples.
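
As a rough illustration of request hedging in general (not GPN's implementation, which this summary does not describe), the following Python sketch sends one request, waits a short delay, and only then sends a duplicate, returning whichever response arrives first. The names `hedged_call`, `make_request`, `hedge_delay`, and `fake_read` are hypothetical, and the simulated read stands in for a real DynamoDB call; no AWS API is used here.

```python
import asyncio


async def hedged_call(make_request, hedge_delay):
    """Run make_request(); if it is still pending after hedge_delay seconds,
    start a second identical request and return the first result to arrive.

    Assumption: make_request is a hypothetical zero-argument coroutine
    function that is safe to issue twice (for example, an idempotent read).
    """
    first = asyncio.create_task(make_request())
    done, _ = await asyncio.wait({first}, timeout=hedge_delay)
    if done:
        # Fast path: the original request finished before the hedge delay.
        return first.result()

    # Slow path: issue the hedge request and race it against the original.
    second = asyncio.create_task(make_request())
    done, pending = await asyncio.wait(
        {first, second}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # Drop the slower request.
    # Note: if the first finisher failed, its exception is raised here.
    return done.pop().result()


async def demo():
    # Simulated latencies: the first attempt is a slow outlier (500 ms),
    # the hedge attempt is fast (10 ms).
    latencies = iter([0.5, 0.01])

    async def fake_read():
        latency = next(latencies)
        await asyncio.sleep(latency)
        return latency

    return await hedged_call(fake_read, hedge_delay=0.05)


result = asyncio.run(demo())
print(result)
```

In this sketch the hedge delay would typically be set near a high latency percentile so that only the slow outliers trigger a second request; that choice is an assumption for illustration, and the right value depends on the workload.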

Beyond Correlation: Finding Root-Causes using a network digital twin graph and agentic AI

When your network fails, finding the root cause usually takes hours of investigation, going through correlated alarms that often lead to symptoms rather than the actual problem. Root-cause analysis (RCA) systems are often built on hardcoded rules, static thresholds, and pre-defined patterns that work great until they don’t. Whether you’re troubleshooting network-level outages or service-level degradations, those rigid rule sets can’t adapt to cascading failures and complex interdependencies. In this post, we show you our AWS solution architecture that features a network digital twin using graphs and Agentic AI. We also share four runbook design patterns for Agentic AI-powered graph-based RCA on AWS. Finally, we show how DOCOMO validated our first runbook design pattern in their commercial networks, achieving a drastic improvement in mean time to detect (MTTD), with failure isolation in 15 seconds across transport and Radio Access Networks.

Scaling transaction peaks: Juspay’s approach using Amazon ElastiCache

Juspay powers global enterprises by streamlining payment process orchestration, enhancing security, reducing fraud, and providing seamless customer experiences. In this post, we walk you through how Juspay transformed their payment processing architecture to handle transaction peaks. Using Amazon ElastiCache and Amazon RDS for MySQL, Juspay built a system that processes 7.6 million transactions per hour during peak events, achieves sub-millisecond latency, and reduces infrastructure costs by 80% compared to their previous solution.

How Wiz achieved near-zero downtime for Amazon Aurora PostgreSQL major version upgrades at scale using Aurora Blue/Green Deployments

Wiz, a leading cloud security company, identifies and removes risks across major cloud platforms. Our agent-less scanner processes tens of billions of daily cloud resource metadata entries. This demands high-performance, low-latency processing, making our Amazon Aurora PostgreSQL-Compatible Edition database, serving hundreds of microservices at scale, a critical component of our architecture. In this post, we share how we upgraded our Aurora PostgreSQL database from version 14 to 16 with near-zero downtime using Amazon Aurora Blue/Green Deployments.