This Guidance shows how you can build an ultra-low latency online feature store using Amazon ElastiCache for Redis, a fully managed Redis service from AWS, and Feast, an open-source store framework. The online store uses machine learning (ML) for real-time data access and sub-millisecond latency. This Guidance covers a sample use case based on a real-time loan approval application that makes online predictions based on a customer’s credit scoring model.

Please note: [Disclaimer]

Architecture Diagram

[Architecture diagram description]

Download the architecture diagram PDF 

Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

  • Amazon CloudWatch enhances operational excellence for ElastiCache by providing comprehensive monitoring, logging, and automation capabilities. It tracks ElastiCache metrics like CPU utilization, memory usage, network traffic, command statistics, and cache hit/miss ratios, enabling proactive performance management. CloudWatch logs integration allows centralized log analysis, simplifying troubleshooting. CloudWatch alarms invoke automated actions, such as scaling ElastiCache clusters for optimal performance during traffic spikes while reducing costs during lulls.

    Read the Operational Excellence whitepaper 
  • By scoping IAM policies to the minimum required permissions, unauthorized access to resources is limited. KMS provides control over encryption keys used to protect data, eliminating key management overhead. IAM policies are scoped to grant ElastiCache only the necessary permissions for operation. ElastiCache offers encryption in-transit and at-rest, while KMS allows you to create, manage, and control access to customer-managed encryption keys used for data protection.

    Read the Security whitepaper 
  • ElastiCache auto scaling groups ensure reliable performance by dynamically adjusting Redis cluster capacity (shards and replicas) based on utilization metrics. CloudWatch continuously monitors key metrics and initiates alarms to proactively detect and mitigate issues. Auto scaling handles traffic spikes by launching additional nodes, preventing overload and maintaining consistent performance. Nodes can be distributed across Availability Zones, enhancing redundancy against outages.

    Read the Reliability whitepaper 
  • ElastiCache auto scaling dynamically provisions and right-sizes Redis clusters based on demand for optimal resource utilization. During traffic spikes, auto scaling launches additional nodes to handle increased loads, preventing overloads and maintaining low latency.

    ElastiCache features such as in-memory architecture, data structures, transactions, scripting, and clustering are optimized for high throughput and low latency operations, making it ideal for performance-critical workloads. Horizontal scaling and read replicas further boost throughput and response times.

    Redis Cluster Mode shards data across multiple nodes, distributing memory and workload for improved parallelization and linear throughput scaling. Sharding maximizes memory utilization by overcoming single-node limits, while locally executing commands on shards minimizes network hops.

    Read the Performance Efficiency whitepaper 
  • Auto scaling optimizes ElastiCache costs by automatically adjusting cluster capacity based on utilization metrics. During low traffic periods, it scales in by terminating unnecessary nodes, preventing overprovisioning and reducing operational expenses. Conversely, it launches additional nodes during traffic spikes, helping to ensure sufficient capacity without incurring excess costs. This elasticity eliminates the need for manual capacity management and helps ensure clusters are right-sized to workload demands, running only the required resources.

    Read the Cost Optimization whitepaper 
  • ElastiCache allows right-sizing caches to match application requirements, improving infrastructure efficiency and preventing resource waste.

    The availability of multiple AWS Regions enables deploying ElastiCache clusters closer to end users, reducing network latency and data transfer and leading to lower energy consumption and emissions from reduced network usage.

    Read the Sustainability whitepaper 

Implementation Resources

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.


Build an ultra-low latency online feature store for real-time inferencing using Amazon ElastiCache for Redis

This blog post demonstrates how customers are building online feature stores on AWS with Amazon ElastiCache for Redis to support mission-critical ML use cases that require ultra-low latency.


The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.

Was this page helpful?