
Guidance for Ultra-Low Latency, Machine Learning Feature Stores on AWS

Overview

This Guidance shows how you can build an ultra-low latency online feature store using Amazon ElastiCache for Redis, a fully managed Redis service from AWS, and Feast, an open-source feature store framework. The online store serves machine learning (ML) features for real-time inference with sub-millisecond data access latency. This Guidance covers a sample use case based on a real-time loan approval application that makes online predictions using a customer credit scoring model.
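As an illustration of the online retrieval path, the following minimal sketch uses the Feast Python SDK to fetch features for a loan applicant at prediction time. The feature view name, feature names, and entity key are hypothetical placeholders for the credit scoring use case, and the repository's feature_store.yaml is assumed to point the online store at the ElastiCache for Redis endpoint.

```python
# Minimal sketch of online feature retrieval with the Feast Python SDK.
# The feature view ("credit_scoring_features"), feature names, and entity key
# ("customer_id") are hypothetical placeholders; replace them with the
# definitions in your own feature repository.
from feast import FeatureStore

# Loads feature_store.yaml from the current repo path; the online store there
# is assumed to point at the ElastiCache for Redis endpoint.
store = FeatureStore(repo_path=".")

features = store.get_online_features(
    features=[
        "credit_scoring_features:credit_score",
        "credit_scoring_features:total_debt",
        "credit_scoring_features:missed_payments_1y",
    ],
    entity_rows=[{"customer_id": "C1001"}],
).to_dict()

print(features)  # Feature values served from Redis for the prediction request
```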

How it works

These technical details feature an architecture diagram to illustrate how to effectively use this solution. The architecture diagram shows the key components and their interactions, providing a step-by-step overview of the architecture's structure and functionality.

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

Amazon CloudWatch enhances operational excellence for ElastiCache by providing comprehensive monitoring, logging, and automation capabilities. It tracks ElastiCache metrics like CPU utilization, memory usage, network traffic, command statistics, and cache hit/miss ratios, enabling proactive performance management. CloudWatch logs integration allows centralized log analysis, simplifying troubleshooting. CloudWatch alarms invoke automated actions, such as scaling ElastiCache clusters for optimal performance during traffic spikes while reducing costs during lulls.
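As a sketch of the proactive monitoring described above, the following boto3 snippet creates a CloudWatch alarm on the EngineCPUUtilization metric of an ElastiCache node. The cluster ID, SNS topic ARN, and threshold are placeholder assumptions to adapt to your environment.

```python
# Minimal sketch: a CloudWatch alarm on an ElastiCache metric using boto3.
# Cluster ID and SNS topic ARN are placeholders; tune thresholds to your workload.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="elasticache-high-engine-cpu",
    Namespace="AWS/ElastiCache",
    MetricName="EngineCPUUtilization",
    Dimensions=[{"Name": "CacheClusterId", "Value": "feast-online-store-0001-001"}],
    Statistic="Average",
    Period=60,                      # evaluate the metric every 60 seconds
    EvaluationPeriods=3,            # alarm after 3 consecutive breaching periods
    Threshold=75.0,                 # percent engine CPU utilization
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:elasticache-alerts"],
)
```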

Read the Operational Excellence whitepaper 

IAM policies are scoped to grant ElastiCache only the permissions required for operation, limiting unauthorized access to resources. ElastiCache offers encryption in transit and at rest, while AWS KMS lets you create, manage, and control access to the customer managed encryption keys used for data protection, removing the overhead of operating your own key management infrastructure.
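The sketch below shows one way to provision a Redis replication group with in-transit and at-rest encryption backed by a customer managed KMS key, using boto3. The replication group ID, node type, subnet group, security group, and KMS key ARN are placeholders for your environment.

```python
# Minimal sketch: an ElastiCache for Redis replication group with encryption
# in transit and at rest, protected by a customer managed KMS key.
# All identifiers below are placeholders for your environment.
import boto3

elasticache = boto3.client("elasticache")

elasticache.create_replication_group(
    ReplicationGroupId="feast-online-store",
    ReplicationGroupDescription="Feast online feature store",
    Engine="redis",
    CacheNodeType="cache.r6g.large",
    NumNodeGroups=2,                 # shards (cluster mode enabled)
    ReplicasPerNodeGroup=1,          # one read replica per shard
    AutomaticFailoverEnabled=True,
    MultiAZEnabled=True,
    TransitEncryptionEnabled=True,   # TLS for data in transit
    AtRestEncryptionEnabled=True,    # encryption at rest
    KmsKeyId="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
    CacheSubnetGroupName="feast-private-subnets",
    SecurityGroupIds=["sg-0123456789abcdef0"],
)
```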

Read the Security whitepaper 

ElastiCache auto scaling ensures reliable performance by dynamically adjusting Redis cluster capacity (shards and replicas) based on utilization metrics. CloudWatch continuously monitors key metrics and raises alarms to proactively detect and mitigate issues. Auto scaling handles traffic spikes by launching additional nodes, preventing overload and maintaining consistent performance. Nodes can be distributed across Availability Zones, enhancing redundancy against outages.
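The following sketch registers the replication group's shard count with Application Auto Scaling and attaches a target tracking policy, which is one way to implement the dynamic capacity adjustment described above. The replication group name, capacity bounds, and target value are assumptions to tune for your workload.

```python
# Minimal sketch: target tracking auto scaling for ElastiCache shard count
# via Application Auto Scaling. The replication group name is a placeholder.
import boto3

autoscaling = boto3.client("application-autoscaling")

autoscaling.register_scalable_target(
    ServiceNamespace="elasticache",
    ResourceId="replication-group/feast-online-store",
    ScalableDimension="elasticache:replication-group:NodeGroups",
    MinCapacity=2,
    MaxCapacity=10,
)

autoscaling.put_scaling_policy(
    PolicyName="feast-online-store-shard-scaling",
    ServiceNamespace="elasticache",
    ResourceId="replication-group/feast-online-store",
    ScalableDimension="elasticache:replication-group:NodeGroups",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ElastiCachePrimaryEngineCPUUtilization"
        },
        "TargetValue": 60.0,  # add or remove shards to hold ~60% engine CPU
    },
)
```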

Read the Reliability whitepaper 

ElastiCache auto scaling dynamically provisions and right-sizes Redis clusters based on demand for optimal resource utilization. During traffic spikes, auto scaling launches additional nodes to handle increased loads, preventing overloads and maintaining low latency.

ElastiCache features such as in-memory architecture, data structures, transactions, scripting, and clustering are optimized for high throughput and low latency operations, making it ideal for performance-critical workloads. Horizontal scaling and read replicas further boost throughput and response times.

Redis Cluster Mode shards data across multiple nodes, distributing memory and workload for improved parallelization and linear throughput scaling. Sharding maximizes memory utilization by overcoming single-node limits, while locally executing commands on shards minimizes network hops.
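To illustrate how a client works with a cluster mode enabled deployment, the sketch below connects with the redis-py cluster client, which discovers the shards from the configuration endpoint and routes each key to the node that owns its hash slot. The endpoint hostname and key names are placeholders; TLS is enabled on the assumption that in-transit encryption is turned on.

```python
# Minimal sketch: connecting to a cluster mode enabled ElastiCache replication
# group with the redis-py cluster client. The configuration endpoint hostname
# and key names are placeholders.
from redis.cluster import RedisCluster

# The cluster client discovers all shards from the configuration endpoint and
# routes each command to the node owning the key's hash slot.
client = RedisCluster(
    host="clustercfg.feast-online-store.abc123.use1.cache.amazonaws.com",
    port=6379,
    ssl=True,  # assumes TransitEncryptionEnabled on the replication group
)

client.set("customer:C1001:credit_score", 712)
print(client.get("customer:C1001:credit_score"))
```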

Read the Performance Efficiency whitepaper 

Auto scaling optimizes ElastiCache costs by automatically adjusting cluster capacity based on utilization metrics. During low traffic periods, it scales in by terminating unnecessary nodes, preventing overprovisioning and reducing operational expenses. Conversely, it launches additional nodes during traffic spikes, helping to ensure sufficient capacity without incurring excess costs. This elasticity eliminates the need for manual capacity management and helps ensure clusters are right-sized to workload demands, running only the required resources.
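Target tracking already scales in when utilization drops; for predictable low-traffic windows, scheduled scaling actions are one optional complement. The sketch below assumes the replica count has also been registered as a scalable target (similar to the shard example above) and uses placeholder schedules, capacities, and replication group name.

```python
# Minimal sketch: scheduled scaling actions that lower the replica floor
# overnight and raise it before peak hours. Schedules, capacities, and the
# replication group name are placeholders.
import boto3

autoscaling = boto3.client("application-autoscaling")

# Scale in replicas during a predictable low-traffic window.
autoscaling.put_scheduled_action(
    ServiceNamespace="elasticache",
    ScheduledActionName="scale-in-overnight",
    ResourceId="replication-group/feast-online-store",
    ScalableDimension="elasticache:replication-group:Replicas",
    Schedule="cron(0 2 * * ? *)",     # 02:00 UTC daily
    ScalableTargetAction={"MinCapacity": 1, "MaxCapacity": 2},
)

# Restore capacity ahead of the daily traffic ramp.
autoscaling.put_scheduled_action(
    ServiceNamespace="elasticache",
    ScheduledActionName="scale-out-morning",
    ResourceId="replication-group/feast-online-store",
    ScalableDimension="elasticache:replication-group:Replicas",
    Schedule="cron(0 12 * * ? *)",    # 12:00 UTC daily
    ScalableTargetAction={"MinCapacity": 2, "MaxCapacity": 5},
)
```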

Read the Cost Optimization whitepaper 

ElastiCache allows right-sizing caches to match application requirements, improving infrastructure efficiency and preventing resource waste.

The availability of multiple AWS Regions enables deploying ElastiCache clusters closer to end users, reducing network latency and data transfer, which in turn lowers the energy consumption and emissions associated with network usage.

Read the Sustainability whitepaper 

Deploy with confidence

Ready to deploy? Review the sample code on GitHub for detailed deployment instructions, and deploy it as-is or customize it to fit your needs.

Go to sample code

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.