This Guidance shows how you can build an ultra-low latency online feature store using Amazon ElastiCache for Redis, a fully managed Redis service from AWS, and Feast, an open-source feature store framework. The online store serves machine learning (ML) features for real-time inference with sub-millisecond read latency. This Guidance covers a sample use case based on a real-time loan approval application that makes online predictions using a customer credit scoring model.
Architecture Diagram
[Architecture diagram description]
Step 1
Set up data infrastructure to deploy Amazon Redshift, an Amazon Simple Storage Service (Amazon S3) bucket containing zip code and credit history parquet files, and AWS Identity and Access Management (IAM) roles. Additionally, set up policies for Amazon Redshift to access Amazon S3, and create an Amazon Redshift table that can query the parquet files.
Step 2
Deploy Feast infrastructure.
Step 3
Create a feature store repository, and configure Amazon ElastiCache as the online feature store and Amazon Redshift as the offline feature store. Create feature definitions.
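A repository configuration along these lines wires the two stores together. This is a minimal sketch: the project name, endpoints, cluster identifier, S3 path, and IAM role ARN are all placeholders, and exact keys can vary by Feast version.

```yaml
# feature_store.yaml -- all identifiers below are placeholders
project: credit_scoring
registry: data/registry.db
provider: aws
online_store:
  type: redis
  # ElastiCache for Redis endpoint; ssl=true enables in-transit encryption
  connection_string: "my-redis.xxxxxx.use1.cache.amazonaws.com:6379,ssl=true"
offline_store:
  type: redshift
  cluster_id: my-redshift-cluster
  region: us-east-1
  database: dev
  user: awsuser
  s3_staging_location: s3://my-bucket/staging
  iam_role: arn:aws:iam::123456789012:role/redshift-s3-access
```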
Step 4
Register the feature definitions and the underlying infrastructure into a Feast registry using the Feast SDK.
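A sketch of what the registered feature definitions might look like. The entity, view, and table names are illustrative, and the API shown follows recent Feast releases (Field/dtype-style schemas); in a real repo these definitions live at module level in a definitions file.

```python
# Hypothetical Feast feature definitions for the credit-scoring example.
def define_features():
    from datetime import timedelta

    from feast import Entity, FeatureView, Field
    from feast.infra.offline_stores.redshift_source import RedshiftSource
    from feast.types import Float32, Int64

    # Entity key used to look up feature vectors
    zipcode = Entity(name="zipcode", join_keys=["zipcode"])

    # Offline source: the Redshift table created over the parquet files
    source = RedshiftSource(
        name="zipcode_features_source",
        table="zipcode_features",
        timestamp_field="event_timestamp",
    )

    zipcode_features = FeatureView(
        name="zipcode_features",
        entities=[zipcode],
        ttl=timedelta(days=365),
        schema=[
            Field(name="population", dtype=Int64),
            Field(name="city_rank", dtype=Float32),
        ],
        source=source,
    )
    return [zipcode, zipcode_features]

# Registration into the Feast registry is a single apply() call:
#   from feast import FeatureStore
#   FeatureStore(repo_path=".").apply(define_features())
```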
Step 5
Generate training data by joining historical labels with features retrieved from Feast. The point-in-time correct features from Feast enrich the historical data, producing a feature DataFrame for training.
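The join can be sketched as follows. The entity DataFrame carries the entity keys, event timestamps, and label; the feature references (`zipcode_features:...`) are hypothetical names matching the sample use case, and the Feast call assumes an initialized repository.

```python
# Sketch of generating a training dataset via a point-in-time join.
from datetime import datetime

import pandas as pd

# Labels plus entity keys and event timestamps (illustrative values)
entity_df = pd.DataFrame(
    {
        "zipcode": [76104, 70380, 97039],
        "event_timestamp": [
            datetime(2021, 4, 12, 10, 59, 42),
            datetime(2021, 4, 12, 8, 12, 10),
            datetime(2021, 4, 12, 16, 40, 26),
        ],
        "loan_approved": [1, 0, 1],  # label column
    }
)

def build_training_frame(entity_df: pd.DataFrame) -> pd.DataFrame:
    """Enrich labels with historical features from the offline store."""
    from feast import FeatureStore  # requires an initialized Feast repo

    store = FeatureStore(repo_path=".")
    return store.get_historical_features(
        entity_df=entity_df,
        features=[
            "zipcode_features:population",
            "zipcode_features:city_rank",
        ],
    ).to_df()
```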
Step 6
Train the ML model using the training dataset and a model trainer.
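A minimal stand-in for the trainer step, fit on a synthetic feature frame. A real pipeline would train on the Feast-enriched DataFrame with proper validation; the column names and the threshold-based label here are purely illustrative.

```python
# Toy credit-scoring trainer: logistic regression on synthetic features.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 500
train = pd.DataFrame(
    {
        "population": rng.integers(1_000, 100_000, n),
        "city_rank": rng.random(n),
        "credit_score": rng.integers(300, 850, n),
    }
)
# Synthetic label loosely tied to credit score
train["loan_approved"] = (train["credit_score"] > 600).astype(int)

features = ["population", "city_rank", "credit_score"]
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(train[features], train["loan_approved"])
accuracy = model.score(train[features], train["loan_approved"])
```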
Step 7
Ingest batch features into the ElastiCache online feature store. These online features are used to make online predictions with the trained model.
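In Feast, this ingestion step is called materialization: `materialize()` copies feature values for a time window from the offline store (Amazon Redshift) into the online store (ElastiCache). The backfill window below is illustrative, and the call assumes an initialized repository.

```python
# Sketch of loading (materializing) batch features into the online store.
from datetime import datetime, timedelta, timezone

end = datetime.now(timezone.utc)
start = end - timedelta(days=7)  # backfill window is illustrative

def materialize_window(repo_path: str = ".") -> None:
    from feast import FeatureStore  # requires an initialized Feast repo

    FeatureStore(repo_path=repo_path).materialize(
        start_date=start, end_date=end
    )
```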
Step 8
Read the feature vector from ElastiCache to make online predictions.
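The low-latency read at prediction time can be sketched as below. The feature references and entity key are hypothetical names carried over from the sample use case; `get_online_features()` fetches the feature vector from ElastiCache and assumes an initialized repository.

```python
# Sketch of an online feature-vector read at prediction time.
def read_feature_vector(zipcode: int, repo_path: str = ".") -> dict:
    from feast import FeatureStore  # requires an initialized Feast repo

    store = FeatureStore(repo_path=repo_path)
    return store.get_online_features(
        features=[
            "zipcode_features:population",
            "zipcode_features:city_rank",
        ],
        entity_rows=[{"zipcode": zipcode}],
    ).to_dict()

# The returned dict maps feature names to value lists, which can be
# assembled into a model input row for the trained classifier.
```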
Step 9
Use AWS Key Management Service (AWS KMS) to encrypt ElastiCache data at rest.
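Encryption can be enabled when the replication group is created, as sketched below with boto3. The group name and node type are placeholders; `AtRestEncryptionEnabled` with a customer-managed `KmsKeyId` covers encryption at rest, and `TransitEncryptionEnabled` covers encryption in transit.

```python
# Sketch: create an ElastiCache for Redis replication group with
# at-rest (AWS KMS) and in-transit encryption enabled.
def create_encrypted_redis(kms_key_id: str) -> dict:
    import boto3

    client = boto3.client("elasticache")
    return client.create_replication_group(
        ReplicationGroupId="feast-online-store",
        ReplicationGroupDescription="Feast online feature store",
        Engine="redis",
        CacheNodeType="cache.r6g.large",
        NumCacheClusters=2,
        AtRestEncryptionEnabled=True,
        KmsKeyId=kms_key_id,  # customer-managed AWS KMS key ARN
        TransitEncryptionEnabled=True,
    )
```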
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
Amazon CloudWatch enhances operational excellence for ElastiCache by providing comprehensive monitoring, logging, and automation capabilities. It tracks ElastiCache metrics like CPU utilization, memory usage, network traffic, command statistics, and cache hit/miss ratios, enabling proactive performance management. CloudWatch Logs integration allows centralized log analysis, simplifying troubleshooting. CloudWatch alarms invoke automated actions, such as scaling ElastiCache clusters for optimal performance during traffic spikes while reducing costs during lulls.
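One such alarm might watch engine CPU, the kind of signal that can drive automated scaling actions. The alarm name and threshold below are illustrative.

```python
# Sketch: CloudWatch alarm on ElastiCache engine CPU utilization.
def put_cpu_alarm(cluster_id: str) -> None:
    import boto3

    boto3.client("cloudwatch").put_metric_alarm(
        AlarmName=f"{cluster_id}-engine-cpu-high",
        Namespace="AWS/ElastiCache",
        MetricName="EngineCPUUtilization",
        Dimensions=[{"Name": "CacheClusterId", "Value": cluster_id}],
        Statistic="Average",
        Period=60,               # one-minute samples
        EvaluationPeriods=5,     # sustained for five minutes
        Threshold=80.0,
        ComparisonOperator="GreaterThanThreshold",
    )
```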
Security
Scoping IAM policies to the minimum required permissions limits unauthorized access to resources: ElastiCache is granted only the permissions it needs to operate. ElastiCache offers encryption in transit and at rest, while AWS KMS lets you create, manage, and control access to the customer-managed encryption keys used to protect data, eliminating key management overhead.
Reliability
ElastiCache auto scaling ensures reliable performance by dynamically adjusting Redis cluster capacity (shards and replicas) based on utilization metrics. CloudWatch continuously monitors key metrics and raises alarms so issues can be detected and mitigated proactively. Auto scaling handles traffic spikes by launching additional nodes, preventing overload and maintaining consistent performance. Nodes can be distributed across Availability Zones, enhancing redundancy against outages.
Performance Efficiency
ElastiCache auto scaling dynamically provisions and right-sizes Redis clusters based on demand for optimal resource utilization. During traffic spikes, auto scaling launches additional nodes to handle increased loads, preventing overloads and maintaining low latency.
ElastiCache features such as in-memory architecture, data structures, transactions, scripting, and clustering are optimized for high throughput and low latency operations, making it ideal for performance-critical workloads. Horizontal scaling and read replicas further boost throughput and response times.
Redis Cluster Mode shards data across multiple nodes, distributing memory and workload for improved parallelization and linear throughput scaling. Sharding maximizes memory utilization by overcoming single-node limits, while locally executing commands on shards minimizes network hops.
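The sharding scheme itself is simple enough to illustrate: each key hashes to one of 16,384 slots via CRC16 (the XModem variant), and the slots are divided among the shards. A `{hash tag}` in the key name forces related keys onto the same slot, and therefore the same shard.

```python
# Illustration of Redis Cluster key-to-slot mapping.
def crc16_xmodem(data: bytes) -> int:
    """CRC16 (XModem: poly 0x1021, init 0), as used by Redis Cluster."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of 16384 hash slots, honoring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # non-empty tag only
            key = key[start + 1 : end]
    return crc16_xmodem(key.encode()) % 16384

# Keys sharing a hash tag land on the same slot (and the same shard):
same_shard = key_slot("{user1000}.following") == key_slot("{user1000}.followers")
```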
Cost Optimization
Auto scaling optimizes ElastiCache costs by automatically adjusting cluster capacity based on utilization metrics. During low traffic periods, it scales in by terminating unnecessary nodes, preventing overprovisioning and reducing operational expenses. Conversely, it launches additional nodes during traffic spikes, helping to ensure sufficient capacity without incurring excess costs. This elasticity eliminates the need for manual capacity management and helps ensure clusters are right-sized to workload demands, running only the required resources.
Sustainability
ElastiCache allows right-sizing caches to match application requirements, improving infrastructure efficiency and preventing resource waste.
The availability of multiple AWS Regions enables deploying ElastiCache clusters closer to end users, reducing network latency and data transfer and leading to lower energy consumption and emissions from reduced network usage.
Implementation Resources
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Related Content
Build an ultra-low latency online feature store for real-time inferencing using Amazon ElastiCache for Redis
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.