This Guidance helps you build a comprehensive digital customer engagement solution by using the RudderStack customer data platform (CDP) and AWS services. RudderStack helps you collect and store customer data from various sources, such as mobile apps and websites, and makes that data available for analysis and engagement. When combined with AWS services, you can design a solution with customer 360 data, personalized recommendations, and marketing attribution metrics. It's easy to deploy, with a quick setup process that includes the tools you need to engage with your customers effectively and personally.
Architecture Diagram
Step 1
Developers configure data sources (such as websites and mobile applications), data destinations (such as Amazon Redshift), and the connections between them in the RudderStack control plane.
Step 2
Developers use the software development kit (SDK) provided by RudderStack's event stream to instrument tracking in each data source.
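As an illustration of Step 2, the following minimal sketch builds the kind of track event that a tracking call produces. The field names follow the general shape of RudderStack's event format as an assumption, and the user ID, event name, and properties are hypothetical. In a real application, the RudderStack SDK assembles and sends this payload for you.

```python
import json
import uuid
from datetime import datetime, timezone


def build_track_event(user_id, event, properties=None):
    """Return a dict shaped like a track event (field names are assumptions)."""
    return {
        "type": "track",
        "messageId": str(uuid.uuid4()),  # unique ID so duplicates can be detected
        "userId": user_id,
        "event": event,
        "properties": dict(properties or {}),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical example: a user views a product page.
example_event = build_track_event(
    "user-123", "Product Viewed", {"item_id": "sku-42", "price": 19.99}
)
print(json.dumps(example_event, indent=2))
```

Keeping the item identifier in the event properties makes the same event usable later for both behavior analysis and recommendations.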
Step 3
The SDK sends tracking events to an Application Load Balancer, which forwards them to the RudderStack data plane deployed on Amazon Elastic Kubernetes Service (Amazon EKS). The data plane writes the events to the Amazon Simple Storage Service (Amazon S3) staging bucket.
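To make Step 3 concrete, the sketch below prepares (but does not send) an HTTP request carrying one event to the data plane behind the load balancer. The hostname and write key are hypothetical, and the /v1/track path and the Basic authentication scheme (write key as the user name, empty password) are assumptions about the data plane's HTTP interface that you should verify against the RudderStack documentation.

```python
import base64
import json
import urllib.request


def build_track_request(data_plane_url, write_key, payload):
    """Prepare a POST request for one track event without sending it."""
    # Assumed convention: write key as user name, empty password.
    token = base64.b64encode(f"{write_key}:".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{data_plane_url.rstrip('/')}/v1/track",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_track_request(
    "https://events.example.com",  # hypothetical load balancer hostname
    "HYPOTHETICAL_WRITE_KEY",
    {"userId": "user-123", "event": "Product Viewed",
     "properties": {"item_id": "sku-42"}},
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted to avoid a network call.
```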
Step 4
The RudderStack data plane periodically sends copy commands, data merge SQL, and data definition language (DDL) to Amazon Redshift Serverless, importing the event data files from the Amazon S3 staging bucket into Amazon Redshift tables.
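The statements issued in Step 4 might look roughly like the ones generated below. This is a sketch, not the SQL that RudderStack actually emits: the table names, bucket path, role ARN, key column, and the gzip-compressed CSV file format are all assumptions chosen for illustration.

```python
def build_copy_statement(table, s3_prefix, iam_role_arn):
    """Return a Redshift COPY statement for gzip-compressed CSV files (assumed format)."""
    # Identifiers are interpolated directly, so pass only trusted values.
    return (
        f"COPY {table} "
        f"FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS CSV GZIP;"
    )


def build_merge_statements(target, staging, key):
    """Return a delete-then-insert merge from a staging table into the target."""
    return [
        f"DELETE FROM {target} USING {staging} "
        f"WHERE {target}.{key} = {staging}.{key};",
        f"INSERT INTO {target} SELECT * FROM {staging};",
    ]


# Hypothetical names for illustration only.
copy_sql = build_copy_statement(
    "events.tracks_staging",
    "s3://example-staging-bucket/rudder/tracks/",
    "arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
merge_sql = build_merge_statements("events.tracks", "events.tracks_staging", "id")
print(copy_sql)
print("\n".join(merge_sql))
```

Loading into a staging table first and then merging keeps reloads of the same files from producing duplicate rows in the target table.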
Step 5
Using Amazon Redshift Serverless, the event table is processed according to analysis requirements to create user behavior analysis detail tables, summary tables, and user profile tables. Use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) for task scheduling.
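The table builds in Step 5 depend on one another, which is why a scheduler is useful. The sketch below uses Python's standard graphlib module as a stand-in for an Airflow DAG to show one valid execution order. The task names are hypothetical, and in Amazon MWAA you would express the same dependencies with Airflow operators instead.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "build_behavior_detail": {"load_events"},
    "build_behavior_summary": {"build_behavior_detail"},
    "build_user_profile": {"build_behavior_detail"},
}

# static_order() yields tasks so that every prerequisite comes first.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

The summary and profile tables both depend only on the detail table, so a scheduler is free to run them in parallel once the detail table is ready.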
Step 6
Use Amazon QuickSight to create dashboards such as user behavior analysis, web attribution reports, and funnel analysis. The dashboards use the summary-level tables as their data source, read through Amazon Redshift Serverless.
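For readers unfamiliar with funnel analysis, the sketch below shows the underlying calculation on a handful of in-memory events. The event names and tuples are hypothetical, and in this Guidance the equivalent logic would typically run as SQL in Amazon Redshift Serverless, with QuickSight visualizing the result.

```python
def funnel_counts(events, steps):
    """Count users who reached each funnel step in order.

    events: iterable of (user_id, event_name, timestamp) tuples.
    steps: ordered list of event names that define the funnel.
    """
    progress = {}  # user_id -> number of funnel steps completed so far
    for user_id, event_name, _ts in sorted(events, key=lambda e: e[2]):
        reached = progress.get(user_id, 0)
        if reached < len(steps) and event_name == steps[reached]:
            progress[user_id] = reached + 1
    return [sum(1 for r in progress.values() if r > i) for i in range(len(steps))]


steps = ["Product Viewed", "Product Added", "Order Completed"]
events = [
    ("u1", "Product Viewed", 1), ("u1", "Product Added", 2), ("u1", "Order Completed", 3),
    ("u2", "Product Viewed", 1), ("u2", "Product Added", 4),
    ("u3", "Product Viewed", 2),
    ("u4", "Order Completed", 1),  # skipped earlier steps, so not counted
]
counts = funnel_counts(events, steps)
print(dict(zip(steps, counts)))
```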
Step 7
Interaction data between users and items is sent in real time as events from the RudderStack data plane to Amazon Personalize, which generates recommendation results based on the recommendation algorithms you choose.
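To show what an interaction event for Amazon Personalize can contain, the sketch below maps a tracking event to keyword arguments for the PutEvents operation. The tracking ID, the input event fields, and the choice to fall back to the user ID as the session ID are assumptions for illustration. The RudderStack destination performs this mapping for you in the deployed solution.

```python
from datetime import datetime


def build_put_events_kwargs(tracking_id, event):
    """Map one tracking event to keyword arguments for Personalize PutEvents."""
    props = event.get("properties", {})
    return {
        "trackingId": tracking_id,
        "userId": event["userId"],
        # Assumption: reuse the user ID when no session ID was captured.
        "sessionId": event.get("sessionId", event["userId"]),
        "eventList": [
            {
                "eventType": event["event"],
                "itemId": str(props["item_id"]),
                "sentAt": datetime.fromisoformat(event["timestamp"]),
            }
        ],
    }


kwargs = build_put_events_kwargs(
    "hypothetical-tracking-id",
    {
        "userId": "user-123",
        "event": "Product Viewed",
        "properties": {"item_id": "sku-42"},
        "timestamp": "2024-05-01T12:00:00+00:00",
    },
)
print(kwargs["eventList"][0]["eventType"], kwargs["eventList"][0]["itemId"])
# With AWS credentials configured, the call would be:
# boto3.client("personalize-events").put_events(**kwargs)
```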
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
Amazon EKS is a managed service that makes it easy to run Kubernetes on AWS without needing to install, operate, and maintain your own Kubernetes control plane or nodes.
We recommend using the AWS Cloud Development Kit (AWS CDK), a framework for defining cloud infrastructure as code (IaC) and provisioning it through AWS CloudFormation. AWS CDK helps you standardize your infrastructure and share it as code, making it easier to manage, more reliable, and quicker to deploy. It also allows for version control of the infrastructure, which aids in tracking changes, identifying issues, and rolling back when necessary.
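Because AWS CDK applications synthesize to CloudFormation templates, it can help to see what such a template contains. The sketch below builds a small template for a staging bucket as a plain Python dictionary. It is not CDK code and not the template this Guidance deploys; the logical ID and the chosen bucket settings are assumptions for illustration.

```python
import json


def staging_bucket_template(logical_id="StagingBucket"):
    """Return a minimal CloudFormation template for an encrypted, private S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative staging bucket for event files",
        "Resources": {
            logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "BucketEncryption": {
                        "ServerSideEncryptionConfiguration": [
                            {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
                        ]
                    },
                    "PublicAccessBlockConfiguration": {
                        "BlockPublicAcls": True,
                        "BlockPublicPolicy": True,
                        "IgnorePublicAcls": True,
                        "RestrictPublicBuckets": True,
                    },
                },
            }
        },
    }


template = staging_bucket_template()
print(json.dumps(template, indent=2))
```

Because the definition is ordinary text, it can be reviewed, versioned, and rolled back like any other source file.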
We also recommend using Amazon CloudWatch Container Insights to collect, aggregate, and summarize metrics and logs from your containerized applications. Visualizing and analyzing applications using Container Insights helps you to identify performance bottlenecks, isolate issues, and resolve them quickly.
Security
AWS Identity and Access Management (IAM) controls access to files on Amazon S3, to Amazon Redshift Serverless, and to data insights in QuickSight through granular, role-based permissions. It enables secure access control for AWS services through identity federation, least-privilege permissions, temporary credentials, integration with services, and detailed audit logging of access. It provides centralized identity and access management across AWS.
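As an example of least-privilege permissions, the sketch below builds an IAM policy document that limits a role to one prefix of the staging bucket. The bucket name and prefix are hypothetical, and the exact set of actions your data plane needs is an assumption to validate in your own environment.

```python
import json


def staging_bucket_policy(bucket, prefix):
    """Return an IAM policy document scoped to a single prefix of one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListStagingPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                # Listing is limited to keys under the staging prefix.
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                "Sid": "ReadWriteStagedObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


policy = staging_bucket_policy("example-staging-bucket", "rudder")
print(json.dumps(policy, indent=2))
```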
Reliability
Elastic Load Balancing (ELB) automatically distributes incoming traffic across multiple targets and Availability Zones. It performs health checks on targets and sends traffic only to healthy ones, which provides high availability and fault tolerance. If an instance fails, ELB reroutes traffic to the remaining healthy instances. Amazon EKS integrates with ELB for ingress traffic and supports automated application failover across Availability Zones.
The use of Amazon S3 as a serverless service across multiple Availability Zones provides redundancy in the event of a failure in any single Availability Zone. Event source data is staged on Amazon S3 before being copied to Amazon Redshift, with Amazon S3 providing high availability for the staged data.
Performance Efficiency
Amazon EKS automatically scales your applications in response to demand, ensuring efficiency regardless of the workload or user volume; it also allows users to set resource requests and limits for pods and containers.
Additionally, Amazon EKS enables horizontal scaling of applications by automatically managing the deployment and scaling of containers based on resource utilization. This allows applications to handle increased workloads efficiently, helping to ensure optimal performance during peak times.
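Horizontal scaling based on resource utilization can be illustrated with the core formula used by the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The sketch below is a simplification that omits the autoscaler's tolerance band and stabilization behavior, and the replica bounds and metric values are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Return the replica count suggested by the basic HPA scaling formula."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds, as an autoscaler would.
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical example: 4 pods at 90% average CPU with a 60% target.
scaled_up = desired_replicas(4, 90, 60)
# Hypothetical example: 4 pods at 30% average CPU, with a floor of 3 pods.
scaled_down = desired_replicas(4, 30, 60, min_replicas=3)
print(scaled_up, scaled_down)
```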
Amazon EKS also provides built-in load balancing capabilities by distributing incoming traffic across multiple containers or pods so the workload is evenly distributed. This helps prevent any single container from becoming a performance bottleneck.
Cost Optimization
Amazon Redshift Serverless is a fully managed, serverless data warehouse that makes it easy to analyze all of your data with standard SQL and your existing business intelligence (BI) tools.
With Amazon Redshift Serverless, you pay for compute only while the data warehouse is processing workloads, so there are no upfront costs and no charges for idle compute capacity. Storage is billed separately. This can result in significant cost savings compared to traditional, provisioned data warehouses.
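A rough compute cost estimate can be sketched as follows, assuming that compute is metered in Redshift Processing Unit (RPU) hours on a per-second basis with a 60-second minimum per active period. The price per RPU-hour used here is hypothetical, storage charges are excluded, and you should check current AWS pricing for your Region.

```python
def estimate_compute_cost(rpus, active_seconds, price_per_rpu_hour,
                          minimum_seconds=60):
    """Estimate compute cost for one active period (storage not included)."""
    if active_seconds <= 0:
        return 0.0  # idle time incurs no compute charge
    billed_seconds = max(active_seconds, minimum_seconds)
    return rpus * billed_seconds / 3600 * price_per_rpu_hour


# Hypothetical example: 8 RPUs busy for 15 minutes at 0.40 per RPU-hour.
busy_cost = estimate_compute_cost(8, 900, 0.40)
idle_cost = estimate_compute_cost(8, 0, 0.40)
print(round(busy_cost, 2), idle_cost)
```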
Sustainability
Amazon Redshift Serverless helps improve sustainability by automatically scaling data warehouse capacity up or down based on workload demands. This eliminates the need to provision and manage clusters, reducing resource waste.
The Amazon S3 Intelligent-Tiering storage class offers durable object storage with minimal overhead, optimizing storage usage and costs. Together, the two services enable pay-as-you-go analytics with high efficiency and low waste.
Serverless services, like Amazon Redshift Serverless, use resources only as needed, which helps reduce the carbon footprint of your workload. AWS data centers are also designed for energy efficiency, providing the efficient, resilient service you expect while minimizing our environmental footprint.
Implementation Resources
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.