Guidance for Near Real-Time Fraud Detection Using Amazon Redshift Streaming Ingestion
Overview
How it works
Use machine learning models trained on historical data to combat financial fraud in near real-time.
Well-Architected Pillars
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
You can monitor your organization's operational health and notify operators of faults using Amazon CloudWatch. With this service, you can customize metrics, alarms, and dashboards. For more on how to gain insights into your operations, see the Amazon Redshift Streaming Ingestion developer guide.
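As one illustration of a custom alarm, the following minimal sketch builds the arguments for the CloudWatch PutMetricAlarm API so that operators are notified when producers are throttled on the Kinesis data stream. The stream name, topic ARN, alarm name, and threshold are hypothetical placeholders and are not taken from this Guidance.

```python
def throttling_alarm_params(stream_name, topic_arn, threshold=0):
    """Build keyword arguments for the CloudWatch PutMetricAlarm API.

    Assumption: stream_name and topic_arn are placeholders for your own
    Kinesis data stream and Amazon SNS topic.
    """
    return {
        "AlarmName": f"{stream_name}-write-throttled",
        "Namespace": "AWS/Kinesis",
        "MetricName": "WriteProvisionedThroughputExceeded",
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        "Statistic": "Sum",
        "Period": 60,                # seconds per evaluation window
        "EvaluationPeriods": 5,      # consecutive windows that must breach
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],  # notify operators through Amazon SNS
    }


params = throttling_alarm_params(
    "fraud-transactions",  # hypothetical stream name
    "arn:aws:sns:us-east-1:123456789012:fraud-ops-alerts",  # hypothetical ARN
)
```

With boto3 installed and credentials configured, these arguments could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. Building the dictionary separately keeps the alarm definition easy to review and test without calling AWS.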
Security
Several AWS services that are compatible with this Guidance provide secure authentication and authorization. These include:
- AWS Identity and Access Management (IAM)
- AWS IAM Identity Center (Successor to AWS Single Sign-On)
- AWS Certificate Manager (ACM)
- AWS Key Management Service (AWS KMS)
- Amazon Redshift role-based access control (RBAC)
These services are designed to provide secure access control and encryption of data for both people and machine access.
In addition to identity services such as IAM, this Guidance emphasizes network security best practices, such as implementing network segmentation and controlling access with security groups and network access control lists (ACLs).
To protect data, this Guidance uses AWS services such as Amazon Simple Storage Service (Amazon S3), AWS KMS, and AWS CloudTrail. Data is encrypted both in transit and at rest, and access to data is controlled with IAM. CloudTrail records API activity, which provides visibility into unauthorized access attempts.
Furthermore, Amazon Redshift offers column-level and row-level access controls, as well as dynamic data masking to protect the data.
Reliability
This Guidance implements a reliable application-level architecture with Amazon Redshift Streaming Ingestion, which stores ingested data in Redshift Managed Storage (RMS) for reliable data availability. A Kinesis data stream adds to this by retaining records for 24 hours by default, and the retention period can be extended to as long as 365 days.
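To make the streaming ingestion step concrete, the following minimal sketch generates the two SQL statements typically involved: an external schema that maps to Kinesis Data Streams, and an auto-refreshing materialized view that lands stream records in Amazon Redshift. The schema, stream, view, and IAM role names are hypothetical, and the exact statements used by this Guidance may differ.

```python
def streaming_ingestion_sql(schema, stream_name, view_name, iam_role_arn):
    """Return the SQL statements that set up Redshift streaming ingestion.

    Assumption: every argument is an illustrative placeholder. Values are
    interpolated directly, so pass only trusted identifiers.
    """
    create_schema = (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM KINESIS\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )
    create_view = (
        f"CREATE MATERIALIZED VIEW {view_name} AUTO REFRESH YES AS\n"
        f"SELECT approximate_arrival_timestamp,\n"
        f"       partition_key,\n"
        f"       shard_id,\n"
        f"       sequence_number,\n"
        f"       JSON_PARSE(kinesis_data) AS payload\n"
        f'FROM {schema}."{stream_name}";'
    )
    return [create_schema, create_view]


statements = streaming_ingestion_sql(
    schema="kds",                                   # hypothetical
    stream_name="fraud-transactions",               # hypothetical
    view_name="fraud_transactions_mv",              # hypothetical
    iam_role_arn="arn:aws:iam::123456789012:role/redshift-streaming",  # hypothetical
)
for statement in statements:
    print(statement, end="\n\n")
```

The statements can be run in the Amazon Redshift query editor or through any SQL client. With `AUTO REFRESH YES`, Amazon Redshift refreshes the view as new records arrive; omit that clause to refresh manually.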
Amazon Redshift takes backups automatically and offers several levels of fault tolerance within the service. Amazon Redshift can also be deployed across multiple Availability Zones, which keeps the workload available in the rare event of an Availability Zone failure.
Performance Efficiency
This Guidance uses several services to meet the workload requirements of various scaling, traffic, and data access patterns. Kinesis Data Streams supports near real-time data ingestion. AWS Lambda and Amazon Simple Notification Service (Amazon SNS) provide event-driven compute and messaging. Amazon Redshift is built for data warehousing and analytics. Amazon SageMaker is used to build, train, and deploy ML models, and Amazon Redshift ML supports predictive analytics. Together, these services provide scalable and efficient processing, analysis, and storage of data. Additionally, Amazon Redshift can scale automatically, so this Guidance can adjust resources dynamically to meet changing demand.
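As a rough illustration of how Amazon Redshift ML fits in, the following sketch generates a `CREATE MODEL` statement that trains a model from historical data and exposes it as a SQL function. The table, column, function, role, and bucket names are hypothetical and do not come from this Guidance.

```python
def create_model_sql(model, training_query, target, function,
                     iam_role_arn, s3_bucket):
    """Return a Redshift ML CREATE MODEL statement.

    Assumption: all names are illustrative placeholders. Values are
    interpolated directly, so pass only trusted identifiers.
    """
    return (
        f"CREATE MODEL {model}\n"
        f"FROM ({training_query})\n"
        f"TARGET {target}\n"
        f"FUNCTION {function}\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"SETTINGS (S3_BUCKET '{s3_bucket}');"
    )


model_sql = create_model_sql(
    model="fraud_detection_model",                    # hypothetical
    training_query=(
        "SELECT amount, merchant_category, is_fraud "
        "FROM historical_transactions"                # hypothetical table
    ),
    target="is_fraud",
    function="fn_is_fraud",
    iam_role_arn="arn:aws:iam::123456789012:role/redshift-ml",  # hypothetical
    s3_bucket="example-redshift-ml-bucket",           # hypothetical
)

# Once training completes, the function can score rows in ordinary SQL.
scoring_sql = (
    "SELECT tx_id, fn_is_fraud(amount, merchant_category) AS predicted_fraud\n"
    "FROM fraud_transactions_scoring;"                # hypothetical relation
)
print(model_sql)
print(scoring_sql)
```

Because the model is exposed as a SQL function, predictions can run inside Amazon Redshift next to the ingested data, without moving it to a separate inference endpoint.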
Cost Optimization
This Guidance primarily follows a serverless architecture, which automatically scales to match the demand and ensures only the required resources are used. Services like Lambda, Amazon API Gateway, and Amazon S3 provide the required infrastructure for the serverless architecture, and AWS Auto Scaling adjusts the capacity based on the workload. Additionally, services like CloudWatch help in optimizing the application and monitoring the resources.
Sustainability
This Guidance uses several AWS services to support data access and storage patterns. Kinesis Data Streams is used for near real-time data ingestion, while Lambda functions process the data and send notifications through Amazon SNS. Amazon Redshift is used for data warehousing and analytics, and SageMaker provides an environment for machine learning. Amazon Redshift ML runs predictive analytics on the data stored in Amazon Redshift, so models can be created where the data already resides.
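The notification step can be sketched as follows. This is a minimal example that assumes the Lambda function receives a list of scored transactions, each with a hypothetical `tx_id` and `fraud_score` field, and publishes one Amazon SNS message per transaction above a threshold. The event shape, field names, threshold, and topic ARN are assumptions for illustration only.

```python
import json


def flag_suspicious(scored_transactions, threshold=0.9):
    """Return the transactions whose fraud score meets the threshold."""
    return [tx for tx in scored_transactions if tx["fraud_score"] >= threshold]


def notify_operators(scored_transactions, publish, topic_arn, threshold=0.9):
    """Publish one notification per flagged transaction.

    `publish` is any callable that accepts TopicArn, Subject, and Message
    keyword arguments, for example the `publish` method of a boto3 SNS client.
    Returns the number of notifications sent.
    """
    flagged = flag_suspicious(scored_transactions, threshold)
    for tx in flagged:
        publish(
            TopicArn=topic_arn,
            Subject=f"Possible fraud: transaction {tx['tx_id']}",
            Message=json.dumps(tx, sort_keys=True),
        )
    return len(flagged)


# Local dry run with a stand-in for the SNS client, so no AWS call is made.
sent = []
count = notify_operators(
    [
        {"tx_id": "t-1", "fraud_score": 0.97},
        {"tx_id": "t-2", "fraud_score": 0.12},
    ],
    publish=lambda **kwargs: sent.append(kwargs),
    topic_arn="arn:aws:sns:us-east-1:123456789012:fraud-alerts",  # hypothetical
)
```

In a deployed Lambda function, `publish` could be `boto3.client("sns").publish`. Passing it in as an argument keeps the filtering logic testable without AWS credentials.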
Related Content
Near-real-time fraud detection using Amazon Redshift Streaming Ingestion with Amazon Kinesis Data Streams and Amazon Redshift ML
This post demonstrates how Amazon Redshift can deliver streaming ingestion and machine learning (ML) predictions all in one platform.