AWS Big Data Blog
Analyze Amazon SES events at scale using Amazon Redshift
Email is one of the most important methods for business communication across many organizations. It’s also one of the primary methods for many businesses to communicate with their customers. With the ever-increasing necessity to send emails at scale, monitoring and analysis have become a major challenge.
Amazon Simple Email Service (Amazon SES) is a cost-effective, flexible, and scalable email service that enables you to send and receive emails from your applications. You can use Amazon SES for several use cases, such as transactional, marketing, or mass email communications.
An important benefit of Amazon SES is its native integration with other AWS services, such as Amazon CloudWatch and Amazon Redshift, which allows you to monitor and analyze your email sending at scale seamlessly. You can store your email events in Amazon Redshift, which is a widely used, fast, and fully managed cloud data warehouse. You can then analyze these events using SQL to gain business insights such as marketing campaign success, email bounces, complaints, and so on.
In this post, you will learn how to implement an end-to-end solution to automate this email analysis and monitoring process.
The following architecture diagram highlights the end-to-end solution, which you can provision automatically with an AWS CloudFormation template.
In this solution, you publish Amazon SES email events to an Amazon Kinesis Data Firehose delivery stream that publishes data to Amazon Redshift. You then connect to the Amazon Redshift database and use a SQL query tool to analyze Amazon SES email events that meet the given criteria. We use the Amazon Redshift SUPER data type to store the event (JSON data) in Amazon Redshift. The SUPER data type handles semi-structured data, which can have varying table attributes and types.
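To illustrate why a semi-structured type suits these events, the following minimal Python sketch parses a few sample records locally and lists which top-level attributes appear for each event type. The records are hypothetical and heavily trimmed: the field names follow the Amazon SES event publishing format, but every value is made up, and the sketch does not touch Kinesis Data Firehose or Amazon Redshift.

```python
import json
from collections import Counter

# Hypothetical, trimmed SES event records, one JSON document per line.
# Real events carry many more fields; all values here are invented.
SAMPLE_EVENTS = """\
{"eventType": "Send", "mail": {"messageId": "m-1", "destination": ["a@example.com"]}, "send": {}}
{"eventType": "Bounce", "mail": {"messageId": "m-2", "destination": ["b@example.com"]}, "bounce": {"bounceType": "Permanent"}}
{"eventType": "Delivery", "mail": {"messageId": "m-1", "destination": ["a@example.com"]}, "delivery": {"processingTimeMillis": 412}}
"""


def summarize(lines):
    """Count events per type and collect the top-level keys seen for each type."""
    counts = Counter()
    keys_by_type = {}
    for line in lines.splitlines():
        if not line.strip():
            continue  # skip blank lines
        event = json.loads(line)
        event_type = event.get("eventType", "Unknown")
        counts[event_type] += 1
        keys_by_type.setdefault(event_type, set()).update(event.keys())
    return counts, keys_by_type


counts, keys_by_type = summarize(SAMPLE_EVENTS)
for event_type in sorted(counts):
    print(event_type, counts[event_type], sorted(keys_by_type[event_type]))
```

Each event type carries its own nested object (`send`, `bounce`, `delivery`), so a single fixed set of columns would leave most of them empty for any given row. Storing the whole document in one SUPER column avoids that.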
The alarm system uses Amazon CloudWatch logs that Kinesis Data Firehose generates when a data load to Amazon Redshift fails. We have set up a metric filter that pattern matches the CloudWatch log events to determine the error condition and triggers a CloudWatch alarm. This in turn sends out email notifications using Amazon Simple Notification Service (Amazon SNS).
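The alarm logic described above can be sketched locally as follows. This is an illustration of the idea only, not the CloudWatch implementation and not the configuration the template deploys: the filter pattern, the threshold, and the log messages are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class AlarmConfig:
    pattern: str    # substring to look for in each log message (hypothetical)
    threshold: int  # enter ALARM when matches in the period reach this count


def count_matches(log_messages, pattern):
    """Play the role of the metric filter: count messages that match the pattern."""
    return sum(1 for message in log_messages if pattern in message)


def alarm_state(log_messages, config):
    """Play the role of the alarm: compare the metric value with the threshold."""
    matches = count_matches(log_messages, config.pattern)
    return "ALARM" if matches >= config.threshold else "OK"


# Made-up log messages for one evaluation period.
logs = [
    "Delivered batch to destination",
    "ERROR: load into table failed",
    "Delivered batch to destination",
]
config = AlarmConfig(pattern="ERROR", threshold=1)
state = alarm_state(logs, config)
print(state)
```

In the deployed solution, the transition to the alarm state is what triggers the Amazon SNS email notification.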
As a prerequisite for deploying the solution in this post, you need to set up Amazon SES in your account. For more information, see Getting Started with Amazon Simple Email Service.
Solution resources and features
The architecture built by AWS CloudFormation supports AWS best practices for high availability and security. The CloudFormation template takes care of the following key resources and features:
- Amazon Redshift cluster – An Amazon Redshift cluster with encryption at rest enabled using an AWS Key Management Service (AWS KMS) customer managed key (CMK). This cluster acts as the destination for Kinesis Data Firehose and stores all the Amazon SES email sending events in the table ses, as shown in the following screenshot.
- Kinesis Data Firehose configuration – A Kinesis Data Firehose delivery stream that acts as the event destination for all Amazon SES email sending metrics. The delivery stream is set up with Amazon Redshift as the destination. Server-side encryption is enabled using an AWS KMS CMK, and destination error logging has been enabled as per best practices.
- Amazon SES configuration – A configuration set in Amazon SES that is used to map Kinesis Data Firehose as the event destination to publish email metrics.
To use the configuration set when sending emails, you can specify a default configuration set for your verified identity, or include a reference to the configuration set in the headers of the email.
- Exploring and analyzing the data – We use Amazon Redshift query editor v2 for exploring and analyzing the data.
- Alarms and notifications for ingestion failures – A data load error notification system using CloudWatch and Amazon SNS generates email-based notifications in the event of a failure during data load from Kinesis Data Firehose to Amazon Redshift. The setup creates a CloudWatch log metric filter, as shown in the following screenshot.
A CloudWatch alarm based on the metric filter triggers an SNS notification when in alarm state. For more information, see Using Amazon CloudWatch alarms.
Deploy the CloudFormation template
The provided CloudFormation template automatically creates all the required resources for this solution in your AWS account. For more information, see Getting started with AWS CloudFormation.
- Sign in to the AWS Management Console.
- Choose Launch Stack to launch AWS CloudFormation in your AWS account:
- For Stack name, enter a meaningful name for the stack, for example,
- Provide the following values for the stack parameters:
- ClusterName – The name of the Amazon Redshift cluster.
- DatabaseName – The name of the first database to be created when the Amazon Redshift cluster is created.
- DeliveryStreamName – The name of the Firehose delivery stream.
- MasterUsername – The user name that is associated with the primary user account for the Amazon Redshift cluster.
- NodeType – The type of node to be provisioned. (Default dc2.large)
- NotificationEmailId – The email notification list that is used to configure an SNS topic for sending CloudWatch alarm and event notifications.
- NumberofNodes – The number of compute nodes in the Amazon Redshift cluster. For multi-node clusters, the NumberofNodes parameter must be greater than 1.
- OnPremisesCIDR – IP range (CIDR notation) for your existing infrastructure to access the target and replica Amazon Redshift clusters.
- SESConfigSetName – Name of the Amazon SES configuration set.
- SubnetId – Subnet ID where source Amazon Redshift cluster is created.
- Vpc – VPC in which Amazon Redshift cluster is launched.
- Choose Next.
- Review all the information and select I acknowledge that AWS CloudFormation might create IAM resources.
- Choose Create stack.
You can track the progress of the stack creation on the Events tab. Wait for the stack to complete and show the status
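If you prefer to prepare the stack parameters in a script before launching, a minimal Python sketch might look like the following. The parameter names are the ones listed above, and the NodeType default comes from this post. Every other value is a hypothetical placeholder, and the template may define additional parameters that are not covered here. The sketch only checks and reshapes the values locally; it does not call AWS.

```python
import ipaddress

# Parameter names as listed in this post; the template may define more.
PARAMETER_NAMES = [
    "ClusterName", "DatabaseName", "DeliveryStreamName", "MasterUsername",
    "NodeType", "NotificationEmailId", "NumberofNodes", "OnPremisesCIDR",
    "SESConfigSetName", "SubnetId", "Vpc",
]


def check_parameters(values):
    """Return a list of problems found in the parameter values (empty if none)."""
    problems = []
    for name in PARAMETER_NAMES:
        if not str(values.get(name, "")).strip():
            problems.append(f"{name} is missing")
    cidr = values.get("OnPremisesCIDR")
    if cidr:
        try:
            ipaddress.ip_network(cidr)  # raises ValueError on malformed CIDR
        except ValueError:
            problems.append(f"OnPremisesCIDR is not valid CIDR notation: {cidr}")
    return problems


def to_cloudformation_parameters(values):
    """Reshape {name: value} into the ParameterKey/ParameterValue list form."""
    return [{"ParameterKey": k, "ParameterValue": str(v)} for k, v in values.items()]


# Hypothetical placeholder values; replace them with your own.
example = {
    "ClusterName": "ses-events-cluster",
    "DatabaseName": "sesdb",
    "DeliveryStreamName": "ses-events-stream",
    "MasterUsername": "adminuser",
    "NodeType": "dc2.large",
    "NotificationEmailId": "ops@example.com",
    "NumberofNodes": 2,
    "OnPremisesCIDR": "10.0.0.0/16",
    "SESConfigSetName": "ses-events-config-set",
    "SubnetId": "subnet-EXAMPLE",
    "Vpc": "vpc-EXAMPLE",
}

problems = check_parameters(example)
parameters = to_cloudformation_parameters(example)
print(problems or "parameters look complete")
```

The list produced by `to_cloudformation_parameters` has the `ParameterKey`/`ParameterValue` shape that the CloudFormation CreateStack API accepts, so it can be passed to an SDK or CLI call of your choice.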
Test the solution
To send a test email, we use the Amazon SES mailbox simulator. Set the configuration-set header to the one created by the CloudFormation template.
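As a rough sketch of what such a test message could look like, the following Python snippet builds an email addressed to the mailbox simulator and adds the `X-SES-CONFIGURATION-SET` header. The sender address and the configuration set name are hypothetical placeholders; use a verified identity and the name you passed as SESConfigSetName. The snippet only constructs the message and leaves sending (for example, through the Amazon SES SMTP interface or a raw-email API call) to you.

```python
from email.message import EmailMessage

# Header that tells Amazon SES which configuration set applies to the message.
CONFIG_SET_HEADER = "X-SES-CONFIGURATION-SET"


def build_test_message(sender, config_set_name,
                       recipient="success@simulator.amazonses.com"):
    """Build a test email that references a configuration set in its headers."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "SES event publishing test"
    message[CONFIG_SET_HEADER] = config_set_name
    message.set_content("Test message for the SES event pipeline.")
    return message


# Placeholder sender and configuration set name (assumptions for illustration).
msg = build_test_message("sender@example.com", "ses-events-config-set")
raw = msg.as_string()
print(raw)
```

Changing the recipient to another simulator address, such as `bounce@simulator.amazonses.com` or `complaint@simulator.amazonses.com`, lets you generate other event types for the table.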
We use the Amazon Redshift query editor v2 to query the Amazon Redshift table (created by the CloudFormation template) and see if the events have shown up.
If the data load of the event stream fails from Kinesis Data Firehose to Amazon Redshift, the failure notification system is triggered, and you receive an email notification via Amazon SNS.
Some of the AWS resources deployed by the CloudFormation stacks in this post incur a cost as long as you continue to use them.
You can delete the CloudFormation stack to delete all AWS resources created by the stack. To clean up all your stacks, use the AWS CloudFormation console to remove the stacks that you created in reverse order.
- On the Stacks page on the AWS CloudFormation console, choose the stack to delete.
- In the stack details pane, choose Delete.
- Choose Delete stack when prompted.
After stack deletion begins, you can’t stop it. The stack proceeds to the DELETE_IN_PROGRESS state. When the stack deletion is complete, the stack changes to the DELETE_COMPLETE state. The AWS CloudFormation console doesn’t display stacks in the DELETE_COMPLETE state by default. To display deleted stacks, you must change the stack view filter. For more information, see Viewing deleted stacks on the AWS CloudFormation console.
If the delete fails, the stack enters the DELETE_FAILED state. For solutions, see Delete stack fails.
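If you script the cleanup, the three deletion states above map naturally onto three actions. The following Python sketch shows one way to express that mapping. The function name and the returned labels are hypothetical, and obtaining the status (from the console, CLI, or an SDK) is left out.

```python
# Stack statuses named in this post.
DELETE_IN_PROGRESS = "DELETE_IN_PROGRESS"
DELETE_COMPLETE = "DELETE_COMPLETE"
DELETE_FAILED = "DELETE_FAILED"


def next_step(status):
    """Map a stack status to what a cleanup script should do next."""
    if status == DELETE_IN_PROGRESS:
        return "wait"         # deletion cannot be stopped; keep polling
    if status == DELETE_COMPLETE:
        return "done"         # stack is gone (hidden in the console by default)
    if status == DELETE_FAILED:
        return "investigate"  # see the stack events for the failing resource
    return "unknown"          # any status this sketch does not handle


for status in (DELETE_IN_PROGRESS, DELETE_COMPLETE, DELETE_FAILED):
    print(status, "->", next_step(status))
```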
In this post, we walked through the process of setting up Amazon SES and Amazon Redshift to deploy an email reporting service that can scale to support millions of events. We used Amazon Redshift to store semi-structured messages using the SUPER data type in database tables to support varying message sizes and formats. With this solution, you can easily run analytics at scale and analyze your email event data for deliverability-related issues such as bounces or complaints.
Use the CloudFormation template provided to speed up provisioning of the cloud resources required for the solution (Amazon SES, Kinesis Data Firehose, and Amazon Redshift) in your account while following security best practices. Then you can analyze Amazon SES events at scale using Amazon Redshift.
About the Authors
Manash Deb is a Software Development Engineer in the AWS Directory Service team. He has worked on building end-to-end applications in different databases and technologies for over 15 years. He loves learning new technologies and solving, automating, and simplifying customer problems on AWS.
Arnab Ghosh is a Solutions Architect for AWS in North America helping enterprise customers build resilient and cost-efficient architectures. He has over 13 years of experience in architecting, designing, and developing enterprise applications solving complex business problems.
Sanjoy Thanneer is a Sr. Technical Account Manager with AWS based out of New York. He has over 20 years of experience working in database and analytics domains. He is passionate about helping enterprise customers build scalable, resilient, and cost-efficient applications.
Justin Morris is an Email Deliverability Manager for the Simple Email Service team. With over 10 years of experience in the IT industry, he has developed a natural talent for diagnosing and resolving customer issues and continuously looks for growth opportunities to learn new technologies and services.