AWS Cloud Operations & Migrations Blog

Automating Alerts for AWS Global Network Performance

Have your applications hosted on AWS ever experienced inter-Region or inter-Availability Zone (AZ) latency, and did you want to be notified proactively when that latency changed? This blog post describes an automated mechanism to set up those alarms. AWS has introduced Infrastructure Performance, a capability in AWS Network Manager that helps you better understand the performance of the AWS Global Network. Using Infrastructure Performance, you can monitor the real-time inter-Region, inter-AZ, and intra-AZ latency of the AWS Global Network. For planning, you can view the 45-day historical trend of AWS network performance. You can view these network performance metrics in the AWS Management Console, monitor them using Amazon CloudWatch, and stream them to your own monitoring tools. Figure 1 below is an example of an inter-Region latency metric with an anomaly band.

Figure 1: Example Inter-Region latency metric with anomaly band

In this blog post, we automate the setup of CloudWatch subscriptions and alarms for inter-Region and inter-AZ latency. The blog post Monitoring AWS Global Network Performance describes Infrastructure Performance metrics in detail and explains how to monitor them using CloudWatch. This post automates those recommendations and lets you choose the Regions and AZs that you would like to monitor and be alerted on when latency changes. The latency metrics set up by this solution are based on anomaly detection with p50 values over a period of 5 minutes. Figure 2 below is an example email of an inter-Region latency alarm.
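To make the alarm definition concrete, here is a minimal sketch of the arguments that a CloudWatch anomaly detection alarm on a p50 statistic over a 5-minute period could use. It is not taken from the solution's templates. The namespace, metric name, alarm name, comparison operator, and band width are assumptions, and the dimensions are left to the caller, so check the metric in your CloudWatch console for the real values.

```python
def build_latency_alarm(source, destination, sns_topic_arn, dimensions,
                        namespace="AWS/EC2/InfrastructurePerformance",  # assumption
                        metric_name="aggregate-latency",                # assumption
                        band_width=2):                                  # assumption
    """Return keyword arguments for CloudWatch put_metric_alarm.

    `dimensions` is the list of {"Name": ..., "Value": ...} entries that
    identifies the latency metric; copy it from the metric in your account.
    """
    return {
        "AlarmName": f"latency-{source}-to-{destination}",  # hypothetical naming scheme
        # Alarm when latency leaves the expected band in either direction.
        "ComparisonOperator": "LessThanLowerOrGreaterThanUpperThreshold",
        "EvaluationPeriods": 1,
        "ThresholdMetricId": "band",
        "AlarmActions": [sns_topic_arn],
        "Metrics": [
            {
                "Id": "latency",
                "ReturnData": True,
                "MetricStat": {
                    "Metric": {
                        "Namespace": namespace,
                        "MetricName": metric_name,
                        "Dimensions": dimensions,
                    },
                    "Period": 300,   # 5 minutes, as described above
                    "Stat": "p50",   # median latency, as described above
                },
            },
            {
                "Id": "band",
                "Expression": f"ANOMALY_DETECTION_BAND(latency, {band_width})",
                "Label": "Expected latency band",
                "ReturnData": True,
            },
        ],
    }
```

With boto3 installed and credentials configured, the result could be passed to `boto3.client("cloudwatch").put_metric_alarm(**params)`.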

Figure 2: Example email of an inter-Region latency alarm

Solution Overview

The solution comes with two CloudFormation templates. The first template checks latency between AWS Regions. The second template checks latency between AWS Availability Zones within the same Region. Neither template includes the AWS GovCloud (US) or China Regions or their AZs. If you need to add new Regions to a template as they become available, open the YAML file and add them to the corresponding parameters section.

Note on Availability Zones – AWS maps the physical Availability Zones randomly to the Availability Zone names for each AWS account. This approach helps to distribute resources across the Availability Zones in an AWS Region, instead of resources likely being concentrated in Availability Zone “a” for each Region. Because of this, the same AZ name can refer to different physical locations in different accounts. Use AZ IDs to determine the AZ mapping for your account.
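One way to look up that mapping is the EC2 DescribeAvailabilityZones API, which returns both the `ZoneName` and the `ZoneId` for each AZ. The sketch below assumes boto3 is installed and credentials are configured. The helper names are ours and are not part of the solution.

```python
def az_name_to_id(describe_response):
    """Map AZ names to AZ IDs from a DescribeAvailabilityZones response."""
    return {
        zone["ZoneName"]: zone["ZoneId"]
        for zone in describe_response["AvailabilityZones"]
    }


def fetch_az_mapping(region):
    """Call EC2 in `region` and return this account's AZ name -> AZ ID mapping."""
    import boto3  # imported here so az_name_to_id works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return az_name_to_id(ec2.describe_availability_zones())
```

Calling `fetch_az_mapping("us-east-1")` returns a dictionary keyed by AZ name. The AZ IDs it returns are specific to your account's mapping.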

Deploying this solution

This solution and associated resources are available for you to deploy into your AWS account as AWS CloudFormation templates.

Prerequisites

For this walkthrough, you should have the following prerequisites:

What the CloudFormation template deploys

  • CloudWatchLatencyAlarm
  • CloudWatchLatencySubscription
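The templates themselves are not reproduced in this post. As a rough illustration of what a latency subscription involves, the sketch below builds the request for the EC2 EnableAwsNetworkPerformanceMetricSubscription API. That the CloudWatchLatencySubscription resource maps to this call is an assumption, and the helper names are hypothetical. The p50 statistic matches the description earlier in this post.

```python
def build_subscription(source, destination):
    """Return keyword arguments for enabling a latency metric subscription.

    `source` and `destination` are Region names or AZ IDs, depending on
    whether you want inter-Region or inter-AZ latency.
    """
    return {
        "Source": source,
        "Destination": destination,
        "Metric": "aggregate-latency",
        "Statistic": "p50",
    }


def enable_subscription(source, destination, region):
    """Enable the subscription through the EC2 API (requires boto3 and credentials)."""
    import boto3  # imported here so build_subscription works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.enable_aws_network_performance_metric_subscription(
        **build_subscription(source, destination)
    )
```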

How to deploy the CloudFormation template

For inter-Region latency alarms, the following steps show how to deploy the CloudFormation template:

  1. Download this yaml file.
  2. Navigate to the CloudFormation console in your AWS Account.
  3. Choose Create stack – With new resources (standard).
  4. Choose Template is ready, upload a template file, and navigate to the yaml file that you just downloaded.
  5. Choose Next.
  6. Give the stack a name (max. length 30 characters), and select Next.
  7. In the DestinationRegion drop-down, choose your destination Region.
  8. In the SNSTopic field, enter the ARN of the SNS topic that should receive alerts. The SNS topic must be in the same Region that you have selected in the console.
  9. In the SourceRegion drop-down, choose your source Region.
  10. Select Next.
  11. (Optional) Enter tags and select Next.
  12. Click Submit.
  13. Wait for the stack creation to complete.

Once you run this solution in your account, it sets up inter-Region latency alarms between the source and destination Regions that you selected. To monitor latency between additional Region pairs, deploy the CloudFormation template again and choose different Regions.
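If you deploy the template for several Region pairs, scripting the console steps can save time. The following sketch builds the arguments for the CloudFormation CreateStack API using the parameter names from the steps above (SourceRegion, DestinationRegion, SNSTopic) and enforces the 30-character stack name limit. The function name is hypothetical. If the template creates IAM resources, CreateStack also needs a `Capabilities` argument, which this sketch omits.

```python
MAX_STACK_NAME_LENGTH = 30  # limit stated in the deployment steps


def build_stack_request(stack_name, template_body, source_region,
                        destination_region, sns_topic_arn):
    """Return keyword arguments for CloudFormation create_stack."""
    if len(stack_name) > MAX_STACK_NAME_LENGTH:
        raise ValueError(
            f"Stack name {stack_name!r} is longer than "
            f"{MAX_STACK_NAME_LENGTH} characters"
        )
    parameters = {
        "SourceRegion": source_region,
        "DestinationRegion": destination_region,
        "SNSTopic": sns_topic_arn,
    }
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in parameters.items()
        ],
    }
```

With boto3 and credentials in place, you could read the downloaded YAML file into `template_body` and call `boto3.client("cloudformation").create_stack(**request)` once per Region pair, using a client in the same Region as the SNS topic.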

For inter-AZ latency alarms, the following steps show how to deploy the CloudFormation template:

  1. Download this yaml file.
  2. Navigate to the CloudFormation console in your AWS Account.
  3. Choose Create stack – With new resources (standard).
  4. Choose Template is ready, upload a template file, and navigate to the yaml file that you just downloaded.
  5. Choose Next.
  6. Give the stack a name (max. length 30 characters), and select Next.
  7. In the DestinationAZ drop-down, choose your destination AZ.
  8. In the SNSTopic field, enter the ARN of the SNS topic that should receive alerts. The SNS topic must be in the same Region that you have selected in the console.
  9. In the SourceAZ drop-down, choose your source AZ.
  10. Select Next.
  11. (Optional) Enter tags and select Next.
  12. Click Submit.
  13. Wait for the stack creation to complete.

Once you run this solution in your account, it sets up inter-AZ latency alarms between the source and destination AZs that you selected. The CloudFormation template fails to deploy if the source and destination AZs are in different Regions. To monitor latency between additional AZ pairs, deploy the CloudFormation template again and choose different AZs.
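Because a cross-Region AZ pair causes the deployment to fail, it can help to check the pair before you deploy. The sketch below is a simple pre-check that is not part of the solution. It assumes the values are either AZ IDs such as `use1-az1` or standard AZ names such as `us-east-1a`, that both values use the same form, and that no Local Zone names are involved.

```python
import re


def region_key(az):
    """Return the part of an AZ identifier that names its Region."""
    if "-az" in az:
        # AZ ID such as "use1-az1": the Region code precedes "-az".
        return az.split("-az", 1)[0]
    # AZ name such as "us-east-1a": drop the trailing zone letter.
    return re.sub(r"[a-z]+$", "", az)


def same_region(source_az, destination_az):
    """Return True when both AZ identifiers belong to the same Region."""
    return region_key(source_az) == region_key(destination_az)
```

For example, `same_region("use1-az1", "use1-az2")` is true, while `same_region("use1-az1", "usw2-az1")` is false.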

Costs

There is a cost associated with using this solution. You pay standard CloudWatch metrics and alarms charges for each inter-Region, inter-AZ, or intra-AZ pair metric that you publish to CloudWatch beyond the free tier.

Cleaning up

If you decide that you no longer want to keep the stack and associated resources, navigate to CloudFormation in the AWS Management Console, choose the stack by the name you gave it at deployment, and choose Delete. All of the stack's resources are deleted.

Should you want to deploy this stack again, simply follow the steps above to re-deploy.
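If you deployed one stack per Region or AZ pair, the console deletion described above can also be scripted. This is a minimal sketch that assumes you pass in a boto3 CloudFormation client and the stack names you chose at deployment. The function name is hypothetical.

```python
def delete_latency_stacks(cfn_client, stack_names):
    """Delete each stack, then wait until every deletion has completed."""
    names = list(stack_names)
    # Request all deletions first so that they can run in parallel.
    for name in names:
        cfn_client.delete_stack(StackName=name)
    # Block until CloudFormation reports each stack as deleted.
    waiter = cfn_client.get_waiter("stack_delete_complete")
    for name in names:
        waiter.wait(StackName=name)
    return names
```

A typical call would be `delete_latency_stacks(boto3.client("cloudformation"), ["my-latency-stack"])`, where the stack name is a placeholder for your own.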

Conclusion

You can use this solution to automate the setup of Infrastructure Performance metrics and alarms between the Regions and AZs of your choice. Amazon Simple Notification Service (Amazon SNS) can deliver alerts over a range of protocols, including email and SMS, and to chat platforms such as Slack and Amazon Chime. This helps you quickly understand whether the latency your application is experiencing is caused by AWS infrastructure performance.

About Authors

Austin Buettner

Austin Buettner is a Senior Technical Account Manager (TAM) at AWS. A technology enthusiast and former architect, he has over 15 years of experience in IT. Outside of delighting AWS customers, he enjoys golfing, spending time with family, and hitting the gym.

Karthik Chemudupati

Karthik Chemudupati is a Principal Technical Account Manager (TAM) with AWS, focused on helping customers achieve cost optimization and operational excellence. He has 20 years of IT experience in software engineering, cloud operations, and automation. Karthik joined AWS in 2016 as a TAM and has worked with more than a dozen enterprise customers across US-West. Outside of work, he enjoys spending time with his family.