AWS Cloud Operations & Migrations Blog
Create a metric math alarm using Amazon CloudWatch
In 2018 we launched metric math, which enables you to perform calculations across multiple metrics for real-time analysis. You can visualize these computed metrics in the Amazon CloudWatch console, add them to CloudWatch Dashboards, or retrieve them through the newly launched GetMetricData API. You can use metric math to derive insights from your existing CloudWatch metrics and better understand the operational health and performance of your infrastructure.
At AWS re:Invent 2018 we announced the ability to create an alarm for metric math expressions.
In this blog post, you’ll create an alarm for a metric math expression that calculates the AWS Lambda error rate. Suppose you want an alarm on AWS Lambda errors, but you want to tolerate a small number of errors without triggering it. You can use metric math to express the error rate as a percentage: divide the Errors metric by the Invocations metric and multiply by 100. You’ll then create an alarm on that expression and add the resulting time series to a graph on your CloudWatch dashboard. Expression: errors / invocations * 100.
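To make the arithmetic concrete, here is a minimal Python sketch of the same calculation for a single period. The function name and the choice to return None when there are no invocations are assumptions of this example, not CloudWatch behavior.

```python
from typing import Optional


def error_rate_percent(errors: float, invocations: float) -> Optional[float]:
    """Return errors as a percentage of invocations for one period.

    Returns None when there were no invocations, because the ratio is
    undefined in that case (an assumption of this sketch).
    """
    if invocations == 0:
        return None
    return 100 * errors / invocations


# Example: 3 errors out of 200 invocations in one 5-minute period.
rate = error_rate_percent(3, 200)
print(rate)  # 1.5
```

With a threshold of, say, 5 percent, this period would not breach, even though the raw Errors count is above zero.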
Other use cases include the following:
1. Total billing from different AWS services such as Amazon EC2, CloudWatch and Amazon DynamoDB. Expression: e1 = m1 + m2 + m3.
2. The percentage of unhealthy Elastic Load Balancing (ELB) hosts using HealthyHostCount and UnHealthyHostCount metrics. Expression: unhealthy / (healthy + unhealthy) * 100.
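The expressions above are applied to each data point of the input time series. The following Python sketch imitates that for the two use cases. The Series type, the helper names, and the rule of keeping only timestamps present in every input are assumptions made for illustration.

```python
from typing import Dict

# Hypothetical representation of a time series: timestamp label -> value.
Series = Dict[str, float]


def add_series(*series: Series) -> Series:
    """Sum several series point by point (e1 = m1 + m2 + m3).

    Only timestamps present in every input are kept (sketch assumption).
    """
    common = set(series[0]).intersection(*series[1:])
    return {ts: sum(s[ts] for s in series) for ts in sorted(common)}


def unhealthy_percent(healthy: Series, unhealthy: Series) -> Series:
    """Compute unhealthy / (healthy + unhealthy) * 100 for each timestamp."""
    total = add_series(healthy, unhealthy)
    return {ts: 100 * unhealthy[ts] / total[ts] for ts in total if total[ts] > 0}


healthy_hosts = {"12:00": 3, "12:05": 4}
unhealthy_hosts = {"12:00": 1, "12:05": 0}
result = unhealthy_percent(healthy_hosts, unhealthy_hosts)
print(result)  # {'12:00': 25.0, '12:05': 0.0}
```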
You’ll use Amazon CloudWatch for this blog post. If you qualify, the service is within the AWS Free Tier.
Step 1. Create a CloudWatch alarm
1. Open the Amazon CloudWatch console. In the navigation pane, choose Alarms, Create Alarm.
2. Choose Select Metric.
3. Choose the Lambda service namespace.
4. Choose By Resource.
5. When the list of metrics is displayed, select the check boxes for the Errors and Invocations (requests) metrics.
6. To add another metric to use in the math expression, under All metrics, choose All, find the specific metric, and then select the check box next to it. You can add up to 10 metrics.
7. Choose Graphed metrics.
8. For each metric added, do the following:
a. Under Statistic, choose one of the statistics or predefined percentiles, or specify a custom percentile. In this case, choose Sum.
b. Under Period, choose the evaluation period for the alarm. In this case, choose 5 Minutes. Note that all metrics must have the same period. When the alarm is evaluated, each period is aggregated into one data point.
9. Choose Add a math expression.
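For readers who script their alarms, the console choices above map onto metric query structures of the kind accepted by the CloudWatch PutMetricAlarm and GetMetricData APIs. The following is a sketch only: the helper name, the Ids, and the function name my-function are hypothetical, and it assumes the FunctionName dimension alone identifies the metric (metrics listed under By Resource may carry an additional Resource dimension).

```python
from typing import Any, Dict


def lambda_metric_query(query_id: str, metric_name: str, function_name: str,
                        period_seconds: int = 300, stat: str = "Sum") -> Dict[str, Any]:
    """Build one metric query for an AWS/Lambda metric.

    ReturnData is False because these metrics only feed a later expression.
    """
    return {
        "Id": query_id,
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/Lambda",
                "MetricName": metric_name,
                # Assumption: a single FunctionName dimension.
                "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
            },
            "Period": period_seconds,  # 5 minutes, the same for every metric
            "Stat": stat,
        },
        "ReturnData": False,
    }


metric_queries = [
    lambda_metric_query("errors", "Errors", "my-function"),
    lambda_metric_query("invocations", "Invocations", "my-function"),
]
```

Both queries share the same period and statistic, mirroring the requirement in step 8.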
Step 2. Create a metric math expression
After you choose Add a math expression, a new row appears for the expression. In the Details field, type the expression that calculates the percentage of errors against the total number of requests. For example, if the Errors metric has Id m1 and the Invocations metric has Id m2, the expression is m1 / m2 * 100. For more information, see Metric Math Syntax and Functions.
1. To use a metric or the result of another expression as part of the formula for this expression, use the value shown in the Id column. You can change the value of Id. It can include numbers, letters, and underscores, and must start with a lowercase letter. Changing the value of Id to a more meaningful name can also make the alarm graph easier to understand.
2. When you have the expression to use for the alarm, clear the check boxes to the left of every other expression and every metric on the page. Only the check box next to your error rate expression should be selected. The expression you choose for the alarm must produce a single time series, and show only one line on the graph.
3. Then choose Select metric.
4. Choose a name and description for the alarm. The name must contain only ASCII characters.
5. For Whenever, specify the alarm condition.
a. For is:, specify whether the expression result must be greater than, less than, or equal to the threshold, and specify the threshold value.
b. For for:, specify how many evaluation periods (data points) must be in the ALARM state to trigger the alarm. Initially, you can change only the second value, and the first value changes to match your entry. This creates an alarm that goes to the ALARM state if that many consecutive periods are breaching.
To create an M out of N alarm, choose the pencil icon. You can then change the M number to be different than the N number. For more information, see Evaluating an Alarm.
6. Under Additional settings, for Treat missing data as, choose how to have the alarm behave when some data points are missing. For more information, see Configuring How CloudWatch Alarms Treat Missing Data.
7. Under Actions, select the type of action for the alarm to perform when it is triggered. To have the alarm perform multiple actions, choose +Notification or +AutoScaling Action. Specify at least one action.
8. Finally, choose Create Alarm to save your alarm.
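The same alarm can be described as a set of parameters for the CloudWatch PutMetricAlarm API. The sketch below only builds and checks the parameters; it does not call AWS. The function name, alarm name, 5 percent threshold, 2 out of 3 setting, notBreaching choice, and SNS topic ARN are all hypothetical values chosen for illustration.

```python
import re
from typing import Any, Dict

# Id rule from step 1: letters, numbers, underscores; starts with a lowercase letter.
ID_PATTERN = re.compile(r"^[a-z][A-Za-z0-9_]*$")


def build_error_rate_alarm(function_name: str, topic_arn: str,
                           threshold_percent: float = 5.0,
                           datapoints_to_alarm: int = 2,
                           evaluation_periods: int = 3) -> Dict[str, Any]:
    """Build PutMetricAlarm-style parameters for a Lambda error rate alarm."""
    if datapoints_to_alarm > evaluation_periods:
        raise ValueError("M (datapoints to alarm) cannot exceed N (evaluation periods)")

    def metric(query_id: str, metric_name: str) -> Dict[str, Any]:
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
                },
                "Period": 300,
                "Stat": "Sum",
            },
            "ReturnData": False,  # input only, like a cleared check box
        }

    queries = [
        # The single expression the alarm watches (the only selected row).
        {"Id": "error_rate", "Expression": "100 * errors / invocations",
         "Label": "Lambda error rate (%)", "ReturnData": True},
        metric("errors", "Errors"),
        metric("invocations", "Invocations"),
    ]
    for query in queries:
        if not ID_PATTERN.match(query["Id"]):
            raise ValueError(f"invalid Id: {query['Id']}")

    alarm_name = f"{function_name}-error-rate"
    if not alarm_name.isascii():
        raise ValueError("alarm name must contain only ASCII characters")

    return {
        "AlarmName": alarm_name,
        "AlarmDescription": "Lambda error rate above threshold",
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": threshold_percent,
        "EvaluationPeriods": evaluation_periods,    # N
        "DatapointsToAlarm": datapoints_to_alarm,   # M
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],
        "Metrics": queries,
    }


params = build_error_rate_alarm(
    "my-function", "arn:aws:sns:us-east-1:123456789012:ops-alerts")
# With boto3 installed and credentials configured, one could then call:
# boto3.client("cloudwatch").put_metric_alarm(**params)
```

Exactly one query has ReturnData set to True, which corresponds to leaving only the error rate expression selected in step 2.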
Step 3. Create a CloudWatch Dashboard
1. After the metric math alarm has been created, select the alarm check box and choose Add to Dashboard.
2. Select a dashboard, the widget type and the widget title. Choose Add to dashboard.
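If you manage dashboards as code, a widget that displays an alarm can be described in the dashboard body JSON by referencing the alarm's ARN in the widget annotations. The following sketch builds such a body as a string; the helper name, widget size, title, region, and ARN are hypothetical, and the resulting layout may differ from what the console's Add to Dashboard action generates.

```python
import json
from typing import Any, Dict


def alarm_widget(alarm_arn: str, region: str, title: str) -> Dict[str, Any]:
    """Build a metric widget that displays a single alarm by its ARN."""
    return {
        "type": "metric",
        "width": 12,
        "height": 6,
        "properties": {
            "title": title,
            "region": region,
            # A widget that shows an alarm references it here instead of
            # listing metrics directly.
            "annotations": {"alarms": [alarm_arn]},
        },
    }


# Hypothetical ARN for the alarm created in step 2.
arn = "arn:aws:cloudwatch:us-east-1:123456789012:alarm:my-function-error-rate"
dashboard_body = json.dumps(
    {"widgets": [alarm_widget(arn, "us-east-1", "Lambda error rate")]})
# With boto3 installed and credentials configured, one could then call:
# boto3.client("cloudwatch").put_dashboard(
#     DashboardName="lambda-health", DashboardBody=dashboard_body)
```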
Congratulations! You have successfully created an alarm to watch a metric math expression. To learn more about how metric math alarms are priced, visit the CloudWatch pricing site.
You can use AWS CloudFormation templates to create and modify metric math alarms consistently across your AWS resources and applications. Recently triggered alarms on metric math expressions are instantly added to CloudWatch Automatic Dashboards, accelerating root cause analysis.
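As a rough illustration of the CloudFormation route, the following Python sketch generates a JSON template containing one AWS::CloudWatch::Alarm resource with a metric math expression. The logical ID LambdaErrorRateAlarm, the function name, and the threshold and evaluation settings are hypothetical; adapt them to your own stack.

```python
import json
from typing import Any, Dict


def error_rate_alarm_template(function_name: str,
                              threshold_percent: float = 5.0) -> Dict[str, Any]:
    """Build a CloudFormation template with one metric math alarm."""

    def metric(query_id: str, metric_name: str) -> Dict[str, Any]:
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
                },
                "Period": 300,
                "Stat": "Sum",
            },
            "ReturnData": False,
        }

    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "LambdaErrorRateAlarm": {  # hypothetical logical ID
                "Type": "AWS::CloudWatch::Alarm",
                "Properties": {
                    "AlarmDescription": "Lambda error rate above threshold",
                    "ComparisonOperator": "GreaterThanThreshold",
                    "Threshold": threshold_percent,
                    "EvaluationPeriods": 3,
                    "DatapointsToAlarm": 2,
                    "TreatMissingData": "notBreaching",
                    "Metrics": [
                        {"Id": "error_rate",
                         "Expression": "100 * errors / invocations",
                         "Label": "Lambda error rate (%)",
                         "ReturnData": True},
                        metric("errors", "Errors"),
                        metric("invocations", "Invocations"),
                    ],
                },
            }
        },
    }


template_json = json.dumps(error_rate_alarm_template("my-function"), indent=2)
print(template_json)
```

Generating the template from a function makes it straightforward to stamp out the same alarm for several Lambda functions.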
About the Author
Javier Martin is a Senior Product Manager for Amazon CloudWatch. Javier loves building products in AWS that help customers monitor their systems and applications.