My Amazon CloudWatch alarm isn't triggered even though I can see from my CloudWatch graphs that the alarm metric exceeds the configured threshold. How can I be sure that my CloudWatch alarms are triggered and the alarm actions are performed?

CloudWatch alarms that measure time-aggregated metrics (such as five-minute averages) evaluate those metrics continuously over a rolling window. If any data point within the evaluation window doesn't exceed the configured threshold, the CloudWatch alarm isn't triggered.

CloudWatch alarms trigger actions only when the alarm changes state and the new state is maintained for the specified number of periods. For more information, see Creating CloudWatch Alarms.

Important: There is an exception to this behavior for CloudWatch alarms that are associated with Amazon EC2 Auto Scaling actions. Such an alarm keeps triggering Auto Scaling actions for as long as it remains in the specified state, even if the state doesn't change.

Be sure to consider the mechanism used by CloudWatch to measure time-aggregated metrics when you create alarms.

Also consider lowering the alarm threshold to be sure that the alarm works as you expect.
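The period, the number of evaluation periods, and the threshold are the settings that determine this behavior. As a rough sketch, the alarm used in the troubleshooting example in this article could be described with the parameters below and passed to the boto3 `put_metric_alarm` call. The alarm name, instance ID, and action ARN are hypothetical placeholders, and the `AWS/EC2` `CPUUtilization` metric is an assumption, because the article doesn't say which resource the alarm monitors.

```python
def build_alarm_params(instance_id: str, action_arn: str) -> dict:
    """Build put_metric_alarm parameters for the example alarm.

    The alarm name, instance ID, and action ARN are placeholders.
    """
    return {
        "AlarmName": "cpu-above-45-percent",  # hypothetical name
        "Namespace": "AWS/EC2",               # assumed metric namespace
        "MetricName": "CPUUtilization",       # assumed metric
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                        # five-minute periods
        "EvaluationPeriods": 3,               # three consecutive periods
        "Threshold": 45.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [action_arn],
    }


def create_alarm(params: dict) -> None:
    """Create or update the alarm. Requires boto3 and AWS credentials."""
    import boto3  # imported here so the parameter builder works without boto3

    boto3.client("cloudwatch").put_metric_alarm(**params)


# Placeholder identifiers, for illustration only.
params = build_alarm_params(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:123456789012:example-topic",
)
print(params["Period"], params["EvaluationPeriods"], params["Threshold"])
```

Keeping the parameters in a plain dictionary makes it easy to review or adjust the threshold before calling `create_alarm(params)`.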

Troubleshooting example

In this example, you have an alarm based on average CPU utilization. The alarm is configured to trigger when the average exceeds 45% for three consecutive five-minute periods (three evaluation periods with a period of 300 seconds each). The metric reports the following time-aggregated data points:

  • 05:25:00: data: {Avg=61.123}
  • 05:30:00: data: {Avg=57.847}
  • 05:35:00: data: {Avg=60.503}
  • 05:40:00: data: {Avg=55.473}
  • 05:45:00: data: {Avg=41.685}
  • 05:50:00: data: {Avg=58.390}
  • 05:55:00: data: {Avg=57.846}
  • 06:00:00: data: {Avg=61.123}

These data points result in the following alarm states:

  • 05:35 ALARM
  • 05:40 ALARM
  • 05:45 ALARM to OK
  • 05:50 OK
  • 05:55 OK
  • 06:00 OK to ALARM
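The state sequence above can be reproduced with a small simulation. This is a simplified model, not CloudWatch's actual implementation: it assumes that the alarm is in ALARM only when every data point in the three-period window exceeds the threshold, and it ignores missing data and other alarm settings. The function name `evaluate_states` is made up for illustration.

```python
THRESHOLD = 45.0        # alarm threshold from the example (> 45%)
EVALUATION_PERIODS = 3  # consecutive periods that must exceed the threshold

# (timestamp, five-minute average) pairs copied from the example
DATAPOINTS = [
    ("05:25", 61.123),
    ("05:30", 57.847),
    ("05:35", 60.503),
    ("05:40", 55.473),
    ("05:45", 41.685),
    ("05:50", 58.390),
    ("05:55", 57.846),
    ("06:00", 61.123),
]


def evaluate_states(datapoints, threshold, evaluation_periods):
    """Return (timestamp, state) for every point that has a full window."""
    states = []
    for i in range(evaluation_periods - 1, len(datapoints)):
        # Rolling window: the current point plus the preceding ones.
        window = datapoints[i - evaluation_periods + 1 : i + 1]
        breaching = all(value > threshold for _, value in window)
        states.append((datapoints[i][0], "ALARM" if breaching else "OK"))
    return states


STATES = evaluate_states(DATAPOINTS, THRESHOLD, EVALUATION_PERIODS)

previous = None
for timestamp, state in STATES:
    # Mark the evaluations where the state differs from the previous one.
    change = f" (changed from {previous})" if previous not in (None, state) else ""
    print(f"{timestamp} {state}{change}")
    previous = state
```

In this model, the single low data point at 05:45 keeps the alarm in OK for three evaluations (05:45, 05:50, and 05:55), because it stays inside the window for that long.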

The data point collected at 05:55 exceeds the Average CPU Utilization threshold of 45%. However, the alarm remains in the OK state and doesn't trigger the action at 05:55. This happens because the data point collected at 05:45:00, which doesn't exceed the threshold, is still inside the three-period evaluation window at 05:55. Five minutes later, that data point drops out of the window, the alarm state changes from OK to ALARM at 06:00, and the alarm triggers the action.

For the following time-aggregated metrics, the alarm state is ALARM from 05:35 onward because all the data points exceed the Average CPU Utilization threshold of 45%. Because there are no state changes after that point, the alarm action isn't triggered again.

  • 05:25:00: data: {Avg=61.123}
  • 05:30:00: data: {Avg=57.847}
  • 05:35:00: data: {Avg=60.503}
  • 05:40:00: data: {Avg=55.473}
  • 05:45:00: data: {Avg=45.075}
  • 05:50:00: data: {Avg=58.390}
  • 05:55:00: data: {Avg=57.847}
  • 06:00:00: data: {Avg=61.123}
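To check this second data set, the sketch below lists the state changes that occur once the first full window has been evaluated. It uses the same simplified model (all three data points in the window must exceed the threshold), and the helper names `alarm_states` and `state_changes` are made up for illustration.

```python
THRESHOLD = 45.0
EVALUATION_PERIODS = 3

# (timestamp, five-minute average) pairs copied from the second data set
DATAPOINTS = [
    ("05:25", 61.123),
    ("05:30", 57.847),
    ("05:35", 60.503),
    ("05:40", 55.473),
    ("05:45", 45.075),
    ("05:50", 58.390),
    ("05:55", 57.847),
    ("06:00", 61.123),
]


def alarm_states(datapoints, threshold, evaluation_periods):
    """Return (timestamp, state) for every point that has a full window."""
    states = []
    for i in range(evaluation_periods - 1, len(datapoints)):
        window = datapoints[i - evaluation_periods + 1 : i + 1]
        breaching = all(value > threshold for _, value in window)
        states.append((datapoints[i][0], "ALARM" if breaching else "OK"))
    return states


def state_changes(states):
    """Return (timestamp, old_state, new_state) for each state change."""
    changes = []
    for (_, old), (timestamp, new) in zip(states, states[1:]):
        if old != new:
            changes.append((timestamp, old, new))
    return changes


STATES = alarm_states(DATAPOINTS, THRESHOLD, EVALUATION_PERIODS)
CHANGES = state_changes(STATES)

print(STATES)
print(CHANGES)  # empty list: the state never changes after 05:35
```

The value 45.075 at 05:45 is only slightly above the threshold, but it is enough to keep every window in the ALARM state, so the list of state changes is empty.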

Published: 2018-10-31