Amazon CloudWatch Events and metrics for AWS Backup
Customers who use AWS Backup frequently ask, “How do I know if my backup job has failed?” or “How can I be proactively notified of a change to my backup vault settings?” With the recent integration of CloudWatch Events for AWS Backup, we can now deliver a real-time stream of events that describe changes to your AWS Backup resources. Together with Amazon CloudWatch metrics, these events let you monitor and log AWS Backup activity to help you support your regulatory compliance reporting obligations and meet your business continuity SLAs.
This post walks you through how to set up monitoring of events and alarms using a combination of Amazon CloudWatch, Amazon EventBridge, and AWS Backup. I also discuss:
- The value of effectively gathering event logs for reporting and auditing purposes.
- Triggering alarms for state change events like the manual deletion of backups by a compromised user.
- Setting up metrics for monitoring.
Whenever a state change happens within AWS Backup, you can now capture the corresponding event in CloudWatch or EventBridge. These events describe state changes to your backup vaults, backup plans, backup jobs, copy jobs, restore jobs, and recovery points. This lets you monitor these resources more closely and, more importantly, take the requisite action on them to improve your operational response.
For instance, when a backup job fails, you can send an event to an Amazon SNS topic that triggers an email to notify the backup administrator. Additional configurations let you monitor these state changes without having to watch the AWS Backup console. These enhancements contribute to your event-driven computing framework and enable you to react more quickly to state changes. For example, you can trigger an AWS Lambda function if a cross-region copy fails, or send a notification to another application when a backup completes. With this functionality, you can build alarms on top of these notifications to be more responsive and proactive with your backup planning and management.
Enabling CloudWatch for AWS Backup tutorial
This section provides guidance on how to get started using CloudWatch and EventBridge monitoring of AWS Backup.
When you access the AWS Backup console, in the upper left corner you will see a menu icon consisting of three parallel horizontal lines. Open up the menu and select CloudWatch on the left rail to open the Amazon CloudWatch console.
While in the Amazon CloudWatch console, configure a rule to consume the new state change events coming from AWS Backup.
On the left rail of the CloudWatch console, under the Events section, select Rules. Then select the blue Create rule button.
Next, under the Event Source section, in the Service Name dropdown, select Backup. Then, in the Event Type dropdown, choose one of the predefined selections. In this example, I chose Backup Job State Change.
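The console selections above translate into an event pattern on the rule. As a rough sketch, the pattern for this example can be written in Python as shown here. To my understanding, AWS Backup events carry the source `aws.backup` and a detail type that matches the Event Type dropdown. The `matches_pattern` helper is hypothetical and only illustrates the idea of top-level matching; it does not reproduce the full CloudWatch Events matching rules.

```python
import json

# Pattern equivalent to choosing Service Name "Backup" and
# Event Type "Backup Job State Change" in the console.
BACKUP_JOB_PATTERN = {
    "source": ["aws.backup"],
    "detail-type": ["Backup Job State Change"],
}


def matches_pattern(event: dict, pattern: dict) -> bool:
    """Simplified, illustrative matcher: every top-level key in the pattern
    must be present in the event with one of the listed values."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())


if __name__ == "__main__":
    sample_event = {
        "source": "aws.backup",
        "detail-type": "Backup Job State Change",
        "detail": {},  # real events carry job details here
    }
    print(json.dumps(BACKUP_JOB_PATTERN, indent=2))
    print(matches_pattern(sample_event, BACKUP_JOB_PATTERN))
```

Printing the pattern as JSON gives you text that you can compare against the event pattern preview shown in the console.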
On the right side of the same console screen, under Targets, select the Add target button to choose what to invoke when an event matches your event pattern.
The targets dropdown includes choices such as Lambda function, SNS topic, and SQS queue. Select your target to complete the rule. In this example, I selected my “CODETEST…” Lambda.
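If you pick a Lambda function as the target, the function receives the event as a dictionary. The following is a minimal sketch of what such a handler might look like; it is not the code behind the function used in this example. The `state` key inside `detail` is an assumption, so inspect a real event in your logs to confirm the field names before relying on them.

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    """Log the incoming AWS Backup event and return a short summary."""
    detail_type = event.get("detail-type", "unknown")
    detail = event.get("detail", {})
    # "state" is an assumed field name, used here for illustration only.
    state = detail.get("state", "unknown")
    # Anything logged here ends up in the function's CloudWatch log group.
    logger.info("Received %s: %s", detail_type, json.dumps(detail))
    return {"detailType": detail_type, "state": state}
```

Logging the full `detail` payload keeps the handler useful even if the field names differ from the assumption above.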
To view the output of these event logs, under the Logs section on the left rail menu, select Log groups.
Select the log item you want to review and expand for additional detail. The following screenshot is an example of a “Backup Plan State Change” log event.
Details can include the resource ARN of the backup plan, the “State”: “Modified” or “modifiedAt” = “timestamp” of the modification, in addition to other useful details that you can use to trigger additional events.
In the following example, when I select Log Groups and search for “Recovery Point State Change,” CloudWatch filters for all of my recovery point state change events. The details of this state change notification highlight that this recovery point has a status of DELETED, what time the deletion occurred, and that the deletion is a MANUAL_DELETE, which means that a user performed the action instead of normal lifecycle management processes. This can be highly useful if you are trying to monitor backup vaults for unintentional or malicious actions performed by users.
A manually deleted backup is something that you may want to monitor and keep track of in case of accidental deletion or malicious intent, and now you can. Additionally, these capabilities are now available using Amazon EventBridge (as discussed later in this post).
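To act on manual deletions automatically, a small check like the following could run inside a Lambda target. This is a sketch under assumptions: it relies only on the DELETED and MANUAL_DELETE values described above and scans all values in `detail`, because I do not want to guess the exact field names. The keys in the sample event (`status`, `deletionType`) are hypothetical.

```python
def is_manual_deletion(event: dict) -> bool:
    """Return True if a Recovery Point State Change event reports a
    recovery point that was deleted manually by a user."""
    if event.get("detail-type") != "Recovery Point State Change":
        return False
    # Scan the values rather than depend on specific field names.
    values = {str(value) for value in event.get("detail", {}).values()}
    return "DELETED" in values and "MANUAL_DELETE" in values


if __name__ == "__main__":
    sample_event = {
        "source": "aws.backup",
        "detail-type": "Recovery Point State Change",
        # Hypothetical field names, for illustration only.
        "detail": {"status": "DELETED", "deletionType": "MANUAL_DELETE"},
    }
    print(is_manual_deletion(sample_event))
```

Once you have confirmed the real field names in your own log events, matching on those fields directly in the rule's event pattern would be a tighter filter than scanning values in code.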
How to configure Amazon CloudWatch metrics for AWS Backup
Next, let us review the setup and configuration of Amazon CloudWatch metrics. These metrics are a time-ordered set of data points published when a service has certain state changes, such as when a backup job successfully completes.
In the Amazon CloudWatch console, on the left rail in the menu, select Metrics and you will see the All metrics tab. Under AWS Namespaces select Backup, which leads you to a list of the metric dimensions that are available to create a graph.
Next, select the By Resource Type dimension. This dimension lets you see the monitored resources in the following graph.
In this example, I am measuring the NumberOfCopyJobsCompleted metric over time for Amazon EBS.
There are multiple dimensions for each metric. You can filter the number of backup jobs or copy jobs created by backup vault name or resource type, or both. These will be the same metric but aggregated over different dimensions.
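The same metric and dimension choices can be expressed programmatically. The sketch below only builds the request parameters, so it runs without AWS access. It assumes the `AWS/Backup` namespace and the dimension names `ResourceType` and `BackupVaultName`; verify both against the dimensions listed in your console. `build_metric_query` is a hypothetical helper, not part of any AWS SDK.

```python
from datetime import datetime, timedelta, timezone


def build_metric_query(metric_name, resource_type=None, vault_name=None,
                       hours=24, now=None):
    """Build keyword arguments for a CloudWatch GetMetricStatistics call."""
    dimensions = []
    if resource_type:
        dimensions.append({"Name": "ResourceType", "Value": resource_type})
    if vault_name:
        dimensions.append({"Name": "BackupVaultName", "Value": vault_name})
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Backup",      # assumed namespace for AWS Backup
        "MetricName": metric_name,
        "Dimensions": dimensions,
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 3600,                 # one data point per hour
        "Statistics": ["Sum"],
    }


if __name__ == "__main__":
    query = build_metric_query("NumberOfCopyJobsCompleted", resource_type="EBS")
    print(query)
    # With boto3 installed and credentials configured, this could be sent as:
    #   boto3.client("cloudwatch").get_metric_statistics(**query)
```

Passing only `resource_type`, only `vault_name`, or both mirrors the way the console aggregates the same metric over different dimensions.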
Once configured, you can set up metrics to measure a number of different event types, such as the number of backup and restore jobs currently running or completed. This makes a great addition to any dashboards you are monitoring in your network operations center. You can also create an alarm for each of these metrics.
Notifications via CloudWatch alarms
Next, let us review how to receive proactive notifications via CloudWatch alarms. In the CloudWatch console, on the left rail in the menu, select Metrics and you will see the All metrics tab by default.
Select the Graphed metrics tab, and then select the bell icon on the right under Actions for any metric that you want to create an alarm for.
This selection launches the wizard to create a CloudWatch alarm.
For Step 1 in the wizard, specify the metric and set the condition under which you want the alarm to trigger. In this example, I accept the default entries in the Metric section.
Next, set the condition for the trigger of the alarm. Select threshold type Static, and configure the trigger for whenever the number of copy jobs completed is lower than one, as shown in the following screenshot:
In Step 2, configure your actions. To send a notification, select an Amazon SNS topic.
You can select a default topic or define a new SNS topic. To define a new topic, select Create new topic, fill in the topic name, and fill in the email alias of the team members that must receive the notification.
When the alarm triggers, Amazon SNS publishes a notification for the event that you are monitoring and sends an email to the team specified in the email alias.
In Step 3, name your alarm and give a description.
In Step 4, you can preview your selections and create your alarm. Select the Create alarm button in the bottom right.
With this selection in place, whenever the number of completed copy jobs drops below one, the alarm triggers and a notification is sent. CloudWatch makes it simple to create and set up alarms and notifications to alert you to events, such as in the scenario shown above.
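For reference, the alarm built in the wizard can be approximated in code. The following sketch builds the parameters for a CloudWatch PutMetricAlarm request and mirrors the static "lower than one" condition. The period, the `ResourceType` dimension, the `TreatMissingData` choice, and the helper names are assumptions for illustration; the SNS topic ARN in the usage example is a placeholder.

```python
def build_alarm_request(alarm_name, sns_topic_arn, resource_type="EBS"):
    """Build keyword arguments for a CloudWatch PutMetricAlarm call."""
    return {
        "AlarmName": alarm_name,
        "AlarmDescription": "No AWS Backup copy jobs completed in the period",
        "Namespace": "AWS/Backup",
        "MetricName": "NumberOfCopyJobsCompleted",
        "Dimensions": [{"Name": "ResourceType", "Value": resource_type}],
        "Statistic": "Sum",
        "Period": 3600,                  # assumed one-hour evaluation window
        "EvaluationPeriods": 1,
        "Threshold": 1.0,                # static threshold from the wizard
        "ComparisonOperator": "LessThanThreshold",
        "TreatMissingData": "breaching",  # assumption: no data means no copies
        "AlarmActions": [sns_topic_arn],
    }


def breaches_threshold(completed_copy_jobs, threshold=1.0):
    """Mirror the alarm condition: fewer completed copy jobs than threshold."""
    return completed_copy_jobs < threshold


if __name__ == "__main__":
    # Placeholder ARN; replace with the topic chosen in Step 2.
    request = build_alarm_request(
        "copy-jobs-below-one",
        "arn:aws:sns:us-east-1:123456789012:example-topic",
    )
    print(request)
    # With boto3 installed and credentials configured, this could be sent as:
    #   boto3.client("cloudwatch").put_metric_alarm(**request)
```

One design choice worth checking in your own account is how the alarm should treat periods with no data points, since a period with no completed copy jobs may publish nothing at all rather than a zero.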
Configure AWS Backup events to send to Amazon EventBridge
Amazon EventBridge is a central serverless event bus for AWS services that routes application data to targets. To send events to Amazon EventBridge, go to the Amazon EventBridge console and select the Create rule button.
In the first step, define your rule name in the Rules section of the console. I used ‘cwtest’ for this example.
Next, select the Event pattern radio button and then the Pre-defined by service button. Select AWS for the Service provider and then select Backup for the Service name.
I selected Backup Vault State Change from the list in this example so that I can detect when something changes in the Backup Vault.
Next, select your event bus for this rule. In this example, I kept the default selection, AWS default event bus, and toggled Enable the rule on the selected event bus to enable it.
Afterward, select your targets. This could be a Lambda function, an SNS topic, or any number of additional options available in the drop-down menu. In this example, I chose SNS topic, and for the topic, selected the Default_CloudWatch_Alarms_Topic.
When you are done, select Create.
You have now created a new rule, and the events that you configured will be sent to the target in the rule.
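The rule created in the console corresponds roughly to two EventBridge API requests: one that defines the rule and one that attaches the target. The sketch below only builds the request parameters, so it runs without AWS access. The helper names, the target `Id`, and the account ID in the topic ARN are hypothetical; the rule name and topic name come from this example.

```python
import json


def build_rule_request(rule_name, event_bus="default"):
    """Build keyword arguments for an EventBridge PutRule call."""
    pattern = {
        "source": ["aws.backup"],
        "detail-type": ["Backup Vault State Change"],
    }
    return {
        "Name": rule_name,
        "EventBusName": event_bus,
        "EventPattern": json.dumps(pattern),  # the API expects a JSON string
        "State": "ENABLED",
    }


def build_targets_request(rule_name, topic_arn, event_bus="default"):
    """Build keyword arguments for an EventBridge PutTargets call."""
    return {
        "Rule": rule_name,
        "EventBusName": event_bus,
        # "Id" is an arbitrary label chosen for this sketch.
        "Targets": [{"Id": "backup-vault-sns", "Arn": topic_arn}],
    }


if __name__ == "__main__":
    rule = build_rule_request("cwtest")
    targets = build_targets_request(
        "cwtest",
        # Placeholder account ID and Region.
        "arn:aws:sns:us-east-1:123456789012:Default_CloudWatch_Alarms_Topic",
    )
    print(rule)
    print(targets)
    # With boto3 installed and credentials configured:
    #   events = boto3.client("events")
    #   events.put_rule(**rule)
    #   events.put_targets(**targets)
```

Note that an SNS topic used as a target also needs a resource policy that allows EventBridge to publish to it; the console typically sets this up for you, while a scripted setup has to add it explicitly.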
If you have finished following along, be sure to delete any example resources you no longer need in order to avoid incurring unintended CloudWatch or EventBridge charges moving forward.
In this post, I demonstrated how to get started with CloudWatch and EventBridge integrations for AWS Backup. I covered how to configure rules in CloudWatch that send AWS Backup events to a target, and how to filter and find relevant events within CloudWatch Logs. Next, I covered configuring CloudWatch metrics for AWS Backup and how to configure alarms. Finally, I demonstrated how to configure these events to be sent to Amazon EventBridge for the same purposes.
I hope this walkthrough gives you a better understanding of the many tools at your disposal. The new integrations enable you to easily capture logs and events created by state changes related to your backups. You can monitor these events from custom dashboards, or trigger alarms to send notifications for the events that are important to your workflows and daily operations. These new capabilities will help to better secure and monitor your backup jobs and alert you when problems occur.
Thanks for reading this post. If you have any feedback or questions, please leave them in the comments section.