AWS Backup anomaly detection for Amazon EBS volumes
Protecting your data from cyberattacks and ransomware is a critical responsibility, and taking the necessary steps to detect anomalous activity at every level within your organization can help you keep your data as safe as possible. Data storage is an important area where you can and should deploy anomaly detection.
To protect your storage, in this post I build a simple serverless pipeline to detect anomalies occurring on Amazon Elastic Block Store (EBS) volumes. I use AWS Backup along with several other AWS managed services to build the solution.
At the start of the solution is AWS Backup. When an Amazon EBS volume is backed up, a snapshot is created. By comparing this current snapshot to a previously created snapshot, the number of changed blocks between the two can easily be determined. This changed-blocks value is published to Amazon CloudWatch as a custom metric, where an Amazon CloudWatch alarm is configured to detect anomalies on the metric. By using the powerful built-in machine learning capabilities of Amazon CloudWatch, anomalies are detected and surfaced when the alarm’s threshold band is breached. When this happens, any pre-configured alarm notifications are triggered.
To build the anomaly-detection pipeline, I use the following services:
- AWS Backup: We’ll use AWS Backup to manage backups of Amazon EBS volumes. For this post, we’ll create a backup plan which selects EBS volumes based on a tag key you provide during setup. Once a backup starts, a snapshot is created and events from the AWS Backup process are published to Amazon EventBridge.
- Amazon EventBridge: Amazon EventBridge is a serverless event bus, and it receives incoming events published from AWS Backup. For backup events that match a preconfigured rule that we create, EventBridge will trigger an AWS Lambda function.
- AWS Lambda: All AWS Backup events that match the Amazon EventBridge rule are passed to an AWS Lambda function. From the event, the function extracts the Amazon EBS volume Amazon Resource Name (ARN) and the current snapshot name. Using the ARN as a key, it retrieves the previous backup snapshot details from an Amazon DynamoDB table. If a previous snapshot item is found for the given ARN, the current and previous snapshots are compared for changes, and the calculated change value is passed along to Amazon CloudWatch as a custom metric. If no previous snapshot item is found in the Amazon DynamoDB table, a new item is created, and an Amazon CloudWatch alarm is configured for the ARN.
- Amazon DynamoDB: For each backup event arriving at the AWS Lambda function, an Amazon DynamoDB table is used to persist details about the event. The ARN from the event is used as the partition key, and the snapshot name is attached as an attribute to the item.
- Amazon CloudWatch: A custom metric is created in the AWS Lambda function and sent to Amazon CloudWatch. The metric value is the number of changed blocks between the current snapshot and the previous one. A CloudWatch alarm, created by the AWS Lambda function on the volume’s first backup, is triggered when the anomaly-detection threshold band is breached.
- Amazon Simple Notification Service (Amazon SNS): When a CloudWatch alarm is triggered, an Amazon SNS topic delivers a notification to the email address configured during setup.
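The DynamoDB bookkeeping described in the list above can be sketched as follows. This is a minimal illustration, not the code shipped in anomaly-detection-lambda.zip: the table name and attribute names are hypothetical, and the `dynamodb` argument is assumed to be a boto3 low-level DynamoDB client (for example `boto3.client("dynamodb")`).

```python
from typing import Any, Optional

# Hypothetical names; the deployed table and its attributes may differ.
TABLE_NAME = "AnomalyDetectionSnapshots"
KEY_ATTR = "ResourceArn"      # partition key: the EBS volume ARN
SNAPSHOT_ATTR = "SnapshotId"  # attribute holding the latest snapshot


def get_previous_snapshot(dynamodb: Any, volume_arn: str) -> Optional[str]:
    """Return the snapshot stored for this volume, or None on its first backup."""
    response = dynamodb.get_item(
        TableName=TABLE_NAME,
        Key={KEY_ATTR: {"S": volume_arn}},
    )
    item = response.get("Item")  # "Item" is absent when no row matches the key
    return item[SNAPSHOT_ATTR]["S"] if item else None


def record_snapshot(dynamodb: Any, volume_arn: str, snapshot_id: str) -> None:
    """Store the latest snapshot so the next backup can be compared against it."""
    dynamodb.put_item(
        TableName=TABLE_NAME,
        Item={KEY_ATTR: {"S": volume_arn}, SNAPSHOT_ATTR: {"S": snapshot_id}},
    )
```

Because the ARN is the partition key, each volume has at most one item, and writing the current snapshot overwrites the previous one.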
The following diagram illustrates the architecture of the solution:
Step 1: Download and save the AWS CloudFormation template to your local computer.
Step 2: Download and save the anomaly-detection-lambda.zip Lambda code to your local computer.
Step 3: After you’ve logged into the AWS Management Console, create an Amazon S3 bucket with a name you choose. You’ll need to create this S3 bucket in the same region you intend to deploy the CloudFormation template. After you create the bucket, upload the “anomaly-detection-lambda.zip” file to the newly created bucket. This screenshot shows an Amazon S3 bucket named “ebs-anomaly-detection-bucket” with the zipped Lambda code uploaded to it.
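If you prefer to script Step 3 rather than use the console, the following sketch shows one way to do it. It assumes `s3` is a boto3 S3 client created for the target region (for example `boto3.client("s3", region_name=region)`); the helper names are hypothetical. One detail worth knowing is that `create_bucket` takes a `LocationConstraint` for every region except us-east-1, where the configuration must be omitted.

```python
from typing import Any, Dict


def create_bucket_kwargs(bucket: str, region: str) -> Dict[str, Any]:
    """Build create_bucket arguments; us-east-1 must not set a LocationConstraint."""
    kwargs: Dict[str, Any] = {"Bucket": bucket}
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def stage_lambda_code(s3: Any, bucket: str, region: str,
                      zip_path: str = "anomaly-detection-lambda.zip") -> str:
    """Create the bucket, upload the zipped Lambda code, and return the object key."""
    s3.create_bucket(**create_bucket_kwargs(bucket, region))
    key = zip_path.split("/")[-1]  # use the file name as the object key
    s3.upload_file(zip_path, bucket, key)
    return key
```

The returned key is the value you would later supply for the BackupAnomalyDetectionS3Code parameter.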
Step 4: From the AWS CloudFormation dashboard in the AWS Management Console, create a new CloudFormation stack. Choose Upload a template file as the Template source, select Choose file, and pick “anomaly-detection-cfn.json”, the file you downloaded in Step 1. Select Next.
Step 5: For the CloudFormation stack details, you’ll be asked to provide four parameters during deployment:
- AnomalyDetectionEmail: This email address will receive Amazon SNS notifications and will be used to configure the Amazon SNS topic. During the AWS CloudFormation deployment, you’ll receive an email at this address, with subject “AWS Notification – Subscription Confirmation”, asking you to confirm your subscription by clicking a link within the email.
- AnomalyDetectionTagKey: This tag key will identify Amazon EBS volumes to include for anomaly detection. Any EBS volume with this tag key defined (tag value ignored) will be selected by the AWS Backup Plan created during the AWS CloudFormation setup.
- BackupAnomalyDetectionS3Bucket: This is the name of the S3 bucket you created in Step 3. As a reminder, this S3 bucket needs to be in the same region where you’re currently deploying this CloudFormation template.
- BackupAnomalyDetectionS3Code: This is the name of the ZIP file you uploaded to the S3 bucket in Step 3 and contains the Lambda code.
Click the Next button on this screen and also on the following screen (configure stack options).
Step 6: Scroll to the bottom of the review screen, check the I acknowledge that AWS CloudFormation might create IAM resources box, and then click the Create stack button to create the CloudFormation stack.
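Steps 4 through 6 can also be scripted. The sketch below only builds the request; the stack name and the helper function are hypothetical, while the four parameter keys are the ones listed in Step 5. It assumes the result is passed to a boto3 CloudFormation client's `create_stack`, and that `CAPABILITY_IAM` is the programmatic counterpart of the IAM acknowledgment checkbox (a template that names its IAM resources would need `CAPABILITY_NAMED_IAM` instead).

```python
from typing import Any, Dict


def build_stack_request(stack_name: str, template_body: str, email: str,
                        tag_key: str, bucket: str, code_key: str) -> Dict[str, Any]:
    """Assemble keyword arguments for CloudFormation create_stack."""
    parameters = {
        "AnomalyDetectionEmail": email,
        "AnomalyDetectionTagKey": tag_key,
        "BackupAnomalyDetectionS3Bucket": bucket,
        "BackupAnomalyDetectionS3Code": code_key,
    }
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in parameters.items()
        ],
        # Acknowledges that the template may create IAM resources.
        "Capabilities": ["CAPABILITY_IAM"],
    }


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   with open("anomaly-detection-cfn.json") as f:
#       request = build_stack_request("ebs-anomaly-detection", f.read(),
#                                     "you@example.com", "AnomalyDetection",
#                                     "ebs-anomaly-detection-bucket",
#                                     "anomaly-detection-lambda.zip")
#   boto3.client("cloudformation").create_stack(**request)
```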
Once you’ve finished launching the stack, everything will be deployed to your AWS account, and then you’ll be ready to take a look around.
After you’ve logged into the AWS Management Console, navigate to the AWS Backup dashboard to look at the resources created during the AWS CloudFormation deployment. In the left navigation pane, click on the Backup vaults link. You should see something similar to what’s shown in this figure, with a newly created vault called AnomalyDetectionAnomalyDetectionBackupPlanVault.
This backup vault is a container that stores and organizes backups associated with the Amazon EBS volumes that match the tag key you entered for the AnomalyDetectionTagKey AWS CloudFormation parameter.
Next, select the Backup plans link in the left navigation pane to access the backup plan created during the deployment.
Select the AnomalyDetectionBackupPlan link under “Backup plan name” to access detailed information.
The default CloudFormation template creates daily and monthly backup rules with 1-year retention. If these defaults don’t match your requirements, you can delete the rules and add new ones.
The Selection resource assignment selects EBS resources matching the tag key parameter you entered during deployment. Like backup rules, if the default resource assignment doesn’t fit your needs, you can delete and assign a new one.
As referenced in the following screenshot, go to the Amazon EventBridge dashboard and select the Rules link in the left navigation pane. The rule created during the AWS CloudFormation deployment will be found under the default event bus and will be named something similar to what you see here.
Select the rule to see the Event pattern defined for it. This event pattern is used to match incoming events published from AWS Backup. In this pattern, you’ll notice that all completed Amazon EBS backups will be matched.
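To make the matching behavior concrete, here is a rough sketch of what such a pattern could look like, written as a Python dictionary, together with a deliberately simplified matcher. The exact pattern in the deployed rule may differ; the field names used here (`state`, `resourceType`) are my assumption about the AWS Backup job event format, and the `matches` function is a toy that handles only exact-value lists, not the full EventBridge pattern syntax.

```python
from typing import Any, Dict

# Assumed pattern: completed backup jobs for EBS resources.
EVENT_PATTERN: Dict[str, Any] = {
    "source": ["aws.backup"],
    "detail-type": ["Backup Job State Change"],
    "detail": {
        "state": ["COMPLETED"],
        "resourceType": ["EBS"],
    },
}


def matches(pattern: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Toy matcher: every pattern field must be present with one of the listed values."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True
```

Fields that the pattern does not mention, such as the resource ARN, are ignored by the match and remain available to the Lambda function.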
Select the Targets tab to see the associated AWS Lambda function that’s triggered when the rule matches an incoming event.
Selecting the Target Name link directs you to the AWS Lambda function.
While the name of your AWS Lambda function will be slightly different, you’ll see the same configuration identifying Amazon EventBridge as the trigger for invoking the function. Explore the code to see how snapshots are compared and how integrations work with both Amazon DynamoDB and Amazon CloudWatch.
The path through the AWS Lambda function is to first check to see if there has already been a backup for the Amazon EBS volume. If this is the first backup, an Amazon CloudWatch alarm, with its name based off the EBS volume ARN, is created. The alarm is created with anomaly detection activated, and a threshold breach is triggered from an Amazon CloudWatch custom metric, also created within this same AWS Lambda function.
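An anomaly-detection alarm of this kind can be expressed through the CloudWatch `put_metric_alarm` API by pairing the metric with an `ANOMALY_DETECTION_BAND` expression. The sketch below builds the request only. The namespace, metric name, dimension name, alarm-name format, period, and band width are all illustrative assumptions rather than the values used by the deployed function, and the upper-bound comparison is chosen because the post is concerned with too many changed blocks.

```python
from typing import Any, Dict

# Hypothetical metric identity; the deployed Lambda function may use other names.
NAMESPACE = "BackupAnomalyDetection"
METRIC_NAME = "ChangedBlocks"
DIMENSION_NAME = "VolumeArn"


def build_anomaly_alarm(volume_arn: str, sns_topic_arn: str,
                        period_seconds: int = 86400,
                        band_width: int = 2) -> Dict[str, Any]:
    """Assemble put_metric_alarm arguments for an anomaly-detection alarm."""
    return {
        "AlarmName": f"ebs-anomaly-{volume_arn}",  # name derived from the ARN
        "ComparisonOperator": "GreaterThanUpperThreshold",
        "EvaluationPeriods": 1,
        "ThresholdMetricId": "band",  # compare the metric against the band
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
        "Metrics": [
            {
                "Id": "blocks",
                "ReturnData": True,
                "MetricStat": {
                    "Metric": {
                        "Namespace": NAMESPACE,
                        "MetricName": METRIC_NAME,
                        "Dimensions": [{"Name": DIMENSION_NAME, "Value": volume_arn}],
                    },
                    "Period": period_seconds,
                    "Stat": "Maximum",
                },
            },
            {
                "Id": "band",
                "Expression": f"ANOMALY_DETECTION_BAND(blocks, {band_width})",
                "Label": "Expected changed blocks",
                "ReturnData": True,
            },
        ],
    }


# Usage (assumes boto3): boto3.client("cloudwatch").put_metric_alarm(**build_anomaly_alarm(arn, topic))
```

A larger band width makes the alarm more tolerant of variation; a smaller one makes it more sensitive.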
If this isn’t the EBS volume’s first backup, the current snapshot is compared against the previous snapshot. To compare them, the EBS direct API ListChangedBlocks is invoked to calculate the total number of blocks that changed between the two snapshots. That total is then published as the value of the Amazon CloudWatch custom metric. Again, this metric is what triggers the Amazon CloudWatch alarm that was created upon the EBS volume’s first backup.
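The comparison step might look roughly like the following. This is a sketch under stated assumptions, not the function's actual code: `ebs` and `cloudwatch` are assumed to be boto3 clients (`boto3.client("ebs")` and `boto3.client("cloudwatch")`), and the namespace, metric name, and dimension name are hypothetical. ListChangedBlocks returns results in pages, so the count has to follow `NextToken` until it is exhausted.

```python
from typing import Any

# Hypothetical metric identity; the deployed Lambda function may use other names.
NAMESPACE = "BackupAnomalyDetection"
METRIC_NAME = "ChangedBlocks"
DIMENSION_NAME = "VolumeArn"


def count_changed_blocks(ebs: Any, previous_snapshot_id: str,
                         current_snapshot_id: str) -> int:
    """Count the blocks that differ between two snapshots, across all pages."""
    kwargs = {
        "FirstSnapshotId": previous_snapshot_id,
        "SecondSnapshotId": current_snapshot_id,
    }
    total = 0
    while True:
        page = ebs.list_changed_blocks(**kwargs)
        total += len(page.get("ChangedBlocks", []))
        token = page.get("NextToken")
        if not token:  # no further pages
            return total
        kwargs["NextToken"] = token


def publish_changed_blocks(cloudwatch: Any, volume_arn: str, changed_blocks: int) -> None:
    """Send the changed-block count to CloudWatch as a custom metric data point."""
    cloudwatch.put_metric_data(
        Namespace=NAMESPACE,
        MetricData=[{
            "MetricName": METRIC_NAME,
            "Dimensions": [{"Name": DIMENSION_NAME, "Value": volume_arn}],
            "Value": float(changed_blocks),
            "Unit": "Count",
        }],
    )
```

For large volumes with many changes, the number of pages (and therefore API calls) grows with the amount of changed data, which is worth keeping in mind when sizing the Lambda timeout.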
Time to see the solution in action.
You’ll need to create some backups for an Amazon EBS volume. If you don’t mind waiting, the AWS Backup plan will kick off the first daily backup tomorrow. But if you want to see the pipeline in action right now, navigate to the AWS Backup dashboard and, within the On-demand backup section, select the Create on-demand backup button.
At this point, you’ll be presented with a screen allowing you to create a backup.
For the Resource type, select EBS. For the Volume ID, select an existing EBS volume to back up. For the purpose of this walkthrough, selecting a smaller volume will result in a quicker backup time. If you don’t have an existing EBS volume, the simplest way to create one is to launch an Amazon Elastic Compute Cloud (Amazon EC2) instance. Once the instance is launched, its Volume ID will show up in the list after you refresh the screen.
After selecting these two items, click the Create on-demand backup button at the bottom of the screen to start the backup. The backup will start, and you’ll be notified of its status. Initially, the backup will be listed as Created and then will eventually move to Completed. You’ll need to refresh the backup jobs screen to get the updated status. Once in the Completed state, AWS Backup will publish an event to Amazon EventBridge, and the pipeline will be off and running!
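The same on-demand backup can be started and watched from a script instead of refreshing the console. The sketch below assumes `backup` is a boto3 AWS Backup client (`boto3.client("backup")`); the function name, polling interval, and the IAM role you pass in are your own choices and are not defined by this solution.

```python
import time
from typing import Any, Callable

# Job states after which no further change is expected.
TERMINAL_STATES = {"COMPLETED", "FAILED", "ABORTED", "EXPIRED", "PARTIAL"}


def run_on_demand_backup(backup: Any, vault_name: str, volume_arn: str,
                         role_arn: str, poll_seconds: float = 30.0,
                         sleep: Callable[[float], None] = time.sleep) -> str:
    """Start a backup job for the volume and poll until it reaches a final state."""
    job_id = backup.start_backup_job(
        BackupVaultName=vault_name,
        ResourceArn=volume_arn,
        IamRoleArn=role_arn,
    )["BackupJobId"]
    while True:
        state = backup.describe_backup_job(BackupJobId=job_id)["State"]
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)  # wait before checking again
```

A return value of `COMPLETED` corresponds to the point at which the completion event is published and the rest of the pipeline begins.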
Now that you’ve started your first backup, it’s time to take a look in Amazon CloudWatch to see what’s happening. Navigate to the Amazon CloudWatch dashboard and select the All alarms link in the left navigation pane. It might take some time for the AWS Backup event to travel completely through the pipeline, but soon you’ll see a new alarm with a name similar to what’s shown here.
The alarm was created by the AWS Lambda function we deployed, and it’s configured to detect anomalies based on the Amazon CloudWatch custom metric that will be sent during subsequent backups.
If you click on the newly created alarm link, you’ll be taken to a screen which shows the alarm’s details.
At this point, we don’t have any metric data, so the alarm displays insufficient data. In fact, for the Amazon CloudWatch anomaly detection model to be properly trained, we’ll need to go through quite a few backup iterations before a pattern can be established.
Jumping ahead on those additional backups, here’s what you can expect to see once the model has been adequately trained.
Checking now, a substantial number of backups have taken place, and you’ll notice a gray band stretching across the display. This is the anomaly-detection band. When a comparison of the current and previous snapshots yields too many changed blocks, the metric leaves the gray band and the alarm is triggered.
Finally, let’s take a look at the actual Amazon CloudWatch custom metric. If you click on the View in metrics button in the upper-right section of the graph, you’ll be directed to the metrics page, where you can see all metric data points and also the metric and its anomaly-detection configuration.
After you’re finished, delete the resources you’ve created to avoid future charges. This can be easily done by going to AWS CloudFormation within your AWS Management Console and deleting the stack that you originally deployed.
Detecting anomalies within your data is an important way to stay ahead of cyberattacks and ransomware. In this blog post, I walked through building a simple serverless anomaly-detection pipeline using native services in AWS to identify anomalies within Amazon EBS volumes during backup. By using the built-in machine learning capabilities of Amazon CloudWatch, anomalous activity is surfaced, and you’re alerted when the number of changed blocks between snapshots breaches the threshold band. This allows you to keep an ever-watchful eye on your important data.
To learn more about the technologies used to create this solution, explore the following pages: