Cost allocation and tracking for AWS centralized backups

Business growth often brings more data-management operations and higher costs, as enterprises scale their data backup solutions to meet organizational requirements. Backup costs are a significant part of overall data-management costs, so backup managers often need granular information about the components of their backup bill, such as the backup spend of each organizational department, for tracking, insights, and optimization.

AWS customers use AWS Backup to centralize and automate backups of their workloads for security, compliance, and resilience purposes. As customers scale up their businesses, they can easily scale up their use of AWS Backup, including quickly understanding their usage and tracking costs across various parts of their organization. With AWS centralized backups and the observer solution for AWS Backup, customers can administer, orchestrate, and monitor their backup operations at an organization or enterprise level.

In this blog post, we showcase five common cost-analysis scenarios:

How to find the costs of a backup plan for an organization.
How to analyze the costs of a central backup account.
How to obtain costs for backup plans per department or summarize costs per backup plan per department within an organization.
How to find costs and allocate them to the right department in a single tenancy environment where there are no tags that provide cost center information.
How to find the organizational cost of a vault in a central account.

The approaches in this blog post heavily rely on the AWS Cost and Usage Reports (AWS CUR) as the primary mechanism of cost analysis. The information provided in this blog post can help you gain a better understanding of your backup costs and how to track and allocate them, facilitating processes like internal IT charge-back and showback.

Prerequisites

For the approaches in this blog post to work as described, the following prerequisites must be in place:

1. Cost and Usage Reports

AWS Cost and Usage Reports tracks your AWS usage and provides estimated charges associated with your account. Each report contains line items for each unique combination of AWS products, usage type, and operation that you use in your AWS account. You can customize the AWS Cost and Usage Reports to aggregate the information either by the hour, day, or month. You can use Cost and Usage Reports to publish your AWS billing reports to an Amazon Simple Storage Service (Amazon S3) bucket that you own. You can find out more about the AWS CUR in the Cost and Usage Report User Guide.

The approaches in this blog use the AWS CUR as the source of truth for tracking and allocating AWS Backup costs across the organization. The approaches require the CUR data to be exported to an S3 bucket in the Parquet format. The export options are set to aggregate data by the hour and include resource IDs (that is, the ARN of the resource generating the cost and usage). This enables us to track costs in detail. We crawl this data using AWS Glue and make it available for Amazon Athena to query. You can find out more on how to set this up in the AWS Well-Architected Lab Level 200: Cost and Usage Analysis.

Note: Ensure you set up the export in the Parquet format and enable Include Resource IDs. Costs apply for the underlying resources used to store the CUR data in the S3 bucket, and for AWS Glue and Amazon Athena based on usage. The AWS Glue crawler set up as part of the lab referenced above is scheduled to run regularly, as the AWS Billing service exports the CUR data to the S3 bucket. This keeps the CUR data and its schema up to date.

2. Backup observer solution for AWS Backup

The backup observer solution for AWS Backup is a set of automation templates and dashboards that customers can deploy in their environment to automatically obtain daily, aggregated, cross-account, and multi-Region reports for AWS Backup usage. You can find out more about the observer solution for AWS Backup on the AWS Storage Blog post Obtain aggregated daily cross-account multi-Region AWS Backup reporting.

The solution provides a set of aggregated daily job reports that are cross-account and multi-Region based. These reports are stored in a central S3 bucket, enabling customers to access historical backup reports as required. We use this dataset to analyze the use of AWS Backup across the organization. The main focus of this blog post is to join this dataset with the CUR dataset, thereby providing customers with the ability to track AWS Backup costs from the account from which it is reported to the source AWS workload resource for which the backups were created in a multi-copy multi-account scenario.

3. (Optional) Amazon QuickSight setup

Amazon QuickSight provides customers with the ability to create visualizations and dashboards. We use QuickSight to create dashboards from the analysis we perform over the Cost and Usage Reports, as well as the backup observer solution datasets. Both these solutions also provide useful out-of-the-box analysis with their own QuickSight dashboards. These QuickSight dashboards are optional and can be deployed if customers require visualizations of the analyses. To do this, customers would have to onboard and set up their Amazon QuickSight environment. You can find out more at the Amazon QuickSight User Guide.

Assumptions

The blog assumes engineers, technicians, and operators adopting the approaches and how-tos described here are familiar with Amazon Athena, AWS Glue, and Amazon QuickSight at a Foundational (100) or Intermediate (200) level. This blog also assumes familiarity with other AWS services such as Amazon S3, AWS Cost and Usage Reports, AWS Backup, and the backup observer solution.

Cost tracking and allocation scenarios

In this section, we go through the five scenarios mentioned in the introduction for cost tracking and insights.

Scenario 1: How much does the backup plan cost the organization?

The primary way to track costs across any large AWS environment is through the use of resource tags. Resource tags allow customers to assign metadata to AWS resources. Each tag is a label consisting of a user-defined key and value. Tags can help you manage, identify, organize, search for, and filter resources. You can create tags to categorize resources by purpose, owner, environment, or other criteria.

Tags are also used for cost tracking. Many customers already have cost tags like “CostCenter: XYU123” assigned to their AWS resources for cost-tracking purposes. We use a similar approach.

In an environment with organization-wide utilization of AWS Backup, the primary cost question asked is, “How much does backup plan X cost the organization?” To answer this question, we create a tag with the name “CreatedFromAWSBackupPlan” with value of the backup plan name. This tag is created as a property of the backup plan; thus, all backup plans within an organization can create the same tag “CreatedFromAWSBackupPlan” but with the unique name of each individual backup plan as the tag value. This allows us to query and filter the CUR dataset using the plan name.

The following steps in the backup plan creation workflow show where to set up a tag that is added every time the backup plan creates a recovery point.

Step 1: Navigate to the AWS Backup console.

Step 2: Select Backup plans from the left navigation pane.

Step 3: Select Create Backup plan on the right-hand side.

Step 4: Fill in the form with details like plan name, rule configuration, and so on.

Step 4a: Under the Backup rule configuration section on this screen, expand the Tags added to recovery points section and add the CreatedFromAWSBackupPlan tag with the backup plan name as its value.

Step 5: Complete the backup plan creation workflow.
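
The console steps above can also be expressed as a plan definition. The following is a minimal sketch, assuming the plan is created through an SDK call such as boto3's create_backup_plan, which accepts a BackupPlan dictionary whose rules can carry RecoveryPointTags. The rule name, vault name, and schedule are hypothetical placeholders; only the tag key and the Premium plan name come from this post. The sketch only builds and prints the dictionary, so it runs without AWS credentials.

```python
import json

TAG_KEY = "CreatedFromAWSBackupPlan"  # tag key described in Scenario 1


def build_backup_plan(plan_name, vault_name, schedule):
    """Return a BackupPlan definition whose recovery points are tagged with the plan name."""
    return {
        "BackupPlanName": plan_name,
        "Rules": [
            {
                "RuleName": f"{plan_name}-rule",  # hypothetical rule name
                "TargetBackupVaultName": vault_name,
                "ScheduleExpression": schedule,
                # Added to every recovery point that this rule creates.
                "RecoveryPointTags": {TAG_KEY: plan_name},
            }
        ],
    }


# Hypothetical vault name and schedule, for illustration only.
plan = build_backup_plan("Premium", "Default", "cron(0 5 ? * * *)")
print(json.dumps(plan, indent=2))
```

Because the tag value is derived from the plan name, every plan built this way uses the same tag key with its own name as the value, which is what the queries later in this post filter on.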

Because we have enabled resource IDs in the CUR export, the CUR dataset will contain charges for each individual resource for AWS Backup. This means charges from each recovery point will be reported. To enable slicing and filtering this data based on the CreatedFromAWSBackupPlan tag, we will need to mark this tag as a cost allocation tag in the AWS Billing dashboard. This will ensure that the tag value is reported against every line item of the charge in the CUR. (The column will contain the tag value or will be blank if the tag has not been assigned to the resource). For more information, read about Resource tags details.

Note: Follow the guide on Activating User-Defined Cost Allocation Tags to enable cost allocation tags.

Once the CUR export, backup plan, and cost allocation tag are set up, AWS Billing will report this data to the configured S3 bucket, and the AWS Glue crawler will make it available to query via Amazon Athena (refer to prerequisite 1).

Go to the Amazon Athena console and navigate to the query editor to write SQL queries for analysis.

To answer the question “How much does backup plan X cost the organization?”, write the query as follows:

select resource_tags_user_created_from_backup_plan, 
SUM(line_item_unblended_cost) as "Unblended Cost" 
from athenacurcfn_usage_costs.usage_costs 
where resource_tags_user_created_from_backup_plan = 'Premium' 
group by resource_tags_user_created_from_backup_plan;

Note: Refer to Understanding your AWS Cost Datasets: A Cheat Sheet for a distinction between different types of costs, including blended and unblended costs. For most use cases, unblended costs would be the appropriate costs to use.

The result provides the total cost, across all resources and accounts in the organization, generated by the backup plan named in the where clause (Premium in this example).

To report this cost by account number:

select resource_tags_user_created_from_backup_plan, 
line_item_usage_account_id, SUM(line_item_unblended_cost) as 
"Unblended Cost" 
from athenacurcfn_usage_costs.usage_costs 
where resource_tags_user_created_from_backup_plan = 'Premium' 
group by resource_tags_user_created_from_backup_plan, 
line_item_usage_account_id;

The result provides the total cost generated by the same backup plan, broken down by each account in the organization.
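
To see the shape of these two results before running them in Athena, the queries can be rehearsed locally. The following is a minimal sketch, assuming a tiny in-memory SQLite table stands in for the CUR table. The rows, account IDs, and costs are made up for illustration, and SQLite is not Athena, so this only checks the filtering and grouping logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so the table keeps its Athena-style name.
conn.execute("ATTACH DATABASE ':memory:' AS athenacurcfn_usage_costs")
conn.execute(
    """
    CREATE TABLE athenacurcfn_usage_costs.usage_costs (
        line_item_usage_account_id TEXT,
        resource_tags_user_created_from_backup_plan TEXT,
        line_item_unblended_cost REAL
    )
    """
)
# Hypothetical line items: (account, plan tag value, cost).
rows = [
    ("111111111111", "Premium", 1.5),
    ("111111111111", "Premium", 2.5),
    ("222222222222", "Premium", 4.0),
    ("222222222222", "Basic", 8.0),
    ("222222222222", "", 16.0),  # untagged resource: blank tag column
]
conn.executemany(
    "INSERT INTO athenacurcfn_usage_costs.usage_costs VALUES (?, ?, ?)", rows
)

# Total cost of the Premium plan across the organization.
total_by_plan = conn.execute(
    """
    select resource_tags_user_created_from_backup_plan,
           SUM(line_item_unblended_cost) as "Unblended Cost"
    from athenacurcfn_usage_costs.usage_costs
    where resource_tags_user_created_from_backup_plan = 'Premium'
    group by resource_tags_user_created_from_backup_plan
    """
).fetchall()

# The same cost broken down by account (ordered only to make the output stable).
total_by_account = conn.execute(
    """
    select resource_tags_user_created_from_backup_plan,
           line_item_usage_account_id,
           SUM(line_item_unblended_cost) as "Unblended Cost"
    from athenacurcfn_usage_costs.usage_costs
    where resource_tags_user_created_from_backup_plan = 'Premium'
    group by resource_tags_user_created_from_backup_plan,
             line_item_usage_account_id
    order by line_item_usage_account_id
    """
).fetchall()

print(total_by_plan)
print(total_by_account)
```

The untagged and Basic rows are excluded by the where clause, so only line items carrying the Premium tag value contribute to either result.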

Scenario 2: What is a particular department’s costs for their premium backup plan?

In this scenario, where an organization relies completely on tags for cost tracking, we can use a similar approach to Scenario 1.

Create a backup plan with the option to “copy resource tags to recovery point.” This ensures that when AWS Backup creates recovery points, it copies the tags from the resources and applies them to the recovery points. When copies of this recovery point are made across multiple accounts or Regions, these tags are applied to the copies as well. This ensures any cost tags that are assigned to the resource being backed up are applied to all copies of the recovery point in the organization.

Note: Please ensure these cost tags have been enabled as cost allocation tags in AWS Billing for them to be reported in AWS CUR.

Combined with the CreatedFromBackupPlan tag described in Scenario 1, this lets us write queries that answer many questions about cost. This scenario and Scenario 3 are two examples.

To answer this scenario’s question, let’s assume 3586 is the Research Department’s cost center. We head to the Athena console and write the following query in the query editor:

select resource_tags_user_created_from_backup_plan, 
resource_tags_user_cost_centre, SUM(line_item_unblended_cost) as 
"Unblended Cost" 
from athenacurcfn_usage_costs.usage_costs 
where resource_tags_user_created_from_backup_plan = 'Premium' and 
resource_tags_user_cost_centre = '3586' 
group by resource_tags_user_created_from_backup_plan, 
resource_tags_user_cost_centre;

This result provides the Research Department’s total cost for the Premium backup plan across all resources and accounts in the organization.

Scenario 3: What are the preceding department’s costs per backup plan?

To answer this question, we head to the Athena console and start writing the following query in the query editor:

select resource_tags_user_created_from_backup_plan, 
resource_tags_user_cost_centre, SUM(line_item_unblended_cost) as 
"Unblended Cost" 
from athenacurcfn_usage_costs.usage_costs 
where resource_tags_user_cost_centre = '3586' 
group by resource_tags_user_created_from_backup_plan, 
resource_tags_user_cost_centre;

This result provides the total cost across all resources and accounts in the organization for the Research Department, sliced by backup plan.
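
Scenarios 1 to 3 differ only in their where filters and group by columns, so the query text can be generated rather than retyped. The following is a minimal sketch of such a generator. The function name build_cost_query and its parameters are hypothetical, and the generated text is meant to be pasted into, or submitted to, Athena by whatever means you already use.

```python
import re

CUR_TABLE = "athenacurcfn_usage_costs.usage_costs"
_COLUMN = re.compile(r"^[a-z_][a-z0-9_]*$")


def build_cost_query(group_by, filters):
    """Build a cost query: group_by lists columns, filters maps column -> required value."""
    if not group_by:
        raise ValueError("at least one group by column is required")
    for column in list(group_by) + list(filters):
        if not _COLUMN.match(column):
            raise ValueError(f"unexpected column name: {column!r}")
    # Double any single quote so a value cannot end the SQL string literal early.
    conditions = [
        "{} = '{}'".format(column, str(value).replace("'", "''"))
        for column, value in filters.items()
    ]
    columns = ", ".join(group_by)
    sql = (
        f'select {columns}, SUM(line_item_unblended_cost) as "Unblended Cost" '
        f"from {CUR_TABLE}"
    )
    if conditions:
        sql += " where " + " and ".join(conditions)
    return sql + f" group by {columns};"


PLAN = "resource_tags_user_created_from_backup_plan"
COST_CENTRE = "resource_tags_user_cost_centre"

# Scenario 2: one department, one plan. Scenario 3: one department, every plan.
scenario_2 = build_cost_query([PLAN, COST_CENTRE], {PLAN: "Premium", COST_CENTRE: "3586"})
scenario_3 = build_cost_query([PLAN, COST_CENTRE], {COST_CENTRE: "3586"})
print(scenario_2)
print(scenario_3)
```

Dropping the plan filter is the only difference between the two generated queries, which mirrors the difference between the Scenario 2 and Scenario 3 queries above.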

Scenario 4 (part 1): What are the preceding department’s costs for the premium plan if my environment is a tenancy-based environment (no cost center tags)?

Most customers that use AWS Backup organization-wide create multiple copies of backups that are kept cross-account or in a central vault in an account where copies of recovery points across the organization are stored (Reference architecture).

In this situation, customers want to allocate AWS Backup usage costs reported under different accounts to the user of the source resource for which the recovery point was created. This creates a challenge for customers when there are no tags that function similarly to a CostCenter tag. This generally happens in an environment where an organization’s users are configured to use single tenancy accounts. That is, charges for the entire account are owned by a particular part of the organization. In the case of accounts that function as central backup accounts, it becomes challenging to track the central account costs and divide them based on their originating accounts manually.

We can resolve this challenge by joining the dataset from the backup observer solution for AWS Backup with the CUR dataset and a mapping of the organization’s account numbers to their owners. Note: This assumes both datasets are available in the same account and that the customer can provide an account mapping. If not, additional work is needed to make these datasets accessible from a central place. For more information, refer to the Amazon Athena User Guide.

The backup observer solution contains a copy job dataset that provides us information about which recovery point was copied from x source to y destination. We will combine this with information from the CUR that tells us the costs for a premium backup plan.

To answer this question, we assume 729421293554 is the Research Department’s single tenancy account, and 516843186143 is the central backup account. We head to the Athena console and start writing the following query in the query editor:

select g.resource_tags_user_created_from_backup_plan, 
SUM(g."Unblended Cost") as "Unblended Cost" from (

-- Copies in the central backup account that originated from the department's account
select b.resource_tags_user_created_from_backup_plan, 
SUM(b.line_item_unblended_cost) as "Unblended Cost" 
from "aws_backup_logs_db"."aws_copy_logs_view" a, 
"athenacurcfn_usage_costs"."usage_costs" b 
where a.destination_recoverypoint_arn = b.line_item_resource_id and 
b.line_item_usage_account_id = '516843186143' and a.account_id = '729421293554' 
and b.resource_tags_user_created_from_backup_plan = 'Premium' 
group by b.resource_tags_user_created_from_backup_plan

-- UNION ALL keeps both subtotals even when they happen to be equal
UNION ALL

-- Recovery points stored in the department's own account
select resource_tags_user_created_from_backup_plan, 
SUM(line_item_unblended_cost) as "Unblended Cost" 
from athenacurcfn_usage_costs.usage_costs 
where resource_tags_user_created_from_backup_plan = 'Premium' and 
line_item_usage_account_id = '729421293554' 
group by resource_tags_user_created_from_backup_plan) g 

group by g.resource_tags_user_created_from_backup_plan;

This result provides the Research Department’s total cost for the premium plan across the organization. This query can be extended if there are multiple single-tenancy Research Department accounts, for example:

select g.resource_tags_user_created_from_backup_plan, 
SUM(g."Unblended Cost") as "Unblended Cost" from (

-- Copies in the central backup account that originated from the department's accounts
select b.resource_tags_user_created_from_backup_plan, 
SUM(b.line_item_unblended_cost) as "Unblended Cost" 
from "aws_backup_logs_db"."aws_copy_logs_view" a, 
"athenacurcfn_usage_costs"."usage_costs" b 
where a.destination_recoverypoint_arn = b.line_item_resource_id and 
b.line_item_usage_account_id = '516843186143' and a.account_id IN ('729421293554', '784932476112') 
and b.resource_tags_user_created_from_backup_plan = 'Premium' 
group by b.resource_tags_user_created_from_backup_plan

-- UNION ALL keeps both subtotals even when they happen to be equal
UNION ALL

-- Recovery points stored in the department's own accounts
select resource_tags_user_created_from_backup_plan, 
SUM(line_item_unblended_cost) as "Unblended Cost" 
from athenacurcfn_usage_costs.usage_costs 
where resource_tags_user_created_from_backup_plan = 'Premium' and 
line_item_usage_account_id IN ('729421293554', '784932476112') 
group by resource_tags_user_created_from_backup_plan) g 

group by g.resource_tags_user_created_from_backup_plan;
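
The allocation logic behind these queries can be traced in a few lines of plain Python. The following is a minimal sketch, assuming (as the queries do) that account_id in the copy job dataset identifies the source account of a copy. The rows, costs, and shortened resource IDs are made up for illustration and are not real ARNs; the function name and the ACCOUNT_OWNER mapping are hypothetical.

```python
CENTRAL_ACCOUNT = "516843186143"

# Hypothetical mapping of single-tenancy accounts to the department that owns them.
ACCOUNT_OWNER = {"729421293554": "Research", "784932476112": "Research"}

# Hypothetical copy job rows: which source account each central copy came from.
copy_logs = [
    {"destination_recoverypoint_arn": "rp-central-1", "account_id": "729421293554"},
    {"destination_recoverypoint_arn": "rp-central-2", "account_id": "999999999999"},
]

# Hypothetical CUR line items: (usage account, resource ID, plan tag, unblended cost).
cur_rows = [
    ("729421293554", "rp-source-1", "Premium", 2.0),   # original in the department's account
    ("516843186143", "rp-central-1", "Premium", 2.0),  # its copy in the central account
    ("516843186143", "rp-central-2", "Premium", 4.0),  # copy that belongs to another account
    ("784932476112", "rp-source-3", "Basic", 1.0),     # different plan, ignored below
]


def department_plan_cost(department, plan):
    """Sum a plan's cost for a department: its own accounts plus its copies in the central account."""
    accounts = {acct for acct, owner in ACCOUNT_OWNER.items() if owner == department}
    # Central recovery points that were copied from one of the department's accounts.
    copied = {
        row["destination_recoverypoint_arn"]
        for row in copy_logs
        if row["account_id"] in accounts
    }
    total = 0.0
    for account, resource_id, tag, cost in cur_rows:
        if tag != plan:
            continue
        in_own_account = account in accounts
        is_central_copy = account == CENTRAL_ACCOUNT and resource_id in copied
        if in_own_account or is_central_copy:
            total += cost
    return total


research_premium = department_plan_cost("Research", "Premium")
print(research_premium)
```

The central copy that belongs to another account is left out, which is the point of joining the copy job dataset with the CUR data: the central account's bill is split by where each recovery point originated.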

Scenario 4 (part 2): What are the Research Department’s costs per backup plan if my environment is a tenancy-based environment (no cost center tags)?

To answer this question, we take the same approach as Scenario 4 (part 1). We head to the Athena console and write the following query in the query editor:

select g.resource_tags_user_created_from_backup_plan, 
g.line_item_usage_account_id, SUM(g."Unblended Cost") as "Unblended Cost" from (

-- Copies in the central backup account that originated from the department's accounts
select b.resource_tags_user_created_from_backup_plan, 
b.line_item_usage_account_id, 
SUM(b.line_item_unblended_cost) as "Unblended Cost" 
from "aws_backup_logs_db"."aws_copy_logs_view" a, 
"athenacurcfn_usage_costs"."usage_costs" b 
where a.destination_recoverypoint_arn = b.line_item_resource_id and 
b.line_item_usage_account_id = '516843186143' and 
a.account_id IN ('729421293554', '784932476112') 
group by b.resource_tags_user_created_from_backup_plan, 
b.line_item_usage_account_id

UNION ALL

-- Costs reported in the department's own accounts
select resource_tags_user_created_from_backup_plan, 
line_item_usage_account_id, 
SUM(line_item_unblended_cost) as "Unblended Cost" 
from athenacurcfn_usage_costs.usage_costs 
where line_item_usage_account_id IN ('729421293554', '784932476112') 
group by resource_tags_user_created_from_backup_plan, 
line_item_usage_account_id) g 

group by g.resource_tags_user_created_from_backup_plan, 
g.line_item_usage_account_id;

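This result provides the Research Department’s costs across the organization, broken down by backup plan and by the account that reported the cost. Rows with a blank plan value are charges for resources that do not carry the backup plan tag.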

Scenario 5: How much does vault X cost the organization in a central account?

To answer this question, we again join the copy job dataset with the CUR dataset, as in Scenario 4. We head to the Athena console and write the following query in the query editor:

select SUM(b.line_item_unblended_cost) as "Unblended Cost" 
from "aws_backup_logs_db"."aws_copy_logs_view" a, 
"athenacurcfn_usage_costs"."usage_costs" b 
where a.destination_recoverypoint_arn = b.line_item_resource_id and 
a.destination_backupvault_arn = 'arn:aws:backup:ap-southeast-2:516843186143:backup-vault:backup-blog-cost-central-vault';

This result will provide the total cost for all recovery points stored in vault X.
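
Because the query matches on the exact vault ARN string, a stray line break or space in the literal would make it match no rows. The following is a minimal sketch of a check that can be run on the ARN before it is placed in the query. The function name parse_backup_vault_arn is hypothetical, and the expected layout is inferred from the vault ARN used in this post.

```python
def parse_backup_vault_arn(arn):
    """Split a backup vault ARN into its partition, Region, account ID, and vault name."""
    if any(ch.isspace() for ch in arn):
        raise ValueError("ARN contains whitespace, possibly from a line break")
    # Expected layout: arn:<partition>:backup:<region>:<account>:backup-vault:<name>
    parts = arn.split(":", 6)
    if (
        len(parts) != 7
        or parts[0] != "arn"
        or parts[2] != "backup"
        or parts[5] != "backup-vault"
    ):
        raise ValueError(f"not a backup vault ARN: {arn!r}")
    return {
        "partition": parts[1],
        "region": parts[3],
        "account_id": parts[4],
        "vault_name": parts[6],
    }


vault = parse_backup_vault_arn(
    "arn:aws:backup:ap-southeast-2:516843186143:backup-vault:backup-blog-cost-central-vault"
)
print(vault["region"], vault["account_id"], vault["vault_name"])
```

The parsed Region and account ID are also a quick way to confirm that the vault really lives in the central backup account before attributing its cost.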

Similarly, queries can be created to slice information in the above scenarios by AWS Region, resource type, and so on.

As these approaches demonstrate, once the information is available in the CUR, customers can perform complex analyses to answer their business questions. Customers can build on this approach by converting these queries into Athena views and connecting them to Amazon QuickSight, or by using QuickSight Q directly over these datasets. There, they can build visualizations like graphs and charts that answer these questions for their nontechnical, management, or executive audience.

Cleaning up

To avoid incurring future charges, you can delete the resources deployed by this solution. Backup plans, recovery points, and the backup vault can be deleted using the AWS Management Console. You can delete the resources deployed by the observer solution for AWS Backup by following the “cleaning up” section of that solution’s blog post.

Conclusion

In this blog post, we introduced the various approaches that can be used to analyze AWS Backup costs for different types of enterprise environments. We provided an overview of how the approaches rely on the AWS Cost and Usage Reports, and shared how the observer solution for AWS Backup can be leveraged alongside them to provide detailed tracking and analysis in complex multi-copy enterprise AWS Backup environments.

You can use these approaches to build comprehensive cost analysis and visibility practices within your organization. You can build on them further by linking and automating them into organizational finance processes to meet your own requirements. You can use Amazon QuickSight to build visualizations that give your executive audience a deeper understanding of AWS costs.

To get started on AWS or to learn more about building a well-architected AWS environment, visit the getting started with AWS Backup page for guidance.

Thank you for reading this blog. If you have any feedback or questions, feel free to leave them in the comments section.