AWS Cloud Operations Blog
Using AWS X-Ray and AWS Application Cost Profiler to track tenant cost of shared AWS Infrastructure
In our last blog post, we introduced AWS Application Cost Profiler (ACP), a new service that allows customers running multi-tenant applications to receive granular cost breakdowns of shared AWS resources across their tenants. AWS Application Cost Profiler provides customers, especially SaaS ISVs, with a standard mechanism to correlate and report their infrastructure cost for each customer or tenant. This granular, tenant-based view of costs allows ISVs to develop go-to-market strategies with tier-based support or consumption-based pricing for their products, and to effectively manage the costs of a multi-tenant architecture model. In addition, organizations running multi-tenant applications can utilize the data to define an accurate cost allocation model for chargeback purposes.
Application Cost Profiler requires application owners to instrument their application so that tenant usage metadata can be generated and used as input for Application Cost Profiler. We previously demonstrated instrumenting a sample application utilizing Amazon CloudWatch Logs. In that example, the tenant information was added to the Amazon CloudWatch Logs output, and a scheduled AWS Lambda function processed the log output hourly to generate the tenant usage metadata utilized by AWS Application Cost Profiler. In the following example, you will utilize trace summaries from AWS X-Ray to generate tenant usage metadata and integrate with AWS Application Cost Profiler.
Distributed tracing with AWS X-Ray
AWS X-Ray is a distributed tracing service that helps developers analyze and debug distributed applications, such as those built on a microservices architecture. The trace summaries generated by AWS X-Ray contain information about the services and resources used in each request. By adding tenant information to the instrumentation of an existing application that already uses AWS X-Ray, tenant usage metadata can then be generated for services like AWS Lambda, Amazon DynamoDB, Amazon Simple Notification Service (SNS), and Amazon Simple Queue Service (SQS).
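As a rough illustration of what adding tenant information to a trace can look like, the following sketch annotates a Lambda invocation with the tenant ID taken from the request's query string. It assumes the AWS X-Ray SDK for Python (`aws_xray_sdk`) is packaged with the function. The annotation key `tenantId`, the subsegment name `tenant-work`, and the optional `recorder` argument are our own illustrative choices, not necessarily what the sample application uses.

```python
from typing import Any, Dict, Optional


def extract_tenant_id(event: Dict[str, Any]) -> Optional[str]:
    """Read tenantId from an API Gateway proxy event's query string."""
    params = event.get("queryStringParameters") or {}
    return params.get("tenantId")


def handler(event: Dict[str, Any], context: Any, recorder: Any = None) -> Dict[str, Any]:
    """Lambda entry point that tags the current trace with the tenant."""
    if recorder is None:
        # Imported lazily so the module still loads where the SDK is absent.
        from aws_xray_sdk.core import xray_recorder as recorder

    tenant_id = extract_tenant_id(event) or "unknown"
    # Lambda owns the top-level segment, so the annotation goes on a subsegment.
    subsegment = recorder.begin_subsegment("tenant-work")
    try:
        # Annotations are indexed by X-Ray and can be used in filter expressions.
        subsegment.put_annotation("tenantId", tenant_id)
        body = f"Hello, tenant {tenant_id}"
    finally:
        recorder.end_subsegment()
    return {"statusCode": 200, "body": body}
```

Passing the recorder as an argument is only a convenience for local testing; in Lambda the handler is called with `event` and `context`, and the SDK's global recorder is used.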
Sample instrumentation using AWS X-Ray
Getting started with AWS Application Cost Profiler (ACP)
Getting started with AWS Application Cost Profiler (ACP) is a two-step process:
Step 1: Configure ACP to generate the consumption insights report.
Step 2: Instrument the application for tenant metadata.
Both steps can be completed in any order, but both must be completed before ACP can generate these insights. In order to save time for your initial set up, we created three helpful CloudFormation templates. See details in the sections below.
Step 1: Configure the Application Cost Profiler for reporting consumption insights
The Application Cost Profiler report configuration must be defined in the AWS Console, via the AWS CLI, or via one of the AWS SDKs. We will demonstrate how to configure this in the AWS Console. The report configuration tells Application Cost Profiler where to deliver the tenant cost reports. The destination is an Amazon S3 bucket with the proper permissions for Application Cost Profiler to write the reports to.
Prerequisites
For this walkthrough, you will need the following:
- An AWS account
- AWS IAM user with console access and admin privileges
- Cost Explorer enabled for the account
Setup S3 bucket and report configuration
- Log in to the AWS Console.
- Verify that Cost Explorer has been enabled (important, as AWS Application Cost Profiler will not process tenant usage data without Cost Explorer being enabled).
- Click the launch stack button below to launch our first CloudFormation stack that will install and configure an S3 bucket with the proper Application Cost Profiler permissions and server-side encryption settings, an Amazon EventBridge rule, and an SNS topic for AWS Application Cost Profiler events that you can optionally subscribe to.
- Once the CloudFormation stack launch has completed, in the us-east-1 region use the console search function to navigate to the “AWS Application Cost Profiler” landing page.
- In the AWS Application Cost Profiler dashboard, click “Get started now”.
Figure 1: Sample Application cost profiler console.
- Setup a new report configuration:
- Report Name – This is user defined and cannot be changed once saved.
- Report Description – This is a user defined description of the report configuration (optional).
- S3 Bucket Name – This is the S3 bucket where AWS Application Cost Profiler will deliver the reports. This bucket was created via the previous CloudFormation template. This bucket is named “acp-{REGION}-{ACCOUNT_ID}”, substituting {REGION} with the AWS region where the CloudFormation template was deployed, e.g., us-east-1, and substituting {ACCOUNT_ID} with the actual AWS account ID utilized to deploy the CloudFormation template. For example, “acp-us-east-1-987654321”. This Report Bucket name can also be found in the “Resources” section of the CloudFormation stack deployed above.
- S3 Prefix – This is the prefix in the S3 bucket used above where AWS Application Cost Profiler will deliver the reports. The S3 bucket deployed in the CloudFormation template above enabled write permissions for the AWS Application Cost Profiler to the “reports” prefix. Therefore, enter “reports” for S3 prefix here.
- Time Frequency – Choose whether the report is generated Daily, Monthly, or Both.
- Report Output Format – Choose the file type that will be created within your Amazon S3 bucket. If you choose CSV, Application Cost Profiler creates a comma-separated values text file with gzip compression for the reports. If you choose Parquet, a Parquet file is generated for the reports.
Figure 2: Configure final report page in the Application cost profiler console
- Click the “Configure” button. Application Cost Profiler will verify the existence of the bucket defined above as well as the service’s write permissions to the prefix defined above. If successful, you will see a confirmation.
Figure 3: Confirmation message in the Application cost profiler console
- Click “OK” to return to the AWS Application Cost Profiler.
Now that you’ve set up an S3 bucket destination with permissions as well as a report configuration within the AWS Application Cost Profiler console, you’re ready to prepare, upload, and import your tenant usage data.
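If you prefer the SDK route over the console, the same report configuration can be sketched with boto3. This is a minimal sketch, assuming a boto3 release that still ships the `applicationcostprofiler` client and credentials that are allowed to call `PutReportDefinition`. The report name and description are placeholders, and the bucket name is the example value used above.

```python
from typing import Any, Dict


def build_report_definition(report_id: str, description: str, bucket: str,
                            prefix: str = "reports", frequency: str = "DAILY",
                            output_format: str = "CSV") -> Dict[str, Any]:
    """Assemble the request that mirrors the console fields described above."""
    if frequency not in {"DAILY", "MONTHLY", "ALL"}:
        raise ValueError(f"unsupported frequency: {frequency}")
    if output_format not in {"CSV", "PARQUET"}:
        raise ValueError(f"unsupported format: {output_format}")
    return {
        "reportId": report_id,                # Report Name (fixed once saved)
        "reportDescription": description,     # Report Description
        "reportFrequency": frequency,         # Daily, Monthly, or Both ("ALL")
        "format": output_format,              # CSV or Parquet
        "destinationS3Location": {"bucket": bucket, "prefix": prefix},
    }


def put_report_definition(params: Dict[str, Any], region: str = "us-east-1") -> Dict[str, Any]:
    """Send the request; needs AWS credentials and network access."""
    import boto3  # imported here so building the request works without boto3

    client = boto3.client("applicationcostprofiler", region_name=region)
    return client.put_report_definition(**params)


if __name__ == "__main__":
    request = build_report_definition(
        report_id="acp-demo-report",          # placeholder name
        description="Daily tenant cost report",
        bucket="acp-us-east-1-987654321",     # example bucket from this post
    )
    print(request)
```

Separating the request builder from the API call keeps the part you are most likely to adapt (names, frequency, format) easy to inspect before anything is sent to AWS.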
Step 2: Reporting tenant usage data from your services – an example
In order to generate reports, AWS Application Cost Profiler requires you to provide tenant usage data. This information must be uploaded to an S3 location that AWS Application Cost Profiler has permissions to read from. The S3 bucket created in the first CloudFormation template above has granted AWS Application Cost Profiler read access to the “import/*” prefix.
* By giving Application Cost Profiler access to your usage data in the S3 bucket, you allow Application Cost Profiler to temporarily copy such usage data objects to the US East (N. Virginia) AWS Region while processing reports. These data objects will be kept in the US East (N. Virginia) Region until the monthly report generation is complete. To avoid incurring data transfer charges, you can configure Requester Pays on the bucket.
For example, let’s instrument a sample serverless application to track cost across your tenant base. Our second CloudFormation template will deploy a basic serverless application built on Amazon API Gateway and AWS Lambda, with AWS X-Ray enabled for distributed tracing.
Figure 4: Higher level architecture diagram of a serverless application
Deploy the sample serverless application
- Log in to the AWS Console.
- Click the launch stack button below to launch the second CloudFormation stack that will install and configure the sample serverless application. This stack must be launched in the same region as the CloudFormation stack launched previously.
- Once this CloudFormation template has completed deploying, go to the stack’s Outputs section and note the apiGatewayInvokeURL value. It will be in the following format:
https://q60a8f2j07.execute-api.us-east-1.amazonaws.com/TEST/ACPDemo?tenantId=123
- Using the apiGatewayInvokeURL identified above, open a web browser and paste the value into the location window. This will execute the sample serverless application simulating use by tenant “123”. Now, change the value after tenantId= in the browser location window to simulate multiple tenants, such as tenantId=10 or tenantId=20, etc., pressing enter each time to execute the sample serverless application. Invoking this endpoint in a browser will cause the Lambda function to execute and generate an AWS X-Ray trace annotation containing the tenant information. Additionally, you can use utilities like JMeter to simulate hundreds or even thousands of tenants invoking this sample serverless application.
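If you would rather script these calls than edit the URL by hand, a small Python loop can stand in for the browser. This is only a sketch using the standard library. The URL is the example value shown above and must be replaced with your own apiGatewayInvokeURL, and the function names are ours.

```python
from typing import Iterable, List
from urllib.parse import urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def tenant_url(invoke_url: str, tenant_id) -> str:
    """Return the invoke URL with its query string replaced by tenantId=<id>."""
    parts = urlsplit(invoke_url)
    query = urlencode({"tenantId": tenant_id})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def simulate(invoke_url: str, tenant_ids: Iterable, calls_per_tenant: int = 1) -> List[int]:
    """Call the endpoint once or more per tenant and collect HTTP status codes."""
    statuses = []
    for tenant_id in tenant_ids:
        url = tenant_url(invoke_url, tenant_id)
        for _ in range(calls_per_tenant):
            with urlopen(url, timeout=10) as response:
                statuses.append(response.status)
    return statuses


if __name__ == "__main__":
    # Replace with the apiGatewayInvokeURL value from your own stack outputs.
    base = "https://q60a8f2j07.execute-api.us-east-1.amazonaws.com/TEST/ACPDemo?tenantId=123"
    print(simulate(base, tenant_ids=[10, 20, 30], calls_per_tenant=2))
```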
PLEASE NOTE: This serverless application is one of several approaches you can use to track tenant usage. AWS Application Cost Profiler requires the tenant usage report to be in the CSV file format only, which is the end result of this example. As mentioned, your tenant usage CSV report can be generated in many different ways, depending on your application architecture and current tenant model.
Now that the tenant ID is recorded in the X-Ray trace annotations of the sample serverless application, it’s time to generate the tenant usage data file. This file must be structured in the format shown below. As a reminder, ACP supports only CSV files with names ending in “.csv”, “.csv.gz”, or “.csv.gzip”.
Table 1: Application Tenant Usage Data Elements
Field | Description |
ApplicationId | Identifies the application or product being used in your system. Defines the tenant metadata scope. |
TenantId | An identifier in your system for the tenant consuming the specified resource. Application Cost Profiler aggregates to this level within the ApplicationId. |
TenantDesc | (Value Optional) Additional data about the tenant for your additional reporting. |
UsageAccountId | The account that the resource runs in (important for accounts within an organization). |
StartTime | Timestamp (in milliseconds) from Epoch, in UTC. Indicates the start time of the period for usage by the specified tenant. |
EndTime | Timestamp (in milliseconds) from Epoch, in UTC. Indicates the end time of the period for usage by the specified tenant. |
ResourceId | Amazon Resource Name (ARN) for resource being used. |
Name | (Optional) As an alternative to specifying a ResourceId, specify a Name resource tag to attribute costs to a resource set (the field must include the value you want to use for the Name tag). Resource tags are enabled as part of your Cost and Usage Report. For more information about resource tags, see Resource tags details in the Cost and Usage Report User Guide. |
** *ApplicationId, TenantId, TenantDesc, UsageAccountId, StartTime, EndTime, and ResourceId* are AWS Application Cost Profiler-reserved keywords and cannot be used as name tag names.
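To make the field list in Table 1 concrete, here is a minimal sketch that writes usage records as CSV with Python's standard library. The application name, account ID, and Lambda ARN in the usage example are hypothetical placeholders. The header row and column order are assumptions drawn from Table 1, so check them against the Application Cost Profiler documentation before relying on them.

```python
import csv
import io
from datetime import datetime, timezone
from typing import Dict, Iterable

# Column names taken from Table 1 (the optional Name column is left out here).
FIELDS = ["ApplicationId", "TenantId", "TenantDesc", "UsageAccountId",
          "StartTime", "EndTime", "ResourceId"]


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Epoch (UTC)."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return round(moment.timestamp() * 1000)


def usage_csv(rows: Iterable[Dict[str, object]]) -> str:
    """Serialize usage rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    start = datetime(2021, 6, 1, 12, 0, tzinfo=timezone.utc)
    end = datetime(2021, 6, 1, 13, 0, tzinfo=timezone.utc)
    print(usage_csv([{
        "ApplicationId": "ACPDemo",            # hypothetical application name
        "TenantId": "123",
        "TenantDesc": "",                      # value is optional
        "UsageAccountId": "123456789012",      # placeholder account ID
        "StartTime": to_epoch_ms(start),
        "EndTime": to_epoch_ms(end),
        "ResourceId": "arn:aws:lambda:us-east-1:123456789012:function:ACP_DynamoDBSourceDataGenerator",
    }]))
```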
In this example we will generate the tenant usage data according to the format above by reading and processing the X-Ray traces of the Lambda function we want to report on. For the instrumentation infrastructure, we will utilize the following AWS technologies/resources:
- X-Ray Traces: This provides the necessary information to generate tenant usage data files for the example Lambda. Other telemetry, observability, and logging solutions can be used as well.
- Amazon CloudWatch Events: This triggers the instrumentation Lambda function hourly, so that tenant usage data is submitted to AWS Application Cost Profiler every hour.
- Lambda Function: This extracts information from the X-Ray traces, generates the tenant usage data files, and uploads the files to the S3 bucket created in the CloudFormation template deployed above.
- Identity and Access Management (IAM): The right IAM policy must be in place for the Lambda function to write to the S3 bucket.
- CloudFormation templates: We will utilize a CloudFormation template so you can easily deploy the sample function and trigger that to generate hourly tenant usage data.
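The core of such a generator is turning X-Ray trace summaries into per-tenant usage windows. The sketch below shows one way to do that; it is not the code deployed by the CloudFormation template. It assumes the tenant was stored under an annotation key named `tenantId`, and that each summary returned by the X-Ray `GetTraceSummaries` API carries `StartTime`, `Duration` (seconds), and `Annotations` fields, which is how we understand the boto3 response shape.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterable, List, Optional


def tenant_from_summary(summary: Dict[str, Any], key: str = "tenantId") -> Optional[str]:
    """Return the first annotation value stored under `key`, as a string."""
    for entry in summary.get("Annotations", {}).get(key, []):
        value = entry.get("AnnotationValue", {})
        if "StringValue" in value:
            return value["StringValue"]
        if "NumberValue" in value:
            return f"{value['NumberValue']:g}"  # 10.0 -> "10"
    return None


def usage_windows(summaries: Iterable[Dict[str, Any]], key: str = "tenantId") -> List[Dict[str, Any]]:
    """Map each annotated trace to a tenant plus start/end time in epoch ms."""
    windows = []
    for summary in summaries:
        tenant = tenant_from_summary(summary, key)
        start = summary.get("StartTime")
        if tenant is None or start is None:
            continue  # trace without tenant information or start time
        end = start + timedelta(seconds=summary.get("Duration", 0.0))
        windows.append({
            "TenantId": tenant,
            "StartTime": round(start.timestamp() * 1000),
            "EndTime": round(end.timestamp() * 1000),
        })
    return windows


def fetch_summaries(start: datetime, end: datetime, filter_expression: str) -> List[Dict[str, Any]]:
    """Page through GetTraceSummaries; needs AWS credentials and network access."""
    import boto3  # imported here so the pure helpers work without boto3

    paginator = boto3.client("xray").get_paginator("get_trace_summaries")
    pages = paginator.paginate(StartTime=start, EndTime=end,
                               FilterExpression=filter_expression)
    return [summary for page in pages for summary in page["TraceSummaries"]]
```

A scheduled run could call `fetch_summaries` for the previous hour with a filter expression such as `service("ACP_DynamoDBSourceDataGenerator")`, pass the result to `usage_windows`, and then fill in the remaining Table 1 columns before uploading the file.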
Deploy the sample tenant usage generator
- Log in to the AWS Console.
- Click the launch stack button below to launch our third CloudFormation stack that will install and configure the X-Ray tenant usage generator example (ACP_Xraytracesgenerator). This stack must be launched in the same region as the CloudFormation stack launched previously. The environment variable for “lambdafunctioname” is preset to “ACP_DynamoDBSourceDataGenerator”, which is the name of the Lambda function that was deployed in the previous CloudFormation template. This is the Lambda function for which the corresponding X-Ray Traces will be analyzed for tenant usage data.
With the above stack deployed, and assuming that the original AWS Lambda function to be reported on has been run in the last hour, you should now see a new file created in the acp-{REGION}-{ACCOUNT_ID} bucket under the “imports” prefix for every hour that the ACP_DynamoDBSourceDataGenerator function executes on the schedule defined above. The file will contain tenant usage data in the csv format similar to this:
The Step Functions state machine will also invoke the ACP_SubmitTenantdata Lambda function, which informs AWS Application Cost Profiler to process the generated file during the nightly and/or monthly report generation cycle.
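Informing Application Cost Profiler about a new usage file corresponds to the `ImportApplicationUsage` API. The following sketch shows what such a call can look like with boto3; it is not the source of ACP_SubmitTenantdata. It assumes a boto3 release that still includes the `applicationcostprofiler` client. As we understand the API, the `region` field of the S3 location is only needed for buckets in opt-in Regions, so it is left out by default. The object key in the usage example is a placeholder.

```python
from typing import Any, Dict, Optional

SUPPORTED_SUFFIXES = (".csv", ".csv.gz", ".csv.gzip")


def build_import_request(bucket: str, key: str,
                         opt_in_region: Optional[str] = None) -> Dict[str, Any]:
    """Describe the S3 object that holds the tenant usage data."""
    if not key.endswith(SUPPORTED_SUFFIXES):
        raise ValueError(f"unsupported usage file name: {key}")
    location: Dict[str, Any] = {"bucket": bucket, "key": key}
    if opt_in_region:
        # Assumption: only required when the bucket is in an opt-in Region.
        location["region"] = opt_in_region
    return {"sourceS3Location": location}


def submit_usage_file(request: Dict[str, Any], service_region: str = "us-east-1") -> str:
    """Ask ACP to ingest the file; needs AWS credentials and network access."""
    import boto3  # imported here so building the request works without boto3

    client = boto3.client("applicationcostprofiler", region_name=service_region)
    return client.import_application_usage(**request)["importId"]


if __name__ == "__main__":
    # Placeholder key; use the object your generator actually uploaded.
    print(build_import_request("acp-us-east-1-987654321", "import/tenant-usage.csv"))
```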
Application Cost Profiler Data
If the time frequency for your Application Cost Profiler (ACP) report was set up as daily, then it can take up to 24 hours to see a generated report in Amazon S3. Utilizing the above CloudFormation setup, the ACP data will be placed in s3://acp-{REGION}-{ACCOUNT_ID}/reports/YYYY/MM/DD/part-*.csv.gz. An Amazon EventBridge event will also be generated when ACP data is available in the following format:
The following data is available in the tenant cost csv file generated by ACP:
Table 2: AWS Application Cost Profiler Tenant Cost Breakdown Elements on output file
Column name | Description |
PayerAccountId | The management account ID in an organization, or the account ID if the account is not part of AWS Organizations. |
UsageAccountId | The account ID for the account with usage. |
LineItemType | The type of record. Always Usage. |
UsageStartTime | Timestamp (in milliseconds) from Epoch, in UTC. Indicates the start time of the period for usage by the specified tenant. |
UsageEndTime | Timestamp (in milliseconds) from Epoch, in UTC. Indicates the end time of the period for usage by the specified tenant. |
ApplicationIdentifier | The ApplicationId specified in the usage data sent to Application Cost Profiler. |
TenantIdentifier | The TenantId specified in the usage data sent to Application Cost Profiler. Data with no record in the usage data is collected in unattributed. |
TenantDescription | The TenantDesc specified in the usage data sent to Application Cost Profiler. |
ProductCode | The AWS product being billed (for example, AmazonEC2). |
UsageType | The type of usage being billed (for example, BoxUsage:c5.large). |
Operation | The operation being billed (for example, RunInstances). |
ResourceId | The resource ID or Amazon Resource Name (ARN) for the resource being billed. |
ScaleFactor | If a resource is over-allocated for an hour, for example if the usage data reported is equal to 2 hours instead of 1 hour, a scale factor is applied to make the total equal to the actual billed amount (in this case, 0.5). This column reports the scale factor used for the specific resource for that hour. The scale factor is always greater than zero (0) and less than or equal to 1. |
TenantAttributionPercent | The percentage of usage attributed to the specified tenant (between zero (0) and 1). |
UsageAmount | The usage amount attributed to the specified tenant. |
CurrencyCode | The currency that the rate and cost are in (for example, USD). |
Rate | The usage billing rate, per unit. |
TenantCost | The total cost for that resource for the specified tenant. |
Region | The AWS Region of the resource. |
Name | If you created resource tags for your resources on the Cost and Usage report, or through the resource usage data, then the Name tag is shown here. For more information about resource tags, see Resource tags details in the Cost and Usage Report User Guide. |
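The ScaleFactor and TenantAttributionPercent descriptions are easier to follow with numbers. The sketch below reproduces the example from the table (two hours of reported usage against one billed hour gives 0.5) under a simplified model of our own: reported usage is summed per tenant for one resource-hour, and each tenant's share is its scaled usage divided by the billed period. It illustrates the description only and is not Application Cost Profiler's actual algorithm.

```python
from typing import Dict


def scale_factor(reported_seconds: Dict[str, float], billed_seconds: float = 3600.0) -> float:
    """Shrink over-allocated usage so the total matches the billed period."""
    total = sum(reported_seconds.values())
    if total <= 0:
        raise ValueError("no usage reported")
    # Greater than zero, and never above 1, as described for the ScaleFactor column.
    return min(1.0, billed_seconds / total)


def attribution(reported_seconds: Dict[str, float], billed_seconds: float = 3600.0) -> Dict[str, float]:
    """Share of the billed period attributed to each tenant (between 0 and 1)."""
    factor = scale_factor(reported_seconds, billed_seconds)
    return {tenant: seconds * factor / billed_seconds
            for tenant, seconds in reported_seconds.items()}


if __name__ == "__main__":
    # Two tenants each report a full hour against a single billed hour.
    over = {"tenant-a": 3600.0, "tenant-b": 3600.0}
    print(scale_factor(over), attribution(over))
    # Two tenants report 15 minutes each; the remainder stays unattributed.
    under = {"tenant-a": 900.0, "tenant-b": 900.0}
    print(scale_factor(under), attribution(under))
```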
The following is an example of the actual CSV output in the ACP report.
This CSV file can then be queried directly via Amazon Athena or integrated into your existing analytics and reporting tools, such as Amazon QuickSight, as in this example. Customers using Parquet can conduct the same analysis.
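For a quick check before wiring up Athena or QuickSight, a downloaded report part can also be summarized locally. This minimal sketch is a plain-Python alternative, not an Athena query. It assumes a gzip-compressed CSV report whose header uses the TenantIdentifier and TenantCost column names from Table 2, and the local file name in the usage example is a placeholder.

```python
import csv
import gzip
import io
from collections import defaultdict
from decimal import Decimal
from typing import Dict


def cost_per_tenant(report_bytes: bytes) -> Dict[str, Decimal]:
    """Sum TenantCost per TenantIdentifier from one gzip-compressed CSV report."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    with gzip.open(io.BytesIO(report_bytes), mode="rt", newline="") as handle:
        for row in csv.DictReader(handle):
            # Decimal avoids floating-point drift when adding currency amounts.
            totals[row["TenantIdentifier"]] += Decimal(row["TenantCost"])
    return dict(totals)


if __name__ == "__main__":
    # Placeholder path to a report part downloaded from the reports prefix.
    with open("part-00000.csv.gz", "rb") as report:
        for tenant, cost in sorted(cost_per_tenant(report.read()).items()):
            print(tenant, cost)
```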