AWS Quantum Technologies Blog

Introducing a cost control solution for Amazon Braket

Effective cost management is a key requirement for customers who run quantum experiments on Amazon Braket, the quantum computing service of AWS. Customers want to monitor the costs of their quantum computing resource consumption as they are incurred. They want to set budgets and be notified when budget limits are reached. Some customers, often academics with a fixed allocation of research grants, even want to prevent costs from exceeding budget limits. Others, particularly those managing teams of Braket users, want to break down costs by device type or user identity.

Several AWS cost management tools and Braket cost tracking features are currently available for these purposes. We covered these features and services, together with a review of Braket pricing, in an earlier blog post.

In this blog post, we’ll introduce you to an Amazon Braket cost control solution that we open-sourced on GitHub under an MIT license. The solution complements the cost management tools we mentioned in that earlier post.

Background

The Braket Cost Tracker (part of the Braket SDK) is a convenient tool for near real-time cost tracking and control from within a quantum program, only requiring a few additional lines of code. However, it only tracks costs in the session of the quantum program and can’t be enabled from a single place to record the costs of all quantum tasks created in an AWS account.

AWS Budgets is a service for cost and usage tracking, budgeting, and forecasting at the AWS account level. It can alert you or run actions on your behalf when a budget exceeds a certain threshold. However, AWS typically updates the data that Budgets relies on only every 8-12 hours. Quantum workloads, in particular hybrid classical-quantum computations, can create many quantum tasks in a short timeframe, so you can exceed a tight budget before AWS Budgets has time to send an alert or before an action takes effect.

The solution we’ll show you here lets you monitor and control quantum task costs in all Braket regions (at the AWS account level) in near real-time. It records the estimated costs of on-demand simulator and quantum processing unit (QPU) tasks, whether they are created individually or in an Amazon Braket Hybrid Jobs execution. It aggregates the task costs every month, starting from the time of your first deployment. It also breaks down monthly costs by device type and user identity.
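To make "estimated cost" concrete, here is a minimal sketch of how a task cost can be derived from the Braket pricing model, where QPU tasks carry a per-task fee plus a per-shot charge and on-demand simulators are billed by simulation time. The function names and all three price constants are placeholders chosen for illustration, not values taken from the solution; check the Amazon Braket pricing page for current prices per device.

```python
from decimal import Decimal

# Placeholder prices for illustration only; real prices differ per device.
HYPOTHETICAL_PER_TASK_FEE = Decimal("0.30")      # USD per QPU task
HYPOTHETICAL_PER_SHOT_PRICE = Decimal("0.01")    # USD per shot
HYPOTHETICAL_PER_MINUTE_RATE = Decimal("0.075")  # USD per simulator minute


def estimate_qpu_task_cost(shots: int) -> Decimal:
    """Per-task fee plus a per-shot charge."""
    return HYPOTHETICAL_PER_TASK_FEE + HYPOTHETICAL_PER_SHOT_PRICE * shots


def estimate_simulator_task_cost(duration_seconds: int) -> Decimal:
    """Simulation time charged at a per-minute rate."""
    return HYPOTHETICAL_PER_MINUTE_RATE * Decimal(duration_seconds) / Decimal(60)


qpu_cost = estimate_qpu_task_cost(shots=1000)
simulator_cost = estimate_simulator_task_cost(duration_seconds=120)
print(f"QPU task: ${qpu_cost}, simulator task: ${simulator_cost}")
```

Using Decimal rather than float avoids rounding drift when many small task costs are summed into monthly totals.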

You can visualize this data in a dashboard, like in Figure 1. You can configure monthly and all-time budget limits and have email notifications sent when you exceed these limits. And you can automatically revoke permissions for the creation of new quantum tasks for specific user identities.

Figure 1 - Amazon Braket cost control solution dashboard showing cost and operational metrics and alarms. The top row displays alarm statuses. Gauge and time-series widgets display the monthly and all-time cost aggregates together with your budget limits. The bottom row shows widgets of quantum task costs per day, user identity, and device type.

Use the Braket Cost Tracker if you wish to track the costs for a given quantum program or hybrid job.

Use the Braket cost control solution:

  • When you are an individual Braket user with your own AWS account and you want to conveniently track the costs of all your quantum tasks from a single place.
  • When you are an administrator of an AWS account with multiple Braket users and you want to monitor the costs of your users’ quantum tasks and optionally revoke your users’ permissions for the creation of new tasks when your budget is reached.

Solution overview

The solution tracks all Braket quantum tasks in your account by using several AWS services to process multiple events in near real-time. Braket delivers events about task status changes to Amazon EventBridge. Events about AWS API activity, for example when you create a task, are logged by AWS CloudTrail and delivered to EventBridge, too. These events are sent to the default event bus in the AWS region where the task is created, which in turn depends on the region where the Braket device executing the task is located (see Amazon Braket Regions and endpoints).
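As a rough illustration of the two event types, the sketch below writes down EventBridge-style event patterns as Python dictionaries and checks sample events against them. The pattern values (source, detail-type, and eventName) are assumptions based on how EventBridge usually labels service and CloudTrail events, and the matches helper is a simplified stand-in for EventBridge's own matching logic; verify both against real events in your account.

```python
# Assumed event patterns; verify the field values against real events.
TASK_STATE_CHANGE_PATTERN = {
    "source": ["aws.braket"],
    "detail-type": ["Braket Task State Change"],
}
CREATE_TASK_API_CALL_PATTERN = {
    "source": ["aws.braket"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {"eventName": ["CreateQuantumTask"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must be present in the event and
    the event value must be one of the listed values (nested dicts recurse)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Heavily trimmed sample events, for illustration only.
sample_state_change = {
    "source": "aws.braket",
    "detail-type": "Braket Task State Change",
    "detail": {"status": "RUNNING"},
}
sample_api_call = {
    "source": "aws.braket",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {"eventName": "CreateQuantumTask"},
}

print(matches(TASK_STATE_CHANGE_PATTERN, sample_state_change))
print(matches(CREATE_TASK_API_CALL_PATTERN, sample_api_call))
```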

The Braket cost control solution first aggregates both types of events in a single place and then processes them to record task metadata and the identity of the user who created the task. Let’s explore the functionality and the components of the solution as depicted in Figure 2.

Figure 2 - Architecture diagram of the Amazon Braket cost control solution.

  • EventBridge rules (1) deployed in each Braket region are used to collect the relevant events and send them to a single custom EventBridge event bus (2) in the primary deployment region of the solution.
  • An EventBridge rule (3) consumes the events from the custom event bus and invokes the “quantum task logger” Lambda function (4).
  • The “quantum task logger” function uses the functionality provided by the Braket Cost Tracker to estimate the cost of each QPU task that enters the RUNNING state and of each COMPLETED on-demand simulator task. The estimated cost, the task metadata, and the identity of the user who created the task are stored in the “tasks” table in DynamoDB (5), with the task ARN as the primary key. A task record is complete only when both event types have been processed. Since the events that trigger the task logger function can be delivered out of order and more than once, conditional expressions ensure that each record is completed exactly once.
  • A DynamoDB stream (6) captures the completed record in the tasks table and invokes the “cost metering” Lambda function (7), which aggregates the task costs per month and since the initial deployment of the solution. The cost metering function stores aggregated costs in the “cost summary” DynamoDB table (8) and records cost data in CloudWatch metrics (9). After the cost metering function has processed the task record, the solution no longer needs it. To save on storage in the tasks table, each record has a configurable time-to-live, and DynamoDB TTL removes the record after it expires.
  • CloudWatch alarms (10) watch the cost metrics. Configurable alarm thresholds represent the budget limits. When a threshold is crossed, alarm actions publish email notifications to an Amazon Simple Notification Service (SNS) topic (11). An EventBridge rule (12) triggers on alarm state changes and invokes the “cost control” Lambda function (13), which attaches an AWS IAM policy to the optionally configured IAM identities (14). This policy explicitly denies braket:CreateQuantumTask actions, which prevents additional task costs from being incurred while a cost metric is in ALARM state. After all alarms have changed back to normal (either because budgets have been adjusted or because a new month has started), the policy is detached from the affected identities so that they can create new quantum tasks again.
  • Cost metrics and alarms, as well as operational metrics of solution components, are displayed in a CloudWatch dashboard.
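The exactly-once completion and the cost aggregation described above can be sketched with plain Python dictionaries standing in for the two DynamoDB tables. This is a simplified model under stated assumptions, not the solution's code: the event type labels, attribute names, and function names are hypothetical, and metering is called directly here, whereas the solution triggers it through a DynamoDB stream.

```python
from collections import defaultdict
from decimal import Decimal

# In-memory stand-ins for the "tasks" and "cost summary" DynamoDB tables.
tasks = {}
cost_summary = defaultdict(Decimal)

REQUIRED_EVENT_TYPES = {"state_change", "api_call"}  # hypothetical labels


def record_event(task_arn, event_type, attributes):
    """Merge one event into the task record, keyed by task ARN.

    Returns True only for the one call that completes the record, which
    imitates a conditional write that can succeed a single time.
    """
    record = tasks.setdefault(task_arn, {"seen": set(), "completed": False})
    record.update(attributes)
    record["seen"].add(event_type)
    if record["seen"] >= REQUIRED_EVENT_TYPES and not record["completed"]:
        record["completed"] = True
        return True
    return False


def meter_costs(task_arn):
    """Add a completed task's cost to the monthly and all-time aggregates."""
    record = tasks[task_arn]
    cost_summary[record["month"]] += record["cost"]
    cost_summary["all-time"] += record["cost"]


def handle_event(task_arn, event_type, attributes):
    """Process one event; meter the cost only when the record completes."""
    if record_event(task_arn, event_type, attributes):
        meter_costs(task_arn)


# Task A: events arrive in order, followed by one duplicate delivery.
handle_event("task-a", "api_call", {"user": "jane-doe"})
handle_event("task-a", "state_change", {"cost": Decimal("10.30"), "month": "2023-05"})
handle_event("task-a", "state_change", {"cost": Decimal("10.30"), "month": "2023-05"})

# Task B: the state change arrives before the API call.
handle_event("task-b", "state_change", {"cost": Decimal("0.15"), "month": "2023-05"})
handle_event("task-b", "api_call", {"user": "jane-doe"})

print(dict(cost_summary))
```

The duplicate delivery for task A does not add its cost a second time, and task B is metered even though its events arrive in reverse order.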

The services that the solution uses are billable, so running it will incur a small cost in your account. Charges for the monitoring and alerting functionality in CloudWatch will amount to a few dollars per month. Compute and storage for processing and recording information about your quantum tasks will amount to a few cents for every 1000 tasks. Parts of these costs might even fit in the AWS Free Tier.

Setup and installation

We implemented the solution as an AWS CDK app in Python. You can specify the configuration parameters in the CDK context. Installing and setting up this solution in your AWS account only requires a few simple steps.

If you don’t have an AWS account, or you are an AWS user but haven’t used Braket before, you can follow steps 1 – 4 of this tutorial. After you have completed this setup, you should have: 1) an AWS account; 2) an AWS access key ID and a secret access key; 3) the Braket service and third-party QPUs enabled; and 4) the AWS Command Line Interface (CLI) installed and configured to allow programmatic access to AWS resources.

Verify that you have permissions to create a CloudFormation stack and the resources defined in this solution. Also, make sure to install the CDK Toolkit, Python 3, and the Docker CLI.

Next, clone the solution repository from GitHub:

$ git clone git@github.com:aws-samples/cost-control-for-amazon-braket.git

To create a virtual environment for the Python project, make sure that there is a python3 executable in your path with access to the venv package and execute:

$ python3 -m venv .venv

After the initialization process completes you can activate your virtual environment with:

$ source .venv/bin/activate

Now you can install the required dependencies:

$ pip3 install -r requirements.txt

Next, edit the file cdk.json and update the following context parameters:

  • awsAccountId: Your 12-digit AWS account ID.
  • notificationEmailAddress: The email address that alarm notifications will be sent to.
  • allTimeCostLimit: Your quantum task budget limit since initial deployment in USD.
  • monthlyCostLimit: Your quantum task budget limit per month in USD.

Now bootstrap your AWS environment for the deployment of the CDK app:

$ python3 bootstrap.py

Finally, deploy the solution with:

$ cdk deploy --all

After the deployment completes you can access the solution dashboard. Sign in to the AWS Management Console and open the Amazon CloudWatch console. Select the Dashboards menu in the navigation pane on the left and find a custom dashboard named AmazonBraketCostControl (see Figure 3). Click on the name to access the dashboard. Note that it takes several minutes for the alarm states to be evaluated correctly and that most widgets will not have data to display until you start creating quantum tasks.

Figure 3 - The Dashboards menu in the Amazon CloudWatch console.

You are now set. The costs of all quantum tasks you create in your AWS account from now on will be recorded, and you will be notified when the costs reach the budget limits you have defined. When you want to change a configuration parameter or adapt other functionality after the initial deployment, you can simply re-deploy the solution with cdk deploy --all. Updates won’t affect recorded cost data in DynamoDB and CloudWatch.

How to prevent task costs from exceeding budget limits

An optional feature of the cost control solution helps you prevent task costs from exceeding your budget limits by automatically revoking user permissions for the creation of new quantum tasks. Three configuration parameters in the file cdk.json control this feature: iamRoleNamesToControl, iamGroupNamesToControl, and iamUserNamesToControl. By default, these parameters are empty ("iamRoleNamesToControl": [], "iamGroupNamesToControl": [], "iamUserNamesToControl": []), meaning you’ll only be notified when budget limits are reached.

Now, let’s suppose you are an admin for the AWS account 111111111111, and your email address is admin@example.com. You have created an IAM group named braket-users containing several individual users who work on a project with Braket. Another user jane-doe experiments with Braket on a different project. All users interact with Braket through the SDK from their local environments and through managed notebook instances which have an execution role named braket-users-notebook-execution-role. To execute Braket Hybrid Jobs, they use the execution role named braket-users-job-execution-role. You want to prevent these users from collectively spending more than $5000 in total and $1000 per month on quantum tasks. In this case, the configuration parameters in cdk.json should look like:

{
  ...
  "context": {
    ...
    "awsAccountId": "111111111111",
    "notificationEmailAddress": "admin@example.com",
    "allTimeCostLimit": "5000",
    "monthlyCostLimit": "1000",
    "iamRoleNamesToControl": [
        "braket-users-notebook-execution-role",
        "braket-users-job-execution-role"
    ],
    "iamGroupNamesToControl": ["braket-users"],
    "iamUserNamesToControl": ["jane-doe"],
    ...
  }
}
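Before deploying, it can help to sanity-check these values. The helper below is a hypothetical pre-deployment check, not part of the solution; it assumes only the parameter names shown above and that limits are given as numeric strings.

```python
import re

IAM_NAME_LIST_KEYS = (
    "iamRoleNamesToControl",
    "iamGroupNamesToControl",
    "iamUserNamesToControl",
)


def validate_context(context: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not re.fullmatch(r"\d{12}", str(context.get("awsAccountId", ""))):
        problems.append("awsAccountId must be a 12-digit account ID")
    if "@" not in str(context.get("notificationEmailAddress", "")):
        problems.append("notificationEmailAddress must be an email address")
    for key in ("allTimeCostLimit", "monthlyCostLimit"):
        try:
            if float(context[key]) <= 0:
                problems.append(f"{key} must be greater than zero")
        except (KeyError, TypeError, ValueError):
            problems.append(f"{key} must be a number")
    for key in IAM_NAME_LIST_KEYS:
        names = context.get(key, [])
        if not isinstance(names, list) or not all(isinstance(n, str) for n in names):
            problems.append(f"{key} must be a list of IAM names")
    return problems


example_context = {
    "awsAccountId": "111111111111",
    "notificationEmailAddress": "admin@example.com",
    "allTimeCostLimit": "5000",
    "monthlyCostLimit": "1000",
    "iamRoleNamesToControl": [
        "braket-users-notebook-execution-role",
        "braket-users-job-execution-role",
    ],
    "iamGroupNamesToControl": ["braket-users"],
    "iamUserNamesToControl": ["jane-doe"],
}
print(validate_context(example_context))
```

To check a real file, you could load cdk.json with json.load and pass its "context" entry to validate_context.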

As soon as one of the two budget limits is reached, the solution will attach a policy to the four IAM identities which explicitly denies their permission to create new quantum tasks. After the policy becomes effective, any attempt to create a quantum task by jane-doe, by users in the group braket-users, or by users of notebook instances with the execution role braket-users-notebook-execution-role will fail with an AccessDenied exception. Tasks that are already queued or running won’t be affected. Braket Hybrid Jobs created with the execution role braket-users-job-execution-role will continue to execute but can’t create new quantum tasks either. Depending on how the job script handles the exception, the job may fail or abort in a controlled way. Note that only the creation of tasks is affected, not the creation of jobs. Users can still create new jobs to execute hybrid workloads that do not create tasks but, for example, use embedded simulators.
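For orientation, a deny policy of this kind could look like the sketch below, together with the attach-or-detach decision based on alarm states. The policy document is an assumption about its general shape rather than the exact policy the solution deploys, and the desired_action helper and the alarm names are hypothetical. The helper treats every state other than ALARM as normal, which is a simplification.

```python
import json

# Assumed shape of the deny policy; the solution's actual policy may differ.
DENY_CREATE_TASK_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": "braket:CreateQuantumTask",
            "Resource": "*",
        }
    ],
}


def desired_action(alarm_states: dict) -> str:
    """Return "attach" while any cost alarm is in ALARM state, else "detach"."""
    in_alarm = any(state == "ALARM" for state in alarm_states.values())
    return "attach" if in_alarm else "detach"


# Hypothetical alarm names mapped to CloudWatch alarm states.
print(desired_action({"monthly-cost": "ALARM", "all-time-cost": "OK"}))
print(desired_action({"monthly-cost": "OK", "all-time-cost": "OK"}))
print(json.dumps(DENY_CREATE_TASK_POLICY, indent=2))
```

Because an explicit Deny overrides any Allow in IAM policy evaluation, attaching such a policy blocks task creation without modifying the identities' existing permissions, and detaching it restores them.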

Although the solution processes events in near real-time, actions need some time to propagate. Therefore, it is possible that a few new quantum tasks are created after a budget limit is exceeded and before permissions are revoked. If you need to stay within hard budget limits, you may want to mitigate this by configuring the solution with slightly lower limits.
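One way to choose a lower limit is to subtract a safety margin that covers the tasks that might slip through before permissions are revoked. The sketch below is a back-of-the-envelope calculation; the function name and the example figures for the largest expected task cost and the number of tasks in flight are assumptions you would replace with estimates for your own workloads.

```python
from decimal import Decimal


def limit_with_margin(hard_budget: Decimal, max_task_cost: Decimal,
                      tasks_in_flight: int) -> Decimal:
    """Lower the configured limit by the worst-case cost of late tasks."""
    margin = max_task_cost * tasks_in_flight
    if margin >= hard_budget:
        raise ValueError("safety margin consumes the whole budget")
    return hard_budget - margin


# Assumed figures: $1000 hard monthly budget, at most $10.30 per task, and up
# to 5 tasks created before the deny policy takes effect.
monthly_limit = limit_with_margin(Decimal("1000"), Decimal("10.30"), 5)
print(monthly_limit)
```

The resulting value would go into monthlyCostLimit (or allTimeCostLimit) in cdk.json in place of the hard budget.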

Conclusion and next steps

In this blog post, we introduced an open-source Amazon Braket cost control solution which helps you monitor and control the cost of your quantum tasks in Braket. Note that the solution doesn’t take into account the costs of managed notebook and hybrid job instances. You can test the solution by executing a quantum experiment on QPUs or managed simulators, for instance by running the script create_quantum_tasks.py.

As an individual researcher or an account administrator, you can use this solution to record and aggregate the costs of quantum tasks created in your AWS account without requiring further actions. Furthermore, you can set alerts and even revoke permissions from users to prevent them from creating additional quantum tasks once your budget is met.

You can easily adapt the solution to your specific requirements, for example to aggregate costs by different time units or other criteria. If you are an administrator of a multi-account environment managed by AWS Organizations and you wish to aggregate Braket task cost data across AWS accounts, you would have to send the relevant events emitted in the AWS accounts of interest to the custom event bus of this solution, as described in the Amazon EventBridge user guide.

Give this solution a try and send us your feedback, or raise a GitHub issue or pull request with suggestions on how to further improve it to meet your needs.