AWS Quantum Technologies Blog

Tracking quantum experiments with Amazon Braket Hybrid Jobs and Amazon SageMaker Experiments

Hybrid quantum-classical algorithms are promising applications of near-term quantum devices with a limited number of qubits and imperfect quantum gates. Similar to machine learning (ML) models, these heuristic quantum algorithms are trained iteratively. Generally, the free parameters of the algorithm (or model) need to be fine-tuned over several trial runs – a process called hyperparameter optimization in ML.

This approach comes with challenges around reproducibility and experiment management. Luckily, we can draw from best practices in ML that have been developed over the last 20 years.

In this blog post, we show how you can track, manage and organize hybrid quantum-classical experiments with the Hybrid Jobs feature in Amazon Braket and Amazon SageMaker Experiments, used by ML developers on AWS to organize their experiments and trials.

Hybrid quantum algorithms

Hybrid quantum algorithms combine quantum computation with classical compute resources. They apply to a variety of problems ranging from combinatorial optimization to quantum machine learning. The idea is to use a quantum algorithm to solve parts of the problem that are notoriously hard on a classical computer. But instead of using fixed parameters for the quantum algorithm, we let a classical algorithm “learn” the optimal parameters. As such, it is closely related to approaches used in machine learning where you use an algorithm to decide on optimal model parameters. An example of a hybrid quantum algorithm that we use in this blog is the variational quantum classifier (VQC), where data points are encoded in a quantum state and a parametrized class of measurements or operations are used to separate them into the different classes.
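As a toy illustration of this idea (not the actual VQC from the tutorial), a single-qubit variational classifier can be sketched in plain NumPy: the feature is encoded as a rotation angle, a trainable rotation follows, and the sign of the Pauli-Z expectation value decides the class.

```python
import numpy as np

def ry(angle):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

def classify(x, theta):
    """Encode feature x as a rotation, apply a trainable rotation theta,
    and classify by the sign of the Pauli-Z expectation value."""
    state = ry(theta) @ ry(x) @ np.array([1.0, 0.0])  # start in |0>
    expval_z = state[0] ** 2 - state[1] ** 2          # <Z> for real amplitudes
    return 1 if expval_z >= 0 else -1
```

A classical optimizer would adjust theta to separate the classes; in the real VQC, the circuit acts on multiple qubits and consists of repeated layers, and the number of layers is exactly the hyperparameter we tune below.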

Hybrid algorithms are often integrated in a larger end-to-end workflow, with steps such as data pre- and post-processing, and they may depend on hyperparameters such as the depth of quantum circuits or the learning rate. Finding the best parameters for each step is vital to optimize metrics such as accuracy or run-time and often requires extensive experimentation. Tracking the preparation, training and evaluation of these experiments can quickly become cumbersome and slow down development.

This problem of experiment tracking is well-known in the machine learning community and developers on AWS can use SageMaker Experiments to address it. The goal of SageMaker Experiments is to make it as simple as possible to run systematic machine learning experiments and run analytics across the artifacts and metadata they generate. In this blog post, we show how SageMaker Experiments can also be used to tackle the following three main challenges of running quantum experiments:

  1. It is hard to systematically organize large numbers of experiments and keep track of the adjustments that actually improved the algorithm. With SageMaker Experiments, you can organize an experiment as a collection of trials, where each trial corresponds to one particular adjustment made to the algorithm, such as a specific value of the learning rate. You can compare, evaluate and save them as a group.
  2. Each trial may include many parameter settings and artifacts that need to be tracked to achieve reproducibility of experiments. With SageMaker Experiments you can keep track of all the required information. You can associate a trial component to each stage in a trial, such as data pre-processing or hyperparameter tuning. A trial component can include a combination of inputs such as datasets, algorithms, and parameters, and produce specific outputs such as metrics, datasets, and checkpoints.
  3. Collaborating on experiments can be hard. With SageMaker Experiments you can share all details of your experiments with others or even let multiple people collaborate on the same experiment.

In the remainder of this blog, we show these features for a specific example of a hybrid quantum experiment.


The workflow to optimize quantum experiments is very similar to a machine learning workflow. We demonstrate the iterative task of fine-tuning the hyperparameters of a simple hybrid quantum algorithm using SageMaker Experiments. We train a variational quantum classifier (VQC) on a variant of the classic iris data set. The code is based on the iris classification example in the VQC tutorial by PennyLane and adapted to run using Braket Hybrid Jobs.

Braket Hybrid Jobs already enables users to train classifiers with different hyperparameters, as demonstrated in the amazon-braket-examples repo on GitHub. However, jobs associated with different sets of hyperparameters are all logged independently and can, for instance, be accessed through the Braket console. SageMaker Experiments provides a way for you to organize your experiments in a structured manner with trials and trial components, customize the information you want to log, and compare and analyze different runs. This enables you to obtain and reproduce the best results quickly and efficiently (see Figure 1).

Figure 1: Key metrics and parameters of an experiment consisting of three trials. Each trial corresponds to one run of the algorithm with a different number of layers.



For this example, you should have the following prerequisites:

We’ve structured our walkthrough as follows:

  • Step 1: Setting up the environment for experiment tracking
  • Step 2: Problem example and pre-processing
  • Step 3: Running the experiment and tracking it with SageMaker Experiments
  • Step 4: Evaluation of the experiment using SageMaker Experiments
  • Step 5: Cleaning up

You can find the full example in GitHub.

Step 1: Setting up the environment for experiment tracking

The example notebook has to run on a Braket Notebook instance that has access to SageMaker Experiments functionality. The following steps show how you can set up a SageMaker notebook with the required permissions:

  1. Create an S3 bucket and assign it a name that starts with amazon-braket-...
  2. Go to the Braket console and create a Notebook in your preferred region, where Braket is available. Instance size ml.t3.medium with standard settings is sufficient for the example. Once this Notebook instance has started it will also be accessible from the SageMaker console.
  3. Select your new Braket Notebook instance in the SageMaker console and open its settings.
  4. Go to “Permissions and Encryption” and click on “IAM Role ARN”.
  5. Click on “Add Permission”, then “Create Inline Policy”.
  6. Copy and paste the following policy and give it a name.
    "Version": "2012-10-17",
    "Statement": [
            "Sid": "searchpermissions",
            "Effect": "Allow",
            "Action": [
            "Resource": "*"
            "Sid": "experimentpermissions",
            "Effect": "Allow",
            "Action": [
            "Resource": "arn:aws:sagemaker:*:*:*amazon-braket-*"
            "Sid": "trialpermissions",
            "Effect": "Allow",
            "Action": [
            "Resource": "arn:aws:sagemaker:*:*:*amazon-braket-*"
            "Sid": "trialcomponentpermissions",
            "Effect": "Allow",
            "Action": [
            "Resource": "arn:aws:sagemaker:*:*:experiment-*"

Following the best practice of not granting wildcard (`*`) permissions, we only allow creating, deleting, and modifying Amazon SageMaker experiments and trials whose names start with amazon-braket. The only exception is the read-only sagemaker:Search action, for which we must specify all resources (`*`) to which the policy applies in the Resource section of the policy statement. More details are provided in the SageMaker documentation. We have to keep these naming conventions in mind when we create the resources that follow, and also when we create experiments and trials below.

After you have completed the setup above, access the Notebook instance, download the repository containing the example notebook, and open the example notebook “iris-quantum-experiment-tracking.ipynb” with the conda_braket kernel.

In the example notebook, the requirements can be installed by running pip install -r requirements.txt

Step 2: Problem example and pre-processing

For demonstration purposes we use the simple iris classification example by PennyLane, along with the data and algorithm provided there. In this blog we focus on the experiment tracking and refer to the tutorial for details on the algorithm itself. The goal is to predict, from a two-dimensional feature set, a binary class label that is represented by the color in the graph below:

Figure 2: The distribution of classes in feature space. The color/shape of the data points indicates the class.


In the following sections we show some of the key steps performed, but please refer to the example notebook for details. Note also that we use a simulator instead of a real quantum processing unit (QPU) for this demo. Please refer to the documentation for more information on the different QPU devices available in Braket, and their pricing.

Step 3: Running the experiment and tracking it with SageMaker Experiments

To begin tracking our experiment we need to create the overarching experiment and define the experiment name and description:

from datetime import datetime

experiment_name = f"amazon-braket-my-first-vqc-experiment-{datetime.now().strftime('%Y-%m-%d-%H-%M-%S')}"

vqc_experiment = Experiment.create(
    experiment_name=experiment_name,
    description="Tracking my first VQC experiment using Amazon Sagemaker Experiments",
)

Before we can apply the VQC to the data, a pre-processing step is required in which the feature set X is translated into single-qubit rotations. We also split the sample into a training and a validation set, denoted feats_train and feats_val. For experiment tracking, we store the raw and pre-processed data in an Amazon S3 bucket and save the path locations in the parameters raw_data_path and processed_data_path for later use. Please refer to the example notebook for details. Following machine learning best practices, we track all the parameters used in the data pre-processing, such as the training and validation sizes train_size and val_size, and the seed. We can decide later what we need to display; backfilling this data after running our experiments, however, is cumbersome and often impossible.
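The pre-processing step can be sketched as follows. This is an illustrative stand-in (random features and a simple min-max rescaling into rotation angles, with hypothetical values for seed, train_size, and val_size); the actual encoding lives in the example notebook:

```python
import numpy as np

seed, train_size, val_size = 42, 75, 25  # hypothetical values, logged for reproducibility
rng = np.random.default_rng(seed)

# Stand-in for the two-dimensional iris feature set
X = rng.uniform(0.0, 1.0, size=(train_size + val_size, 2))

# Rescale each feature into [0, pi] so it can serve as a single-qubit rotation angle
feats = np.pi * (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Split into training and validation sets
perm = rng.permutation(len(feats))
feats_train, feats_val = feats[perm[:train_size]], feats[perm[train_size:]]
```

Because the split depends on the seed, logging the seed together with train_size and val_size is exactly what makes the split reproducible later.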

This realizes two of the main benefits of experiment tracking mentioned at the beginning: full reproducibility of the results and consistent tracking of all iterations. All resources relevant to the experiment are tracked, so if you ever need to rerun an experiment or retrieve some additional data, you can.

We can use the SageMaker Experiments Tracker class to achieve this, which we import via from smexperiments.tracker import Tracker. Note that this class is not related to the Amazon Braket Tracker class used for cost tracking.

with Tracker.create(display_name="pre-processing") as preproc_tracker:
    preproc_tracker.log_parameters(
        {"train_size": train_size, "val_size": val_size, "seed": seed}
    )
    preproc_tracker.log_input(name="raw data", media_type="s3/uri", value=raw_data_path)
    preproc_tracker.log_output(
        name="preprocessed data", media_type="s3/uri", value=processed_data_path
    )

Next, we train the VQC and perform hyperparameter optimization. Our goal is to explore how the number of layers in the VQC affects the classification performance.

For our current experiment we run trials for different values of num_layers. We run a Braket Hybrid Job for each selection of num_layers. Each run constitutes one trial of our experiment. We track all hyperparameters and the corresponding job ARN. As in the preprocessing step, this enables us to fully reproduce our trials at a later point and investigate the artifacts that they created. We will later use this data to access all metrics and datasets of each trial.

We start by defining the hyperparameters that are constant over all trials:

fixed_hyperparameters = {
    "seed": seed,
    "stepsize": 0.015,
    "num_iterations": 30,
    "batchsize": 5,
}

For each trial, and thus, num_layers we first add the trial to the experiment. We use the current datetime to distinguish the trial and avoid reoccurring trial names:

trial = Trial.create(
    trial_name=f"amazon-braket-trial-{datetime.now().strftime('%Y-%m-%d-%H-%M-%S')}",
    experiment_name=vqc_experiment.experiment_name,
)

For each trial, we can then put all hyperparameters, the fixed_hyperparameters and those specific to the trial, into a dictionary called hyperparameters, and track them:

with Tracker.create(display_name="hyperparameters") as hyperparameters_tracker:
    hyperparameters_tracker.log_parameters(hyperparameters)

Next, we create a Braket Hybrid Job using the VQC and track the job ARN used to identify the results of the different trials:

job = AwsQuantumJob.create(
    device="arn:aws:braket:::device/quantum-simulator/amazon/sv1",  # on-demand simulator
    source_module="vqc",  # training-script module from the example notebook (name illustrative)
    hyperparameters=hyperparameters,
)

# Track the job_arn
with Tracker.create(display_name="job-arn") as job_arn_tracker:
    job_arn_tracker.log_output("job_arn", job.arn)

Finally, we add all the tracked information to the trial:

trial.add_trial_component(preproc_tracker.trial_component)
trial.add_trial_component(hyperparameters_tracker.trial_component)
trial.add_trial_component(job_arn_tracker.trial_component)
Running the previous lines for different numbers of layers num_layers collects all the required information about the experiments, including the results, in order to collectively analyze them. It might take a few minutes until the quantum job is completed and you can proceed with the evaluation of the experiments.
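In plain Python, this trial loop amounts to merging the fixed settings with the trial-specific value; the Trial, Tracker, and AwsQuantumJob calls from the previous steps are elided here, and the seed value is hypothetical:

```python
fixed_hyperparameters = {
    "seed": 42,  # hypothetical seed value
    "stepsize": 0.015,
    "num_iterations": 30,
    "batchsize": 5,
}

tracked_trials = []
for num_layers in [2, 3, 4]:  # one trial per layer count
    # Merge the fixed settings with the trial-specific hyperparameter
    hyperparameters = {**fixed_hyperparameters, "num_layers": num_layers}
    # ... create the Trial, the Trackers, and the AwsQuantumJob here, as shown above ...
    tracked_trials.append(hyperparameters)
```

Each dictionary produced this way is exactly what the hyperparameters tracker logs for its trial.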

Note also that storing this metadata is handled by SageMaker Experiments internally. This incurs costs after the Amazon SageMaker Free Tier is exceeded.

Step 4: Evaluation of the experiment using SageMaker Experiments

You can use the SageMaker client to search through all of your past experiments. The search can be tailored using search expressions (see the SageMaker documentation for more details). We search for all experiments containing amazon-braket-my-first-vqc-experiment in their name.

search_expression = {
    "Filters": [
        {
            "Name": "DisplayName",
            "Operator": "Contains",
            "Value": "amazon-braket-my-first-vqc-experiment",
        }
    ]
}
results = sm.search(Resource="Experiment", SearchExpression=search_expression)

Depending on how many experiments you have run, this may return a long list of entries like the one below, where our current experiment is one entry.

{'Results': [{'Experiment': {'ExperimentName': 'amazon-braket-my-first-vqc-experiment-2023-03-05-19-22-49',
    'ExperimentArn': 'arn:aws:sagemaker:us-east-1:ACCOUNT_ID:experiment/amazon-braket-my-first-vqc-experiment-2023-03-05-19-22-49',
    'DisplayName': 'amazon-braket-my-first-vqc-experiment-2023-03-05-19-22-49',
    'Description': 'Tracking my first VQC experiment using Amazon Sagemaker Experiments',
    'CreationTime': datetime.datetime(2023, 3, 5, 19, 22, 49, tzinfo=tzlocal()),
    'CreatedBy': {},
    'LastModifiedTime': datetime.datetime(2023, 3, 5, 19, 22, 49, tzinfo=tzlocal()),
    'LastModifiedBy': {}}}]}

Note that you can go back to your experiment details at a later point. We can use the higher-level ExperimentAnalytics class from the sagemaker.analytics module to get more information on our last experiment in DataFrame format.

trial_component_analytics = ExperimentAnalytics(
    sagemaker_session=Session(sess, sm),
    experiment_name=experiment_name,
)

df_exp = trial_component_analytics.dataframe()

This DataFrame contains all the metadata stored in this experiment.

Figure 3: A Pandas DataFrame showing all the information collected during the experiment.


We can also filter this dataframe for all the information we collected during pre-processing:
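Using a hypothetical stand-in for df_exp (the real frame comes from ExperimentAnalytics, with one row per trial component), the filter is a standard pandas row selection on the tracker display name:

```python
import pandas as pd

# Hypothetical stand-in for df_exp; column names are illustrative
df_exp = pd.DataFrame(
    {
        "DisplayName": ["pre-processing", "hyperparameters", "job-arn"],
        "train_size - Value": [75.0, None, None],
    }
)

# Keep only the trial components logged by the "pre-processing" tracker
df_preproc = df_exp[df_exp["DisplayName"] == "pre-processing"]
```

The display name matches the one passed to Tracker.create(display_name="pre-processing") earlier.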

Figure 4: A Pandas DataFrame showing all the information collected during the pre-processing step of a trial.


It can make sense to write custom functions that extract specific information from df_exp. We create our own function that takes df_exp as input and computes and compares key metrics about the model performance for the different trials. We provide the details of the function in the example notebook. There are two main steps:

First, we calculate the metrics for the different trials by using the one-to-one correspondence between the Braket Hybrid Job ARN and the trials to load up all the Braket Hybrid Jobs:

TRIAL_TO_JOB = {
    trials[0]: AwsQuantumJob(arn)
    for _, trials, arn in df_exp[df_exp.DisplayName == "job-arn"][
        ["Trials", "job_arn - Value"]
    ].to_records()
}

Then we extract performance metrics from the reloaded job object and merge them with the hyperparameters tracked as trial components. After some formatting, we get the Pandas DataFrame in Figure 1.

From the data in Figure 1, we can see that taking four layers yields the best final validation accuracy. However, looking only at the final validation accuracy can be deceiving. Let’s dive deeper and investigate how the validation accuracy evolved with the optimization iterations within each trial. Again, this information can be accessed using ExperimentAnalytics (see the example notebook for details).
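With the merged DataFrame in hand, picking the trial with the best final validation accuracy is a one-line selection. The accuracy values below are illustrative (made up); only the conclusion that four layers wins comes from the experiment:

```python
import pandas as pd

# Illustrative final validation accuracies, one row per trial (values made up)
results = pd.DataFrame(
    {"num_layers": [2, 3, 4], "final_val_accuracy": [0.80, 0.85, 0.90]}
)

# Row with the highest final validation accuracy
best = results.loc[results["final_val_accuracy"].idxmax()]
```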

Figure 5: Validation accuracy as a function of the number of iterations for three trials with increasing numbers of layers.


In Figure 5 we can see that, with a few exceptions, the validation accuracy for four layers is the highest throughout training. This suggests that four layers is indeed the best of all trialed choices. However, we can also see that the validation accuracy for all choices of num_layers is still increasing. This suggests that we might want to run another set of trials with a larger number of iterations (num_iterations) to confirm the trend and potentially increase the performance of the VQC.

Step 5: Cleaning up

Follow these steps to avoid any future costs:

  • Run vqc_experiment.delete_all(action="--force") to delete all trials and trial components associated with the experiment created above
  • Delete all datasets created and stored as part of this notebook
  • Stop and delete your notebook instance


In this blog post we showed how you can use SageMaker Experiments with Braket to apply machine learning best practices to quantum computing research and boost your team’s productivity. Amazon SageMaker Experiments enables you to organize your experiments, achieve reproducibility, and collaborate on them effectively as a team. To learn more about how Braket Hybrid Jobs helps you run hybrid quantum-classical algorithms, visit the user guide. Note also that many more features and best practices from machine learning can be directly applied to quantum computing. Stay tuned and happy experimenting!