Managing data confidentiality for Scope 3 emissions using AWS Clean Rooms
Scope 3 emissions are indirect greenhouse gas emissions that result from a company’s activities but occur outside the company’s direct control or ownership. Measuring these emissions requires collecting data from a wide range of external sources, like raw material suppliers, transportation providers, and other third parties. One of the main challenges with Scope 3 data collection is ensuring data confidentiality when sharing proprietary information between third-party suppliers. Organizations are hesitant to share information that competitors could use. This hesitancy makes it difficult for companies to measure and report their Scope 3 emissions accurately, which in turn limits their ability to manage climate-related impacts and risks.
In this blog post, we show how to use AWS Clean Rooms to share Scope 3 emissions data between a reporting company and two of their value chain partners (a raw material supplier of purchased goods and a transportation provider). Each organization specifies its data confidentiality requirements before participating in the AWS Clean Rooms collaboration (see Figure 1).
Each account has confidential data described as follows:
- Column 1 lists the raw material Region of origin. This is business confidential information for the supplier.
- Column 2 lists the emission factors at the raw material level. This is sensitive information for the supplier.
- Column 3 lists the mode of transportation. This is business confidential information for the transportation provider.
- Column 4 lists the emissions in transporting individual items. This is sensitive information for the transportation provider.
- Column 5 lists the product recipe at the ingredient level. This is trade secret information for the reporting company.
Overview of solution
In this architecture, AWS Clean Rooms is used to analyze and collaborate on emission datasets without sharing, moving, or revealing underlying data to collaborators (shown in Figure 2).
Three AWS accounts are used to demonstrate this approach. The Reporting Account creates a collaboration in AWS Clean Rooms and invites the Purchased Goods Account and Transportation Account to join as members. Each account contributes data directly from Amazon Simple Storage Service (Amazon S3) through AWS Glue tables and can protect its underlying data with privacy-enhancing controls.
The Purchased Goods Account includes users who can update the purchased goods bucket. Similarly, the Transportation Account has users who can update the transportation bucket. The Reporting Account can run SQL queries on the configured tables. AWS Clean Rooms only returns results complying with the analysis rules set by all participating accounts.
For this walkthrough, you should have the following prerequisites:
- Three AWS accounts in the same AWS Region
- An Amazon S3 bucket in each account with emissions data (see Figure 1)
- An AWS Glue Data Catalog for the emissions data stored in each S3 bucket
We configured the S3 buckets for each AWS account as follows:
- Reporting Account: reportingcompany.csv
- Purchased Goods Account: purchasedgoods.csv
- Transportation Account: transportation.csv
Create an AWS Glue Data Catalog for each S3 data source following the method in the Glue Data Catalog Developer Guide. The AWS Glue tables should match the schema detailed previously in Figure 1, for each respective account (see Figure 3).
Data consumers can be configured to ingest, analyze, and visualize query results (refer back to Figure 2). We name the Reporting Account Glue database “reporting-db” and the Glue table “reporting.” Likewise, the Purchased Goods Account uses “purchase-db” and “purchase.”
Additional actions are recommended to secure each account in a production environment. To configure encryption, AWS Identity and Access Management (IAM) roles, and Amazon CloudWatch, review the Further Reading section at the end of this post.
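As an alternative to a crawler, a Glue table for a CSV file can also be described explicitly. The sketch below only builds the `TableInput` dictionary for the purchased goods table; it makes no AWS call. The bucket name, the `unit` column, and every column type are assumptions, since the Figure 1 schema is not reproduced in the text.

```python
def build_purchase_table_input(bucket, prefix):
    """Build a Glue TableInput dict for the purchased goods CSV (assumed schema)."""
    # "item", "origin_region" and "s3_upstream_purchased_good" appear in the
    # walkthrough; "unit" and all column types are assumptions.
    columns = [
        {"Name": "item", "Type": "string"},
        {"Name": "origin_region", "Type": "string"},
        {"Name": "s3_upstream_purchased_good", "Type": "double"},
        {"Name": "unit", "Type": "string"},
    ]
    return {
        "Name": "purchase",
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "csv", "skip.header.line.count": "1"},
        "StorageDescriptor": {
            "Columns": columns,
            "Location": f"s3://{bucket}/{prefix}",
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
                "Parameters": {"field.delim": ","},
            },
        },
    }


# Hypothetical bucket name and prefix, for illustration only.
table_input = build_purchase_table_input("example-purchased-goods-bucket", "data/")
print(table_input["StorageDescriptor"]["Location"])

# With credentials for the Purchased Goods Account, the table could then be
# registered with something like:
#   import boto3
#   boto3.client("glue").create_table(DatabaseName="purchase-db", TableInput=table_input)
```

Keeping the request as a plain dictionary makes it easy to review the schema with each data owner before anything is created in the catalog.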
This walkthrough consists of four steps:
- The Reporting Account creates the AWS Clean Rooms collaboration and invites the Purchased Goods Account and Transportation Account to share data.
- The Purchased Goods Account and Transportation Account accept this invitation.
- Each collaboration account applies rules that restrict how its data is shared with the other AWS Clean Rooms collaboration accounts.
- The SQL query is created and run in the Reporting Account.
1. Create the AWS Clean Rooms collaboration in the Reporting Account
(The steps covered in this section require you to be logged into the Reporting Account.)
- Navigate to the AWS Clean Rooms console and click Create collaboration.
- In the Details section, type “Scope 3 Clean Room Collaboration” in the Name field.
- Scroll to the Member 1 section. Enter “Reporting Account” in the Member display name field.
- In the Member 2 section, enter “Purchased Goods Account” for your first collaboration member name, with their account number in the Member AWS account ID box.
- Click Add another member and add “Transportation Account” as the third collaborator with their AWS account number.
- Choose the “Reporting Account” as the Member who can query and receive results in the Member abilities section. Click Next.
- Select Yes, join by creating membership now. Click Next.
- Verify the collaboration settings on the Review and Create page, then select Create and join collaboration and create membership.
Both invited accounts will then receive an invitation to join the collaboration (see Figure 4). The console shows each member's status as “Invited” until the invitation is accepted. Next, we will show how the invited members apply query restrictions on their data.
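The same collaboration could also be created programmatically. The following is a minimal sketch that builds the request for the boto3 `cleanrooms` client's `create_collaboration` call without sending it. The account IDs are placeholders, and the description text and the `queryLogStatus` value are assumptions that the console steps above do not specify.

```python
def build_collaboration_request(purchased_goods_account_id, transportation_account_id):
    """Build keyword arguments for cleanrooms.create_collaboration (not sent here)."""
    return {
        "name": "Scope 3 Clean Room Collaboration",
        # Assumption: the walkthrough does not give a description.
        "description": "Scope 3 emissions collaboration",
        "creatorDisplayName": "Reporting Account",
        # Only the Reporting Account can query and receive results.
        "creatorMemberAbilities": ["CAN_QUERY", "CAN_RECEIVE_RESULTS"],
        "members": [
            {
                "accountId": purchased_goods_account_id,
                "displayName": "Purchased Goods Account",
                "memberAbilities": [],
            },
            {
                "accountId": transportation_account_id,
                "displayName": "Transportation Account",
                "memberAbilities": [],
            },
        ],
        # Assumption: query logging is left off in this sketch.
        "queryLogStatus": "DISABLED",
    }


# Placeholder 12-digit account IDs, for illustration only.
request = build_collaboration_request("111122223333", "444455556666")
print([member["displayName"] for member in request["members"]])

# With credentials for the Reporting Account:
#   import boto3
#   boto3.client("cleanrooms").create_collaboration(**request)
```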
2. Accept invitations and configure table collaboration rules
Steps in this section are applied to the Purchased Goods Account and Transportation Account following collaboration environment setup. For brevity, we will demonstrate steps using the Purchased Goods Account. Differences for the Transportation Account are noted.
- Log in to the Purchased Goods Account to accept the collaboration invitation.
- Open the AWS Clean Rooms console and select Collaborations on the left-hand navigation pane, then click Available to join.
- You will see an invitation from the Scope 3 Clean Room Collaboration. Click on Scope 3 Clean Room Collaboration and then Create membership.
- Select Tables, then Associate table. Click Configure new table.
The next action is to associate the Glue table created from the purchasedgoods.csv file. This sequence restricts access to the origin_region column (transportation_mode for the Transportation Account table) in the collaboration.
- In the Scope 3 Clean Room Collaboration, select Configured tables in the left-hand pane, then Configure new table. Select the AWS Glue table associated with purchasedgoods.csv (shown in Figure 5).
- Select the AWS Glue Database (purchase-db) and AWS Glue Table (purchase).
- Verify the correct table selection by toggling the View schema from AWS Glue slider.
- In the Columns allowed in collaboration section, select all fields except for origin_region. This action prevents the origin_region column from being accessed or viewed in the collaboration.
- Complete this step by selecting Configure new table.
- Select Configure analysis rule (see Figure 6).
- Select Aggregation type then Next.
- Select SUM as the Aggregate function and s3_upstream_purchased_good for the column.
- Under Join controls, select Specify Join column. Select “item” from the list of options. This permits SQL join queries to execute on the “item” column. Click Next.
- The next page specifies the aggregation constraint: the minimum number of distinct values of a column that each output row must aggregate before it is returned. Select “item” for Column name and “2” for the Minimum number of distinct values. Click Next.
- To confirm the table configuration query rules, click Configure analysis rule.
- The final step is to click Associate to collaboration and select Scope 3 Clean Room Collaboration in the pulldown menu. Select Associate table after page refresh.
The procedure in this section is repeated for the Transportation Account, with the following exceptions:
- The columns shared in this collaboration are item, s3_upstream_transportation, and unit.
- The Aggregation function is a SUM applied on the s3_upstream_transportation column.
- The item column has an Aggregation constraint minimum of two distinct values.
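The analysis rules configured above can be expressed as data. The sketch below builds the aggregation rule policy for both member accounts, plus a configured table request for the Transportation Account, in the shape used by the boto3 `cleanrooms` client (`create_configured_table` and `create_configured_table_analysis_rule`). Nothing is sent to AWS. The database name `transportation-db` is hypothetical, because the walkthrough only names the reporting and purchase databases.

```python
def build_aggregation_rule(sum_column, join_column="item", minimum_distinct=2):
    """Aggregation analysis rule: SUM on one column, join and constraint on another."""
    return {
        "v1": {
            "aggregation": {
                "aggregateColumns": [{"columnNames": [sum_column], "function": "SUM"}],
                "joinColumns": [join_column],
                "dimensionColumns": [],
                "scalarFunctions": [],
                # Each output row must aggregate at least `minimum_distinct`
                # distinct values of `join_column`.
                "outputConstraints": [
                    {
                        "columnName": join_column,
                        "minimum": minimum_distinct,
                        "type": "COUNT_DISTINCT",
                    }
                ],
            }
        }
    }


def build_configured_table_request(name, database, table, allowed_columns):
    """Keyword arguments for cleanrooms.create_configured_table (not sent here)."""
    return {
        "name": name,
        "tableReference": {"glue": {"databaseName": database, "tableName": table}},
        "allowedColumns": allowed_columns,
        "analysisMethod": "DIRECT_QUERY",
    }


purchase_rule = build_aggregation_rule("s3_upstream_purchased_good")
transportation_rule = build_aggregation_rule("s3_upstream_transportation")

# "transportation-db" is a hypothetical database name.
transportation_table = build_configured_table_request(
    name="transportation",
    database="transportation-db",
    table="transportation",
    allowed_columns=["item", "s3_upstream_transportation", "unit"],
)
print(transportation_table["allowedColumns"])

# With credentials for the member account, after creating the configured table:
#   client.create_configured_table_analysis_rule(
#       configuredTableIdentifier=<id returned by create_configured_table>,
#       analysisRuleType="AGGREGATION",
#       analysisRulePolicy=transportation_rule,
#   )
```

Note that the restricted columns (origin_region, transportation_mode) never appear in `allowedColumns`, which is what keeps them out of the collaboration.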
3. Configure table collaboration rules inside the Reporting Account
At this stage, member account tables are created and shared in the collaboration. The next step is to configure the Reporting Account tables in the Reporting Account’s AWS account.
- Navigate to AWS Clean Rooms. Select Configured tables, then Configure new table.
- Select the Glue database and table associated with the file reportingcompany.csv.
- Under Columns allowed in collaboration, select All columns, then Configure new table.
- Configure collaboration rules by clicking Configure analysis rule using the Guided workflow.
- Select Aggregation type, then Next.
- Select SUM as the Aggregate function and ingredient for the column (see Figure 7).
- In the Specify join columns section, select ingredient. This permits SQL join queries to execute on the ingredient column.
- In the Dimension controls, select product. This option permits grouping by product name in the SQL query. Select Next.
- Select None in the Scalar functions section. Click Next. Read more about scalar functions in the AWS Clean Rooms User Guide.
- On the next page, select ingredient for Column name and 2 for the Minimum number of distinct values. Click Next. To confirm query control submission, select Configure analysis rule on the next page.
- Validate the setting in the Review and Configure window, then select Next.
- Inside the Configured tables tab, select Associate to collaboration. Assign the table to the Scope 3 Clean Room Collaboration.
- Select the Scope 3 Clean Room Collaboration in the dropdown menu. Select Choose collaboration.
- On the Scope 3 Clean Room Collaboration page, select reporting, then Associate table.
4. Create and run the SQL query
Queries can now be run inside the Reporting Account (shown in Figure 8).
- Select an S3 destination to output the query results. Select Action, then Set results settings.
- Enter the S3 bucket name, then click Save changes.
- Paste this SQL snippet inside the query text editor (see Figure 8):
SELECT
  r.product AS "Product",
  SUM(p.s3_upstream_purchased_good) AS "Scope_3_Purchased_Goods_Emissions",
  SUM(t.s3_upstream_transportation) AS "Scope_3_Transportation_Emissions"
FROM reporting r
  INNER JOIN purchase p ON r.ingredient = p.item
  INNER JOIN transportation t ON p.item = t.item
GROUP BY r.product
- Click Run query. The query results should appear after a few minutes on the initial query, but will take less time for subsequent queries.
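The query can also be submitted outside the console. The sketch below builds the request for the boto3 `cleanrooms` client's `start_protected_query` call without sending it. The query string is the walkthrough's query written out in full, with the FROM and GROUP BY clauses inferred from the table aliases and the Dimension controls. The membership ID, bucket name, and key prefix are placeholders.

```python
QUERY = """
SELECT
  r.product AS "Product",
  SUM(p.s3_upstream_purchased_good) AS "Scope_3_Purchased_Goods_Emissions",
  SUM(t.s3_upstream_transportation) AS "Scope_3_Transportation_Emissions"
FROM reporting r
  INNER JOIN purchase p ON r.ingredient = p.item
  INNER JOIN transportation t ON p.item = t.item
GROUP BY r.product
"""


def build_protected_query_request(membership_id, bucket, key_prefix="scope3-results"):
    """Keyword arguments for cleanrooms.start_protected_query (not sent here)."""
    return {
        "type": "SQL",
        "membershipIdentifier": membership_id,
        "sqlParameters": {"queryString": QUERY.strip()},
        "resultConfiguration": {
            "outputConfiguration": {
                "s3": {"resultFormat": "CSV", "bucket": bucket, "keyPrefix": key_prefix}
            }
        },
    }


# Placeholder membership ID and bucket name, for illustration only.
query_request = build_protected_query_request(
    "example-membership-id", "example-query-results-bucket"
)
print(query_request["resultConfiguration"]["outputConfiguration"]["s3"]["bucket"])

# With credentials for the Reporting Account:
#   import boto3
#   boto3.client("cleanrooms").start_protected_query(**query_request)
```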
This example shows how AWS Clean Rooms can aggregate data across collaborators to produce total Scope 3 emissions for each product from purchased goods and transportation. The query ran across three organizations without revealing the underlying emission factors or proprietary product recipes to one another. This alleviates data confidentiality concerns and improves sustainability reporting transparency.
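To get a feel for what the aggregated result looks like, the query can be approximated locally with SQLite. This is only an illustration: the table names follow the walkthrough, but the products, ingredients, and emission values are invented, and the HAVING clause is a rough stand-in for the "minimum of two distinct values" constraints that AWS Clean Rooms enforces itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE reporting (product TEXT, ingredient TEXT);
    CREATE TABLE purchase (item TEXT, s3_upstream_purchased_good REAL);
    CREATE TABLE transportation (item TEXT, s3_upstream_transportation REAL);

    -- Invented sample data: granola has two ingredients, tea has one.
    INSERT INTO reporting VALUES
        ('granola', 'oats'), ('granola', 'honey'), ('tea', 'tea_leaves');
    INSERT INTO purchase VALUES
        ('oats', 1.5), ('honey', 2.0), ('tea_leaves', 0.5);
    INSERT INTO transportation VALUES
        ('oats', 0.25), ('honey', 0.25), ('tea_leaves', 0.125);
    """
)

rows = conn.execute(
    """
    SELECT
      r.product AS "Product",
      SUM(p.s3_upstream_purchased_good) AS "Scope_3_Purchased_Goods_Emissions",
      SUM(t.s3_upstream_transportation) AS "Scope_3_Transportation_Emissions"
    FROM reporting r
      INNER JOIN purchase p ON r.ingredient = p.item
      INNER JOIN transportation t ON p.item = t.item
    GROUP BY r.product
    -- Local approximation of the aggregation constraints on each table.
    HAVING COUNT(DISTINCT r.ingredient) >= 2
       AND COUNT(DISTINCT p.item) >= 2
       AND COUNT(DISTINCT t.item) >= 2
    ORDER BY r.product
    """
).fetchall()

print(rows)  # [('granola', 3.5, 0.5)]
```

In this sample, the single-ingredient product is dropped from the output because its total would expose one supplier's emission factor directly, which is the kind of disclosure the aggregation constraint is meant to prevent.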
The following steps are taken to clean up all resources created in this walkthrough:
- Member and Collaboration Accounts:
  - AWS Clean Rooms: Disassociate and delete collaboration tables
  - AWS Clean Rooms: Remove member account in the collaboration
  - AWS Glue: Delete the crawler, database, and tables
  - AWS IAM: Delete the AWS Clean Rooms service policy
  - Amazon S3: Delete the CSV file storage buckets
- Collaboration Account only:
  - Amazon S3: Delete the SQL query bucket
  - AWS Clean Rooms: Delete the Scope 3 Clean Room Collaboration
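The AWS Clean Rooms part of this cleanup could be scripted along the following lines. This is a sketch, not a verified procedure: the deletion order (table associations, then configured tables, then the membership, then the collaboration) is an assumption, and all identifiers are placeholders. A small recording stand-in replaces the real boto3 `cleanrooms` client so the example runs without AWS access.

```python
def delete_clean_rooms_resources(client, membership_id, association_ids,
                                 configured_table_ids, collaboration_id=None):
    """Delete one account's Clean Rooms resources in an assumed safe order."""
    for association_id in association_ids:
        client.delete_configured_table_association(
            configuredTableAssociationIdentifier=association_id,
            membershipIdentifier=membership_id,
        )
    for table_id in configured_table_ids:
        client.delete_configured_table(configuredTableIdentifier=table_id)
    client.delete_membership(membershipIdentifier=membership_id)
    # Only the account that created the collaboration passes collaboration_id.
    if collaboration_id is not None:
        client.delete_collaboration(collaborationIdentifier=collaboration_id)


class RecordingClient:
    """Stand-in for boto3.client("cleanrooms") that only records method names."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(**kwargs):
            self.calls.append(name)
        return record


# Placeholder identifiers, for illustration only.
fake_client = RecordingClient()
delete_clean_rooms_resources(
    fake_client,
    membership_id="example-membership-id",
    association_ids=["example-association-id"],
    configured_table_ids=["example-configured-table-id"],
    collaboration_id="example-collaboration-id",
)
print(fake_client.calls)
```

The Glue, IAM, and S3 resources listed above are not covered by this sketch and would still need to be removed separately.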
Further reading
- Greenhouse Gas Protocol Scope 3 Standard
- AWS Clean Rooms
- AWS Clean Rooms User Guide
- Getting started with the AWS Glue Data Catalog
- Analyzing Data in S3 using Amazon Athena