AWS for Industries

Privacy-Enhanced Cross-Media Measurement: How Fifty5Blue (formerly Kantar Media) Leveraged AWS Clean Rooms to Establish Audit Transparency During Panel Data Exchange

As digital advertising continues to evolve, cross-media measurement has become an essential tool for advertisers seeking to understand the reach and impact of their campaigns across platforms. To enable it, Fifty5Blue collaborated with the industry as part of the Origin cross-media measurement initiative to create a Virtual People Model. This model is based on Fifty5Blue’s representative panel of media consumers, which is used for model training and for ensuring comprehensive data insights across different media outlets. Until recently, Fifty5Blue traded under the Kantar Media brand.

In one of the key steps of this process, Fifty5Blue had to enrich their panel with data from participating media publishers who collect advertising impression logs on their platforms. Two critical business requirements were established:

  1. Fifty5Blue needed to keep user membership in the media panel private from media publishers but at the same time receive impression logs from media publishers for the panelists.
  2. Fifty5Blue requested impression logs only for their private media panelists who had explicitly consented to Fifty5Blue for data sharing.

To address the business requirements, Fifty5Blue turned to AWS Clean Rooms to enable private set intersection (PSI), allowing Fifty5Blue to collaborate with publishers on their collective datasets—all without sharing or copying one another’s underlying data. Additionally, through careful configuration of Amazon Simple Storage Service (Amazon S3) bucket policies and AWS Key Management Service (AWS KMS), the implementation creates an audit trail accessible to third-party auditors while maintaining data privacy. Fifty5Blue collaborated with Meta in a pilot to test this usage of AWS Clean Rooms.

Before diving into the solution, let’s revisit the key challenges that Fifty5Blue hoped AWS Clean Rooms could address. They fall into three main categories.

Data Protection Challenges:

  • Fifty5Blue can request and receive impression data from publishers only for private media panel users who have given Fifty5Blue explicit consent to participate.
  • Fifty5Blue cannot reveal their private media panelists’ identities.
  • Users must be matched securely across datasets.

Audit Requirements:

  • An audit trail should be maintained to provide verifiable proof that Fifty5Blue requested data only for users who have provided explicit consent to be part of Fifty5Blue’s private media panel.
  • External auditors must be able to perform verification without Fifty5Blue’s private media panel membership being revealed to anyone except the auditors.

Cost and Scalability Concerns:

  • Previously used PSI solutions, which were based on homomorphic encryption, were expensive to operate and difficult to troubleshoot. Setup for one panel data exchange could take weeks, and processing one day’s worth of exchange took at least 10-12 hours.
  • Fifty5Blue needed to easily and cost-effectively scale up data analysis across several markets.

Overview of the solution

Fifty5Blue, in a pilot with Meta, deployed a solution leveraging AWS Clean Rooms as a way to address the above challenges:

  • PSI through AWS Clean Rooms
    • AWS Clean Rooms facilitates secure matching of users across Fifty5Blue and publisher datasets.
    • Only data for common, consented users is made available for analysis within an AWS Clean Rooms collaboration.
    • Neither party gains visibility into the other’s source dataset.
  • Audit-Ready Infrastructure through Amazon S3, AWS KMS, and AWS CloudTrail
    • A publisher-owned Amazon S3 bucket stores Fifty5Blue’s encrypted data.
    • Fifty5Blue-owned AWS KMS keys encrypt the data stored in the publisher-owned S3 bucket, which prevents the publisher from reading it.
    • Amazon S3 bucket policies enforce encryption and access controls.
    • AWS CloudTrail logging and Amazon S3 versioning create a verifiable audit trail.
    • Third-party auditors can access encrypted data and logs to verify compliance.

The core technical aspects of this implementation:

  1. Fifty5Blue’s data stored in the publisher-owned Amazon S3 bucket is safeguarded by AWS KMS keys that Fifty5Blue owns and controls within their account. This ensures that collaborators cannot decrypt Fifty5Blue’s raw data, regardless of whose account the data resides in, while the data owner can still control and log all access. The result is strict access control and data privacy.
  2. The solution enables auditing for all participating parties while preserving data confidentiality. The publisher-owned Amazon S3 bucket has versioning enabled and serves both as the input source for Fifty5Blue’s data and as the audit trail. This setup allows neutral third-party auditors to review Fifty5Blue’s data in the publisher’s bucket without revealing Fifty5Blue’s private media panel membership to the publisher. This is possible because auditors are granted access to both Fifty5Blue’s AWS KMS keys and the publisher’s Amazon S3 bucket. The publisher can verify bucket integrity through logs without accessing the encrypted content.
  3. Through AWS Clean Rooms, analysis can be run and charged within Fifty5Blue’s account even though the data resides in the publisher’s account. AWS Clean Rooms analysis rules allow both Fifty5Blue and the publisher to precisely define permitted analysis types, ensuring data governance and maintaining privacy. This arrangement not only enables efficient cost allocation based on usage but also allows Fifty5Blue to leverage its existing AWS infrastructure and expertise in managing secure data collaborations.
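
The key-ownership arrangement in points 1 and 2 can be illustrated with a KMS key policy. The following is a minimal sketch, not the policy used in the pilot: the function name `build_key_policy`, the account IDs, and the role ARN are hypothetical placeholders, and the exact actions a real deployment needs may differ. It shows the general idea that the key owner grants decrypt permission to the role used for analysis and, only when an audit is required, to the auditor's account, while the publisher's account is never listed.

```python
import json


def build_key_policy(owner_account_id, clean_rooms_role_arn, auditor_account_id=None):
    """Return a KMS key policy document as a Python dict (illustrative only)."""
    decrypt_actions = ["kms:Decrypt", "kms:DescribeKey"]
    statements = [
        {
            # Lets the key-owning account administer the key through IAM.
            "Sid": "KeyOwnerAdministration",
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{owner_account_id}:root"},
            "Action": "kms:*",
            "Resource": "*",
        },
        {
            # Lets the role assumed for query execution decrypt the data.
            "Sid": "AllowAnalysisRoleDecrypt",
            "Effect": "Allow",
            "Principal": {"AWS": clean_rooms_role_arn},
            "Action": decrypt_actions,
            "Resource": "*",
        },
    ]
    if auditor_account_id is not None:
        # Added only for the duration of an audit; placeholder account ID.
        statements.append(
            {
                "Sid": "AllowAuditorDecrypt",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{auditor_account_id}:root"},
                "Action": decrypt_actions,
                "Resource": "*",
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    policy = build_key_policy(
        owner_account_id="111111111111",  # hypothetical panel-owner account
        clean_rooms_role_arn="arn:aws:iam::111111111111:role/example-clean-rooms-role",
        auditor_account_id="333333333333",  # hypothetical auditor account
    )
    print(json.dumps(policy, indent=2))
```

Because the publisher's account does not appear as a principal, holding the encrypted objects in its own bucket does not give it the ability to read them.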

The solution ensures data security throughout the process:

  • A Hash-based Message Authentication Code (HMAC) is applied to personally identifiable information (PII) fields before uploading to Amazon S3.
  • Advanced Encryption Standard (AES) encryption is applied to impression data.
  • PSI is used for joint analysis.
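
As an illustration of the first item, the sketch below applies HMAC-SHA256 to an identifier using Python's standard library. The function name `pseudonymize`, the normalization step, and the example key are assumptions made for this example; the source does not specify the hash function, key handling, or normalization rules used in the pilot. AES encryption of impression data is not shown because it requires a third-party library.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a hex HMAC-SHA256 digest of a normalized identifier."""
    # Normalizing first (an assumption) makes equal identifiers hash equally.
    normalized = identifier.strip().lower()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


if __name__ == "__main__":
    # Placeholder key; in practice the key would be generated and kept
    # outside the cloud environment and never uploaded with the data.
    key = b"example-key-kept-outside-aws"
    print(pseudonymize("Panelist@Example.com ", key))
```

Unlike a plain hash, a keyed HMAC cannot be reproduced by someone who knows the identifier but not the key, which is why the key has to be protected as carefully as the raw data.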

Using this AWS Clean Rooms-based solution for a pilot with Meta confirmed to Fifty5Blue that they can complete integration and securely perform data analyses significantly faster than with PSI based on homomorphic encryption¹. Fifty5Blue is advocating that all existing and new publishers adopt this AWS Clean Rooms-based implementation to ensure data protection, as well as easy setup and ongoing management.

Solution Walkthrough and Architecture

AWS Clean Rooms leverages privacy-enhancing computation techniques to perform PSI. This allows Fifty5Blue and the publishers to identify common users across their datasets without exposing raw data. The process works as follows:

  • Both parties associate their hashed user identifiers with the AWS Clean Rooms collaboration. User identifiers and data are hashed outside of the AWS environment, so the hashing salt is never stored inside AWS.
  • AWS Clean Rooms performs the intersection computation by applying the analysis rules that both parties agreed on in advance.
  • Only matched records are made available for subsequent analysis.
  • Neither party can access the other party’s source data, see the full list of users, or determine which specific users were not matched.

These PSI capabilities ensure that only data from users who have separately consented to both the publisher and Fifty5Blue’s data sharing is included in the analysis.
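
To make the outcome of the matching step concrete, here is a toy, in-memory sketch of the result it produces: only impression rows whose hashed identifier also appears in the panel are returned. This is not how AWS Clean Rooms is implemented; in the real service neither party sees the other's table, whereas this example holds both in one process. The names `hashed`, `matched_impressions`, and `SHARED_KEY`, and the sample identifiers, are hypothetical.

```python
import hashlib
import hmac


def hashed(identifier: str, key: bytes) -> str:
    """Hash an identifier the same way on both sides (assumed HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def matched_impressions(panel_hashes, impression_rows):
    """Keep only impression rows whose hashed ID is in the panel."""
    panel = set(panel_hashes)
    return [row for row in impression_rows if row["hashed_id"] in panel]


# Placeholder key that both parties would have to agree on out of band.
SHARED_KEY = b"example-key-agreed-out-of-band"

# Panel side: hashed identifiers of consented panelists.
panel_hashes = [hashed(user, SHARED_KEY) for user in ("alice", "bob")]

# Publisher side: impression log keyed by hashed identifier.
impression_rows = [
    {"hashed_id": hashed("bob", SHARED_KEY), "campaign": "c1"},
    {"hashed_id": hashed("carol", SHARED_KEY), "campaign": "c1"},
    {"hashed_id": hashed("bob", SHARED_KEY), "campaign": "c2"},
]

result = matched_impressions(panel_hashes, impression_rows)
print([row["campaign"] for row in result])
```

The unmatched publisher row never appears in the output, and the panelist with no impressions contributes nothing, which mirrors the behavior described in the list above.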

Before setting up the below environment, the following prerequisites are recommended:

Creating the audit environment

Fifty5Blue panel exchange architecture

  1. The publisher creates an Amazon S3 bucket that is shared with Fifty5Blue. The objects in this bucket are encrypted using an AWS KMS key that Fifty5Blue creates and owns. Additionally, the publisher defines an S3 bucket policy that mandates the use of the Fifty5Blue-created KMS key, which only Fifty5Blue has access to, for all objects stored in the bucket. In this scenario, Fifty5Blue writes encrypted hashed panelist data into the shared bucket. This bucket policy, written in JSON, is applied to the publisher-owned S3 bucket that is shared with Fifty5Blue.
  2. Fifty5Blue creates the AWS Clean Rooms collaboration and invites the publisher to join. The publisher associates encrypted event data along with the encrypted panelist data with the collaboration. Next, the publisher defines an analysis rule that determines the type of analysis that Fifty5Blue can run, such as conversion analysis at an aggregated level. Multiple analysis types can be selected, including custom SQL queries.
  3. AWS Clean Rooms needs to decrypt the data to run the pre-agreed queries on the joint datasets. To do so, AWS Clean Rooms assumes an IAM role that Fifty5Blue has created, and this role carries the KMS permissions needed to decrypt the data.
  4. The AWS Clean Rooms query results are returned to the S3 bucket that Fifty5Blue owns so Fifty5Blue can perform the post-processing and analysis on the matched data.
  5. The encrypted matched keys for the panelists are stored in the S3 bucket for 1 year. This bucket has versioning enabled to track all objects that are written or deleted. In addition, AWS CloudTrail is configured to log all data events for this bucket so that any modifications can be identified. When external auditors need access to verify the data, Fifty5Blue grants KMS access to the auditors’ AWS account, and the publisher grants read access to the S3 bucket and shares CloudTrail data event logs as necessary.
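
A bucket policy of the kind described in step 1 might look like the following sketch. It is a hypothetical example, not the policy used by Fifty5Blue or any publisher: the function name `build_bucket_policy`, the bucket name, the key ARN, and the writer ARN are placeholders. It uses the S3 condition keys `s3:x-amz-server-side-encryption` and `s3:x-amz-server-side-encryption-aws-kms-key-id` to reject uploads that are not encrypted with the designated KMS key.

```python
import json


def build_bucket_policy(bucket_name, kms_key_arn, writer_principal_arn):
    """Return an S3 bucket policy that requires SSE-KMS with one specific key."""
    bucket_objects = f"arn:aws:s3:::{bucket_name}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allows the panel owner's principal to write objects.
                "Sid": "AllowPanelOwnerWrites",
                "Effect": "Allow",
                "Principal": {"AWS": writer_principal_arn},
                "Action": "s3:PutObject",
                "Resource": bucket_objects,
            },
            {
                # Rejects uploads that do not request SSE-KMS encryption.
                "Sid": "DenyNonKmsEncryption",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": bucket_objects,
                "Condition": {
                    "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
                },
            },
            {
                # Rejects uploads encrypted with any key other than the agreed one.
                "Sid": "DenyWrongKmsKey",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": bucket_objects,
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption-aws-kms-key-id": kms_key_arn
                    }
                },
            },
        ],
    }


if __name__ == "__main__":
    policy = build_bucket_policy(
        bucket_name="example-publisher-panel-exchange",  # placeholder
        kms_key_arn="arn:aws:kms:eu-west-2:111111111111:key/EXAMPLE-KEY-ID",  # placeholder
        writer_principal_arn="arn:aws:iam::111111111111:role/example-panel-writer",  # placeholder
    )
    print(json.dumps(policy, indent=2))
```

Bucket versioning and CloudTrail data event logging, described in step 5, are configured separately and are not covered by this sketch.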

Summary

AWS Clean Rooms enabled Fifty5Blue and media publishers to protect user data and privacy, easily set up and automate data pipelines, use fine-grained privacy control to govern how data is analyzed and output, incorporate match logic, and enable accountability via audits at analysis run time.

This allows Fifty5Blue and media publishers to conduct cross-media measurement at scale, analyzing terabytes of daily event data while ensuring that each party can access only its own data. Using AWS Clean Rooms has created a scalable mechanism for Fifty5Blue to feed its post-exchange data refinement pipelines for individualization, sample selection, weighting, and data modelling. This approach addresses privacy preservation, audit, cost, and scalability, and it provides a path to secure, efficient, and privacy-preserving data collaboration in the digital advertising ecosystem.

If you are interested in adopting AWS Clean Rooms technology, contact our team of experts here or get started using the service today.

¹ Data from Meta, 2024. WFA Cross Media Measurements pilots (UK market). Analysis of engineering time required for integration and time needed to fully analyze one day’s worth of data using the homomorphic encryption-based solution and the AWS Clean Rooms-based solution.

Ryan Malecky

Ryan Malecky is a Senior Solutions Architect at Amazon Web Services. He is focused on helping customers build and gain insights from their data, especially with AWS Clean Rooms.

James Kane

James Kane is a Senior Solutions Architect at AWS, based in London. He has been working in the cloud industry for 12 years, focusing on infrastructure, networking, security and governance enabling data and analytics. James is your customer advisor for AWS services and solutions, providing thought leadership on AWS best practices and strategic solutions.

Shaila Mathias

Shaila Mathias is a Principal Business Development Manager at Amazon Web Services. She works with AWS Customers and Partners to build and scale privacy-enhanced offerings across industries that unlock the value of data using SQL, Artificial Intelligence (AI) and/or Machine Learning (ML) while maintaining standards of data privacy and security.