Overview

Data Connectors for AWS Clean Rooms helps customers bring data from marketing platforms into a guided workflow and prepare it for use in a clean room. After selecting the solution from the AWS Solutions Library, you can deploy Data Connectors for AWS Clean Rooms to launch a user interface that includes AWS Glue DataBrew. The solution helps you select application sources for advertising and marketing data, normalize and map the data, and generate the metadata catalogs required for a collaboration in AWS Clean Rooms.
Benefits

Follows the AWS Well-Architected Framework for extracting, transforming, and loading customer data into an AWS Glue Data Catalog for data collaboration.
Follows security best practices for protecting sensitive data at rest in Amazon S3 and metadata in the AWS Glue Data Catalog, and for identifying and helping you safely remove sensitive information from a dataset.
Includes prebuilt connectors with built-in data transformation to accelerate first-party usage for collaboration.
Designed for either on-demand or periodic batch operation.
Designed as a managed integration service to offload the responsibility of monitoring changes to application endpoints.
Technical details

There are two variations of the Data Connectors for AWS Clean Rooms solution:
1. Amazon AppFlow for Salesforce Marketing Cloud
2. Amazon S3 Push
The two solution variations differ in how data arrives in the Amazon S3 inbound data bucket. The Salesforce Marketing Cloud variation pulls data through Amazon AppFlow; the S3 Push variation expects an external source to deliver data into the inbound bucket.
AppFlow for Salesforce Marketing Cloud
Step 1
Amazon EventBridge schedules a job that starts the Step Functions state machine, which launches the Amazon AppFlow flow for Salesforce Marketing Cloud.
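A minimal sketch of how such a schedule might be wired up with boto3; the rule name, state machine ARN, and role ARN below are hypothetical stand-ins for resources the stack actually creates:

```python
import boto3

# Hypothetical names; the solution's CloudFormation stack defines its own.
RULE_NAME = "data-connectors-daily-run"
STATE_MACHINE_ARN = "arn:aws:states:us-east-1:111122223333:stateMachine:AppFlowFlow"
EVENTS_ROLE_ARN = "arn:aws:iam::111122223333:role/EventBridgeInvokeStepFunctions"

events = boto3.client("events")

# Create (or update) a scheduled rule, for example once per day.
events.put_rule(Name=RULE_NAME, ScheduleExpression="rate(1 day)", State="ENABLED")

# Point the rule at the state machine that launches the AppFlow flow.
events.put_targets(
    Rule=RULE_NAME,
    Targets=[{"Id": "stepfunctions", "Arn": STATE_MACHINE_ARN, "RoleArn": EVENTS_ROLE_ARN}],
)
```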
Step 2
The state machine invokes an AWS Lambda function to refresh the token used by the Amazon AppFlow connection to Salesforce Marketing Cloud.
Step 3
The Lambda function uses the OpenID Connect (OIDC) credentials stored in AWS Secrets Manager to retrieve a new token from Salesforce Marketing Cloud and updates the Amazon AppFlow connection with that token.
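The refresh can be sketched roughly as follows. The secret name, secret fields, and connector profile name are hypothetical, and the exact connectorProfileConfig shape depends on how the Salesforce Marketing Cloud connector profile was created (shown here for a custom connector):

```python
import json
import urllib.parse
import urllib.request

import boto3

# Hypothetical resource names; the deployed stack defines its own.
SECRET_ID = "sfmc/oidc-credentials"
PROFILE_NAME = "sfmc-connector-profile"

secrets = boto3.client("secretsmanager")
appflow = boto3.client("appflow")


def handler(event, context):
    # 1. Read the OIDC client credentials stored in Secrets Manager.
    secret = json.loads(secrets.get_secret_value(SecretId=SECRET_ID)["SecretString"])

    # 2. Exchange them for a fresh access token at the Salesforce
    #    Marketing Cloud token endpoint (client_credentials grant).
    token_url = f"https://{secret['subdomain']}.auth.marketingcloudapis.com/v2/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": secret["client_id"],
        "client_secret": secret["client_secret"],
    }).encode()
    req = urllib.request.Request(
        token_url, data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urllib.request.urlopen(req) as resp:
        access_token = json.load(resp)["access_token"]

    # 3. Write the new token into the AppFlow connector profile so the
    #    next flow run authenticates successfully.
    appflow.update_connector_profile(
        connectorProfileName=PROFILE_NAME,
        connectionMode="Public",
        connectorProfileConfig={
            "connectorProfileProperties": {"CustomConnector": {"profileProperties": {}}},
            "connectorProfileCredentials": {"CustomConnector": {
                "authenticationType": "OAUTH2",
                "oauth2": {"accessToken": access_token},
            }},
        },
    )
```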
Step 4
The AppFlow flow for Salesforce starts by opening the connection to the external data provider and requesting data.
Step 5
The external provider responds with data to AppFlow.
Step 6
AppFlow puts the data into the inbound bucket under the specified prefix.
Step 7
AppFlow uses the AWS managed KMS key to encrypt objects written to Amazon S3.
Step 8
The S3 Push Countdown Trigger Step Functions workflow receives notifications from Amazon S3 as the solution stores objects in the bucket. When no new objects arrive within a given time window, the data transformation Step Functions workflow starts and continues with the common flow.
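One way to implement such a countdown check is a Lambda function that a Step Functions Wait/Choice loop polls until the inbound prefix has been quiet for the configured window; the bucket, prefix, and window below are hypothetical:

```python
from datetime import datetime, timedelta, timezone

import boto3

# Hypothetical configuration; the stack wires in its own bucket and window.
INBOUND_BUCKET = "inbound-data-bucket"
INBOUND_PREFIX = "salesforce/"
QUIET_PERIOD = timedelta(minutes=5)

s3 = boto3.client("s3")


def handler(event, context):
    """Report whether the inbound prefix has been quiet long enough.

    Intended for a Step Functions Wait/Choice loop: the state machine
    re-invokes this check until no new objects have arrived for
    QUIET_PERIOD, then proceeds to the data transformation workflow.
    """
    newest = None
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=INBOUND_BUCKET, Prefix=INBOUND_PREFIX):
        for obj in page.get("Contents", []):
            if newest is None or obj["LastModified"] > newest:
                newest = obj["LastModified"]

    quiet = newest is not None and (
        datetime.now(timezone.utc) - newest >= QUIET_PERIOD
    )
    return {"ready": quiet}
```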
Step 9
The data transformation Step Functions workflow starts by invoking a Lambda function.
Step 10
A Lambda function stores the task token for the Step Functions execution in the Amazon DynamoDB table and invokes the DataBrew job.
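A minimal sketch of this callback pattern, assuming a hypothetical table and job name and that the state machine passes the task token in the invocation payload:

```python
import boto3

# Hypothetical names; the stack supplies its own table and job.
TABLE_NAME = "step-function-task-tokens"
DATABREW_JOB_NAME = "transform-inbound-data"

dynamodb = boto3.client("dynamodb")
databrew = boto3.client("databrew")


def handler(event, context):
    # Step Functions invokes this task with waitForTaskToken, passing the
    # token in the payload (the exact field name is workflow-specific).
    task_token = event["taskToken"]

    # Start the DataBrew job, then persist the token keyed by the run ID
    # so the completion handler can resume this execution later.
    run_id = databrew.start_job_run(Name=DATABREW_JOB_NAME)["RunId"]
    dynamodb.put_item(
        TableName=TABLE_NAME,
        Item={"RunId": {"S": run_id}, "TaskToken": {"S": task_token}},
    )
    return {"runId": run_id}
```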
Step 11
The DataBrew job runs, reads data from the inbound bucket, and transforms it according to the project definitions specified by the user.
Step 12
DataBrew writes the transformed data to the transformed data bucket and writes metadata to the AWS Glue Data Catalog. The AWS KMS customer managed key (CMK) created by this stack encrypts the bucket contents and the AWS Glue metadata.
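For illustration, a DataBrew recipe job with both an S3 output and a Glue Data Catalog output, encrypted with a customer managed key, might be defined like this (all names and ARNs are hypothetical):

```python
import boto3

databrew = boto3.client("databrew")

# Hypothetical names and ARNs for illustration only.
databrew.create_recipe_job(
    Name="transform-inbound-data",
    ProjectName="inbound-data-project",  # the project carries the recipe and dataset
    RoleArn="arn:aws:iam::111122223333:role/DataBrewJobRole",
    # Encrypt job output with the stack's customer managed KMS key.
    EncryptionMode="SSE-KMS",
    EncryptionKeyArn="arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
    # Write transformed files to the transformed data bucket ...
    Outputs=[{
        "Location": {"Bucket": "transformed-data-bucket", "Key": "output/"},
        "Format": "PARQUET",
    }],
    # ... and register the table metadata in the Glue Data Catalog.
    DataCatalogOutputs=[{
        "DatabaseName": "clean_rooms_db",
        "TableName": "transformed_data",
        "Overwrite": True,
    }],
)
```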
Step 13
When the DataBrew job completes, the associated Lambda function receives an event, looks up the task token in DynamoDB, and resumes the Step Functions execution.
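A sketch of the completion handler, assuming it is subscribed to DataBrew job state change events and that the table layout matches the earlier sketch:

```python
import json

import boto3

TABLE_NAME = "step-function-task-tokens"  # hypothetical, as above

dynamodb = boto3.client("dynamodb")
sfn = boto3.client("stepfunctions")


def handler(event, context):
    # EventBridge delivers "DataBrew Job State Change" events; the run ID
    # and final state live in the event detail.
    detail = event["detail"]
    run_id = detail["jobRunId"]
    state = detail["state"]

    # Look up the task token stored when the job was started ...
    item = dynamodb.get_item(
        TableName=TABLE_NAME, Key={"RunId": {"S": run_id}}
    )["Item"]
    task_token = item["TaskToken"]["S"]

    # ... and resume (or fail) the waiting Step Functions execution.
    if state == "SUCCEEDED":
        sfn.send_task_success(taskToken=task_token, output=json.dumps(detail))
    else:
        sfn.send_task_failure(taskToken=task_token, error="DataBrewJobFailed")
```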
Step 14
When the Step Functions workflow completes, it sends an Amazon Simple Notification Service (Amazon SNS) message to the email subscription provided during stack creation. This is a standard-formatted Amazon SNS message.
Step 15
Access logs for both the inbound and transformed data buckets are placed into a logging bucket for later review if needed. This bucket has no lifecycle policy, and all log data placed here remains until the bucket is removed.
Amazon S3 Push
Step 1
Google Cloud Platform (BigQuery), Adobe Experience Platform, or another external service is configured with AWS access keys that have permissions to put data into the inbound data bucket and, optionally, to use AWS KMS keys for object encryption. A request to send the data on demand or at an interval is configured at the external service.
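From the external service's point of view, the push amounts to an authenticated S3 PutObject. A minimal sketch using AWS's documented example credentials and a hypothetical bucket, key, and KMS key ARN:

```python
import boto3

# Hypothetical values; the external service is configured with an IAM
# access key pair scoped to s3:PutObject on the inbound bucket.
session = boto3.Session(
    aws_access_key_id="AKIAIOSFODNN7EXAMPLE",
    aws_secret_access_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
)
s3 = session.client("s3")

# Push an export into the inbound bucket under the agreed prefix,
# optionally encrypting objects with the customer managed KMS key.
with open("export.csv", "rb") as data:
    s3.put_object(
        Bucket="inbound-data-bucket",
        Key="bigquery/export-2024-01-01.csv",
        Body=data,
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId="arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab",
    )
```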
Step 2
Data from the external provider is pushed into the inbound bucket under the specified prefix.
Step 3
The S3 Push Countdown Trigger Step Functions workflow receives notifications from Amazon S3 as the solution stores objects in the bucket. When no new objects arrive within a given time window, the data transformation Step Functions workflow starts and continues with the common flow.
Step 4
The data transformation Step Functions workflow starts by invoking a Lambda function.
Step 5
The Lambda function stores the task token for the Step Functions execution in the DynamoDB table and invokes the DataBrew job.
Step 6
The DataBrew job runs, reads data from the inbound bucket, and transforms it according to the project definitions specified by the user.
Step 7
DataBrew writes the transformed data to the transformed data bucket and writes metadata to the AWS Glue Data Catalog. The AWS KMS CMK created by this stack encrypts the bucket contents and the AWS Glue metadata.
Step 8
When the DataBrew job completes, the associated Lambda function receives an event, looks up the task token in DynamoDB, and resumes the Step Functions execution.
Step 9
When the Step Functions workflow completes, it sends an Amazon SNS message to the email subscription provided during stack creation. This is a standard-formatted Amazon SNS message.
Step 10
Access logs for both the inbound and transformed data buckets are placed into a logging bucket for later review if needed. This bucket has no lifecycle policy, and all log data placed here remains until the bucket is removed.
Related content

This Guidance demonstrates how to provision data for collaboration using AWS Clean Rooms.
This Guidance demonstrates how to import data from Adobe Experience Platform (AEP) into AWS Clean Rooms.
This Guidance demonstrates how to ingest Google Analytics 4 data into AWS for marketing analytics.
Disclaimer
Salesforce, the Salesforce logo, and any other Salesforce trademark are trademarks of Salesforce.com, Inc., and are used here with permission.
Adobe, the Adobe logo, Acrobat, the Adobe PDF logo, Adobe Premiere, Creative Cloud, InDesign, and Photoshop are either registered trademarks or trademarks of Adobe in the United States.
Google Cloud Platform is a trademark owned by Google LLC.