AWS Big Data Blog

Access Amazon Redshift data from Salesforce Data Cloud with Zero Copy Data Federation

This post is co-authored by Vijay Gopalakrishnan, Director of Product, Salesforce Data Cloud.

In today’s data-driven business landscape, organizations collect a wealth of data across various touchpoints and unify it in a central data warehouse or data lake to deliver business insights. This data is primarily used for analytics and machine learning, but it isn’t easily accessible to business users across Sales, Service, and Marketing teams who need it to make data-driven decisions. Salesforce and Amazon collaborated to address this challenge by making the data accessible to users in the flow of their work, with Zero Copy Data Federation between Salesforce Data Cloud and Amazon Redshift. With this solution, businesses can access Redshift data within Salesforce Data Cloud to break down data silos, gain deeper insights, and create unified customer profiles that deliver highly personalized experiences across touchpoints. By eliminating the need for data replication, this integration improves efficiency and reduces costs while enabling real-time access to valuable business data.

In this post, we explore the benefits of the new Zero Copy Data Federation and provide step-by-step guidance to configure it in Salesforce Data Cloud.

What is Salesforce Data Cloud?

Salesforce Data Cloud is a data platform that unifies all of your company’s data into Salesforce’s Einstein 1 Platform, giving every team a 360-degree view of the customer to drive automation, create analytics, personalize engagement, and power trusted artificial intelligence (AI). Data Cloud creates a holistic customer view by turning volumes of disconnected data into a unified customer profile that’s straightforward to access and understand. This includes diverse datasets like telemetry data, web engagement data, and more across your organization or your external data lakes and warehouses. This unified view helps your Sales, Service, and Marketing teams build personalized customer experiences, invoke data-driven actions and workflows, and safely drive AI across all your Salesforce apps.

What is Amazon Redshift?

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze all your data using your existing business intelligence (BI) tools. It’s optimized for datasets ranging from a few hundred gigabytes to a petabyte or more and delivers better price-performance compared to most traditional data warehousing solutions. With a fully managed, AI-powered, massively parallel processing (MPP) architecture, Amazon Redshift makes business decision-making quick and cost-effective.

What is Zero Copy Data Federation?

Zero Copy Data Federation, a Salesforce Data Cloud capability, unifies Salesforce and Amazon Redshift data through a point-and-click interface. It provides secure, real-time access to Redshift data without copying, keeping enterprise data in place. This eliminates replication overhead and ensures access to current information, enhancing data integration while maintaining data integrity and efficiency.

Data federated from Amazon Redshift is represented as a native Data Cloud object, which powers various Data Cloud features, including marketing segmentation, activations, and process automation. With these capabilities at your fingertips, you can enrich unified customer profiles in Salesforce Data Cloud with transaction data from Amazon Redshift to create a rich customer 360, gain insights, harness predictive and generative AI on the unified data, and ultimately deliver highly personalized experiences across multiple touchpoints.

The following diagram depicts the Zero Copy Data Federation flow, the key features it enables, and a few potential actions and activations.

solution architecture

The connection to Amazon Redshift is established by deploying a data stream in Salesforce Data Cloud. When you deploy a data stream from Amazon Redshift to Data Cloud, an external data lake object (DLO) is created within the Data Cloud environment. This external DLO acts as a storage container that houses metadata for your federated Redshift data. Importantly, the DLO serves as a reference that points to the data physically stored in your Redshift data warehouse, so your data stays in its original location. Similar to native DLOs, Amazon Redshift-backed external DLOs can power several key features, including batch transform, calculated insights, identity resolution, query, segmentation, and activation, among others. Unified customer profiles enriched with Redshift data can be acted on by Amazon SageMaker to drive predictive outcomes and activated across several platforms, including Amazon Ads and Salesforce Marketing Cloud, to create audience journeys and run targeted campaigns.

To increase performance, you can opt for acceleration, which is designed to enhance query runtimes. For more information on this feature, refer to Acceleration in Data Federation.

To summarize, Zero Copy Data Federation provides the following benefits:

  • Unified data view: Integrates external data seamlessly with Salesforce data for a comprehensive customer view.
  • Real-time access: Provides near real-time access to data stored in external sources like Amazon Redshift.
  • Data efficiency: Eliminates the need to copy or move large datasets, reducing storage costs and data duplication.
  • Cost-effective: Reduces data transfer pipeline and storage costs associated with traditional data integration methods.
  • Enhanced security: Data remains in its original secure environment, reducing exposure risks.
  • Streamlined compliance: Simplifies data governance by maintaining data in its original, regulated environment.

Prerequisites

Before configuring data federation, you must have access to Salesforce Data Cloud and the information needed to connect to your Redshift provisioned or serverless warehouse. The Redshift warehouse must be publicly accessible, and we recommend restricting access by allowlisting only the Data Cloud IP addresses.

For information on setting up an Amazon Redshift Serverless or Amazon Redshift provisioned cluster, refer to Amazon Redshift Serverless or Amazon Redshift provisioned clusters, respectively.
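The allowlisting recommendation above can be sketched as a small helper that builds a single inbound security group rule. This is a minimal illustration under stated assumptions, not a definitive setup: the function name `build_ingress_rule` is hypothetical, the CIDR values are placeholders from the documentation address ranges (use the Data Cloud IP addresses published by Salesforce for your region), and port 5439 is the Redshift default, which your warehouse might override.

```python
import ipaddress

REDSHIFT_PORT = 5439  # Redshift default port; change it if your warehouse uses another


def build_ingress_rule(cidrs, port=REDSHIFT_PORT):
    """Build one IpPermissions entry allowing TCP on `port` from `cidrs` only.

    Hypothetical helper: it validates each IPv4 CIDR and refuses 0.0.0.0/0,
    so the warehouse is never opened to the whole internet by accident.
    """
    ip_ranges = []
    for cidr in cidrs:
        # strict=True raises ValueError for malformed input or stray host bits
        network = ipaddress.ip_network(cidr, strict=True)
        if network.version != 4:
            raise ValueError(f"only IPv4 CIDRs are handled in this sketch: {cidr}")
        if network.prefixlen == 0:
            raise ValueError("refusing to allow traffic from every address")
        ip_ranges.append(
            {"CidrIp": str(network), "Description": "Salesforce Data Cloud"}
        )
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": ip_ranges,
    }


# Placeholder addresses (documentation ranges), NOT real Data Cloud IPs.
example_rule = build_ingress_rule(["203.0.113.10/32", "198.51.100.0/24"])
print(example_rule)
```

If you manage security groups with boto3, a rule shaped like this can be passed as `IpPermissions=[example_rule]` to the EC2 client's `authorize_security_group_ingress` call, together with the `GroupId` of the security group attached to your warehouse. That step assumes boto3 is installed and AWS credentials are configured.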

Configure Zero Copy Data Federation

To federate Redshift data to Salesforce Data Cloud, start by configuring a Redshift connection.

  1. Log in to Salesforce Data Cloud and navigate to Data Cloud Setup.
  2. In the navigation pane, choose Connectors under Configuration.
  3. Choose New, choose Amazon Redshift, and choose Next.
  4. Retrieve the Redshift endpoint by navigating to the Redshift Serverless workgroup or provisioned cluster in the AWS console. The following image shows how to obtain the endpoint URL for Redshift Serverless.
    Step 4 Retrieve the Redshift endpoint
  5. Back in Salesforce Data Cloud, configure the connector with a unique name and enter the endpoint from your Redshift server.
  6. Enter the user name and password configured for your Redshift Serverless namespace.
  7. Enter the name of the database configured in your Redshift Serverless namespace.
  8. Choose Test Connection to confirm that you can connect to the Redshift instance, then choose Save.
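The endpoint shown in the AWS console typically bundles the host name, port, and database into one string of the form `host:port/database`. The following sketch splits such a string into its parts so you can enter whichever pieces the connector form asks for. The helper `parse_endpoint` and the example workgroup name, account ID, and Region are hypothetical, and the function assumes the plain console format rather than a JDBC or ODBC URL.

```python
from typing import NamedTuple, Optional


class RedshiftEndpoint(NamedTuple):
    host: str
    port: int
    database: Optional[str]  # None when the endpoint string carries no database


def parse_endpoint(endpoint: str, default_port: int = 5439) -> RedshiftEndpoint:
    """Split 'host:port/database' into its parts (hypothetical helper)."""
    # Everything after the first "/" is treated as the database name
    host_port, _, database = endpoint.strip().partition("/")
    # Everything after the first ":" is treated as the port
    host, _, port_text = host_port.partition(":")
    if not host:
        raise ValueError("endpoint has no host name")
    port = int(port_text) if port_text else default_port
    return RedshiftEndpoint(host, port, database or None)


# Hypothetical endpoint; copy the real value from your workgroup or cluster page.
parsed = parse_endpoint(
    "example-workgroup.123456789012.us-east-1.redshift-serverless.amazonaws.com:5439/dev"
)
print(parsed.host, parsed.port, parsed.database)
```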

Create a Redshift Zero Copy Data Federation data stream

Complete the following steps to create a data stream using the connection you created:

  1. Navigate to Data Cloud and choose Data Streams in the navigation bar.
  2. Choose New to set up a new data stream.
  3. Choose Amazon Redshift and choose Next.
  4. Choose your connector, database, and objects, then choose Next.
  5. Configure the object, category, primary key, and fields:
    1. Set the object name and object API name. For more information, see Data Lake Object Naming Standards.
    2. Set the category to specify the type of data to ingest. For more information, see Category.
    3. Set the primary key to identify the incoming records uniquely. For more information, see Primary Key.
    4. Select the source fields you want to ingest.
  6. Choose Next.
  7. Select the relevant data space. Choose default if you don’t have any other data space provisioned in your organization. For more information, see Manage Data Spaces.
  8. If you want to query the data in your Redshift instance with reduced latency, select Enable acceleration and choose your acceleration schedule. For more information, see Acceleration in Data Federation.
  9. Choose Deploy.

On successful deployment, a data stream is created.
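Because the primary key you set in step 5 must identify incoming records uniquely, and Amazon Redshift treats primary key constraints as informational rather than enforcing them, it can help to check a candidate key against a sample of rows before deploying. The following is a minimal sketch, assuming you have already fetched sample rows as Python dictionaries. The function name `check_primary_key` and the field names are hypothetical.

```python
from collections import Counter


def check_primary_key(rows, key_fields):
    """Return (null_count, duplicates) for a candidate key over dict rows.

    null_count: rows where any key field is missing or None.
    duplicates: {key_tuple: occurrences} for keys seen more than once.
    """
    null_count = 0
    counts = Counter()
    for row in rows:
        key = tuple(row.get(field) for field in key_fields)
        if any(value is None for value in key):
            null_count += 1
            continue
        counts[key] += 1
    duplicates = {key: n for key, n in counts.items() if n > 1}
    return null_count, duplicates


# Hypothetical sample rows from a Redshift table you plan to federate.
sample_rows = [
    {"order_id": 1001, "customer_id": "c1"},
    {"order_id": 1002, "customer_id": "c1"},
    {"order_id": 1002, "customer_id": "c2"},
    {"order_id": None, "customer_id": "c3"},
]
print(check_primary_key(sample_rows, ["order_id"]))
```

A key is a reasonable choice when both results come back empty. On a full table, the same check is usually better run in Redshift itself with a `GROUP BY` on the key columns and a `HAVING COUNT(*) > 1` filter.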


Use cases for Zero Copy Data Federation

The following are key use cases enabled by Zero Copy Data Federation between Amazon Redshift and Salesforce Data Cloud:

  • Marketing insurance campaign journey – Combine customer profile, insurance policy, and plan data in Amazon Redshift with customer data in Salesforce Cloud for targeted outreach campaigns in Marketing Cloud. This facilitates cross-selling of other financial products.
  • Targeted promotions and customer outreach – Merge customer purchase and profile data from Amazon Redshift with customer feedback and service data in Salesforce for targeted customer outreach in Marketing Cloud, including promotional deals.
  • Customer satisfaction using Service Cloud data – Combine customer and case data in Salesforce with customer feedback data in Amazon Redshift to determine customer satisfaction ratings, enhancing service quality.
  • Prioritized offers and data-driven next-best actions – Utilize customer billing accounts and service data from Salesforce along with prospect, order, and billing data in Amazon Redshift to generate prioritized offers and next-best actions. Moving from ETL pipelines to Zero Copy Bring Your Own Lake (BYOL) integration streamlines operations.
  • Customer segmentation and activation – Federate purchase data and billing history from Amazon Redshift to enrich unified profiles in Salesforce Data Cloud, generate actionable insights based on recency, frequency, and monetary (RFM) value, create customer segments, and activate them to your desired target.
  • Customer 360 with rich insights – Enrich customer profiles in Salesforce Data Cloud with purchase, billing, and product data from Amazon Redshift to empower Marketing, Sales, and Service teams to improve customer engagement with rich customer insights.
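To make the recency, frequency, and monetary value idea from the segmentation use case concrete, the following sketch computes those three measures from purchase records and assigns a simple segment label. It is an illustration only: in Data Cloud you would typically express this logic with calculated insights and segmentation rather than Python, and the function names, sample data, and thresholds here are hypothetical.

```python
from datetime import date


def rfm_by_customer(purchases, as_of):
    """Compute {customer_id: (recency_days, frequency, monetary)}.

    purchases: iterable of (customer_id, purchase_date, amount) tuples.
    recency_days is measured from the most recent purchase up to `as_of`.
    """
    summary = {}
    for customer_id, purchased_on, amount in purchases:
        last, frequency, total = summary.get(customer_id, (None, 0, 0.0))
        if last is None or purchased_on > last:
            last = purchased_on
        summary[customer_id] = (last, frequency + 1, total + amount)
    return {
        customer_id: ((as_of - last).days, frequency, total)
        for customer_id, (last, frequency, total) in summary.items()
    }


def segment(recency_days, frequency, monetary):
    """Assign a label using illustrative thresholds; tune them to your data."""
    if recency_days <= 30 and frequency >= 3 and monetary >= 500:
        return "high_value"
    if recency_days > 180:
        return "lapsed"
    return "standard"


# Hypothetical purchase history, as it might be federated from Amazon Redshift.
purchases = [
    ("c1", date(2024, 5, 1), 200.0),
    ("c1", date(2024, 5, 20), 350.0),
    ("c1", date(2024, 4, 2), 100.0),
    ("c2", date(2023, 10, 1), 40.0),
]
for customer_id, scores in rfm_by_customer(purchases, date(2024, 6, 1)).items():
    print(customer_id, scores, segment(*scores))
```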

Conclusion

Zero Copy Data Federation between Salesforce Data Cloud and Amazon Redshift empowers businesses to break down data silos, enhance customer experiences, and drive operational efficiencies. By federating Redshift data to Salesforce Data Cloud, organizations can make informed decisions faster, personalize customer interactions at scale, and optimize resources across marketing, sales, service, and operations. This integration sets a new standard for data-driven business success in the digital age. Check out the Salesforce Zero Copy Data Federation announcement and the following resources to learn more and get started:


About the Authors

Vijay Gopalakrishnan is a Director of Product Management at Salesforce with several years of experience in the data space. He is currently part of the Salesforce Data Cloud team.

Ravi Bhattiprolu is a Sr. Partner Solutions Architect at AWS. Ravi works with strategic ISV partners, Salesforce and Tableau, to deliver innovative and well-architected products and solutions that help joint customers achieve their business and technical objectives.

Avijit Goswami is a Principal Solutions Architect at AWS specialized in data and analytics. He supports AWS strategic customers in building high-performing, secure, and scalable data lake solutions on AWS using AWS managed services and open-source solutions. Outside of his work, Avijit likes to travel, hike, watch sports, and listen to music.

Ife Stewart is a Principal Solutions Architect in the Strategic ISV segment at AWS. She has been engaged with Salesforce Data Cloud over the last 2 years to help build integrated customer experiences across Salesforce and AWS. Ife has over 10 years of experience in technology. She is an advocate for diversity and inclusion in the technology field.

Mike Patterson is a Senior Customer Solutions Manager in the Strategic ISV segment at AWS. He has partnered with Salesforce Data Cloud to align business objectives with innovative AWS solutions to achieve impactful customer experiences. In Mike’s spare time, he enjoys spending time with his family, sports, and outdoor activities.