AWS Entity Resolution FAQs

General

AWS Entity Resolution is a new service that helps you match, link, and enhance your related records stored across multiple applications, channels, and data stores. You can get started in minutes using easy-to-configure entity resolution workflows that are flexible, scalable, and seamlessly connectable to your existing applications. AWS Entity Resolution offers advanced matching techniques, such as rule-based, machine learning (ML) model-powered, and data service provider matching to help you more accurately link related sets of customer information, product codes, or business data codes. For example, you can use AWS Entity Resolution to create a unified view of customer interactions by linking recent events (such as ad clicks, cart abandonment, and purchases) into a unique match ID. You can also better track products that use different codes (for example, SKUs and UPCs) across your stores. With AWS Entity Resolution, you can improve matching accuracy and better protect data security while minimizing data movement.

Customers across multiple industries such as advertising, healthcare and life sciences, supply chain, and financial services who have records stored across multiple applications, channels, and data stores can benefit from AWS Entity Resolution. You can save months of development time and ongoing maintenance costs that building bespoke solutions entails by setting up AWS Entity Resolution in minutes to link, match, and deduplicate related records. In just a few steps, you can configure AWS Entity Resolution workflows and optimize advanced matching techniques, based on your business needs, to improve the quality of your data. With advanced controls, such as data encryption, data regionalization, and data hashing, AWS Entity Resolution can also help you meet data governance and security standards so that you can better protect and maintain control of your data.

AWS Entity Resolution only works with structured data.

A matching workflow in AWS Entity Resolution is a sequence of steps you set up using the AWS Management Console or command line interface (CLI). You specify the input records to be used for matching, the matching techniques to be applied on the records, and the output S3 location of the match results. You can set up one or more matching workflows to compare different data inputs, and you can use different matching techniques, such as rule-based, machine learning (ML) model-powered, and data service provider matching. You can also view the job status of existing matching workflows and metrics such as number of matches found.

A record is defined as a row of data that can have multiple columns representing input data such as source ID, first name, last name, email address, phone number, product code, business name, and so on. For example, one row of data may have 2 columns or 20 columns, but it still counts as one record. A record is processed in AWS Entity Resolution when a rule-based, ML-powered, or data service provider matching workflow is invoked to determine if that record can be matched to or enhanced with other records.

AWS Entity Resolution generates a match ID for every record processed by rule-based and ML-based matching. Two or more records with the same match ID are related. The service also indicates the level of confidence in a match through a rule number (for rule-based matching) or a confidence score (for ML-based matching). AWS Entity Resolution can also help you match and enhance your records with datasets and IDs from data service providers in a few clicks to better understand, reach, and engage your customers.
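As a rough illustration of how match IDs might be consumed downstream, the sketch below groups output rows by match ID and keeps the groups that contain more than one record. The column names (`source_id`, `match_id`) and the sample rows are hypothetical and are not taken from the service's actual output schema.

```python
from collections import defaultdict

# Hypothetical output rows; the column names are assumptions for illustration.
rows = [
    {"source_id": "crm-1", "match_id": "m-001"},
    {"source_id": "web-7", "match_id": "m-001"},
    {"source_id": "pos-3", "match_id": "m-002"},
]


def group_by_match_id(rows):
    """Return a dict mapping each match ID to the source IDs that share it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["match_id"]].append(row["source_id"])
    return dict(groups)


groups = group_by_match_id(rows)

# Records that share a match ID are treated as related.
related = {mid: ids for mid, ids in groups.items() if len(ids) > 1}
```

Here `related` holds only the match IDs shared by two or more records, which is the set a deduplication or unified-view step would typically act on.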

Matching workflows can be run sequentially, in any order. You can use the output of a rule-based or ML-powered matching workflow as an input for data service provider matching, or vice versa. For example, you can first run rule-based matching to find matches across your own data sources and then, if a subset of records was not matched, run data service provider matching to find additional matches.

The rule-based matching workflow lets you use a set of ready-to-use rules to find matches based on your input fields. You can also customize the rules (such as adding or removing input fields for each rule), delete rules, rearrange the priority of rules, and create new rules. For example, in a retail use case, you can input product dispatch information from different suppliers to find common stock keeping units (SKUs) between them and improve your supply chain efficiency. With this matching workflow, you can also set up automated incremental processing, so that as soon as new data is available, AWS Entity Resolution reads the new records and compares them against existing ones. This keeps your matches up to date while you pay only for incremental processing.
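The idea of prioritized rules can be sketched in a few lines. This is a simplified illustration of rule-based matching in general, not the service's implementation: the rule list, the field names, and the exact-equality comparison are all assumptions made for the example.

```python
# Ordered rules: earlier rules have higher priority.
# Field names and rule contents are hypothetical.
RULES = [
    ["email", "phone"],      # rule 1: strictest
    ["email"],               # rule 2
    ["last_name", "phone"],  # rule 3
]


def first_matching_rule(a, b, rules=RULES):
    """Return the 1-based number of the first rule whose fields are all
    non-empty and equal in both records, or None if no rule applies."""
    for number, fields in enumerate(rules, start=1):
        if all(a.get(f) and a.get(f) == b.get(f) for f in fields):
            return number
    return None


a = {"email": "ana@example.com", "phone": "555-0100", "last_name": "Silva"}
b = {"email": "ana@example.com", "phone": "", "last_name": "Silva"}
c = {"email": "", "phone": "555-0100", "last_name": "Silva"}
```

With these sample records, `a` and `b` match on rule 2 (email only), `a` and `c` match on rule 3 (last name and phone), and `b` and `c` match on no rule. Reordering `RULES` changes which rule number is reported for a pair, which is why rule priority matters when the rule number is read as a confidence signal.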

AWS Entity Resolution is powered by an advanced ML model that reviews and matches records using data input fields such as name, email, phone number, date of birth, and address. This works best on customer-based records and generates a group of similar records represented by a unique match ID. The model provides a confidence score with each match group, which serves as an accuracy measurement of the prediction, making it easy for data analysts to rank the most accurate match groups based on their confidence scores. ML-based matching can be more powerful than rule-based matching because it looks at the complete record more holistically and accounts for errors and missing information that traditional rule-based matching engines cannot find.
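A minimal sketch of how an analyst might rank match groups by confidence score follows. The field names (`match_id`, `confidence`), the sample scores, and the 0.8 threshold are hypothetical; the source does not specify the score range or a recommended cutoff.

```python
# Hypothetical match groups with confidence scores (assumed to lie in [0, 1]).
match_groups = [
    {"match_id": "m-001", "confidence": 0.97},
    {"match_id": "m-002", "confidence": 0.62},
    {"match_id": "m-003", "confidence": 0.88},
]


def rank_groups(groups, min_confidence=0.0):
    """Drop groups below min_confidence and sort the rest, highest score first."""
    kept = [g for g in groups if g["confidence"] >= min_confidence]
    return sorted(kept, key=lambda g: g["confidence"], reverse=True)


# Keep only groups at or above an illustrative threshold of 0.8.
ranked = rank_groups(match_groups, min_confidence=0.8)
ranked_ids = [g["match_id"] for g in ranked]
```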

The data service provider workflow also allows you to enhance your records with new columns of information, such as demographic or psychographic information, to develop better insights for campaign planning use cases. It allows you to expand your audience records by matching with industry-leading, trusted IDs such as Ramp ID, TruAudience, and UID 2.0. Using this workflow can save you time, resources, and undifferentiated heavy lifting because you do not have to build bespoke data integrations with each provider. The workflow also enables you to interoperate with different advertising and marketing platforms more efficiently by translating customer records with source IDs into any destination IDs. Using destination IDs as tokens to represent your customers helps you minimize the need to share your customer records with advertising or marketing platforms, enabling you to better protect your customer data.

Amazon Connect Customer Profiles is powered by AWS Entity Resolution to equip contact center agents with a unified view of a customer’s profile with the most up-to-date information, providing more personalized customer service. AWS Entity Resolution will become available within AWS Clean Rooms in the coming months, which will allow you to natively conduct entity resolution within your AWS Clean Rooms collaboration without sharing underlying raw data across parties. You can use the output of AWS Entity Resolution as an input of Amazon Connect Customer Profiles as well as AWS Clean Rooms.

If you have existing extract, transform, and load (ETL) pipelines set up in AWS Glue for data cataloging and analytics, you can use AWS Entity Resolution to link, match, and deduplicate records in Amazon S3. This can help you improve data quality throughout your applications, channels, and data stores.

Amazon Ads customers can start using AWS Entity Resolution to automatically normalize, hash, and deduplicate first-party data signals prior to uploading that data into Amazon Ads applications such as Amazon Marketing Cloud (AMC) and Amazon DSP. As a result, Amazon Ads customers save time because they no longer need to perform the manual data preparation steps required before using their first-party data in Amazon Ads. Customers can use AWS Entity Resolution data output to normalize and hash data according to the needs of Amazon Ads, which operates on pseudonymized signals.
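To make the normalize, hash, and deduplicate steps concrete, here is a minimal sketch for email addresses. The normalization rule (trim whitespace, lowercase) and the choice of SHA-256 are assumptions made for illustration; the source does not state the exact rules that AWS Entity Resolution or Amazon Ads apply.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Assumed normalization: strip surrounding whitespace and lowercase."""
    return email.strip().lower()


def sha256_hex(value: str) -> str:
    """Return the SHA-256 digest of a string as 64 hex characters."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def dedupe_hashed(emails):
    """Normalize and hash each email, keeping the first occurrence of each hash."""
    seen = set()
    hashed = []
    for email in emails:
        digest = sha256_hex(normalize_email(email))
        if digest not in seen:
            seen.add(digest)
            hashed.append(digest)
    return hashed


signals = [" Ana@Example.com", "ana@example.com ", "bo@example.com"]
hashed_signals = dedupe_hashed(signals)
```

The first two inputs differ only in case and whitespace, so they normalize to the same string and produce a single hash. Normalizing before hashing matters because a hash of the raw strings would treat them as two different signals.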

AWS Entity Resolution rule-based and ML-powered workflows are generally available in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (London).

AWS Entity Resolution data service provider workflow is generally available in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon).

From the AWS Management Console, start by selecting your dataset and defining fields you would like to use. Then create a matching workflow from one or more data inputs, and select the matching technique to use, such as rule-based, ML-powered, or data service provider matching. You can customize the matching workflow by defining the rules to be used for matching and input fields based on your business logic. Configure the fields to be included in the output and provide the destination location in your S3 bucket after a matching workflow is completed. You can also use the lookup API to retrieve matching results, helping you personalize customer experience in near real time. Visit AWS Entity Resolution to get started. 

With AWS Entity Resolution, you are charged per 1,000 records processed. You can process records using different matching techniques including rule-based, machine learning (ML) model-powered, or data service provider matching to link and enhance your records. Visit AWS Entity Resolution Pricing for additional information.
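The per-1,000-records model lends itself to a quick back-of-the-envelope estimate. In the sketch below, the rate is a placeholder and not an actual AWS price, and the calculation assumes charges scale proportionally with no rounding or minimums; check the pricing page for real figures.

```python
def count_records(rows):
    """Each row counts as one record, regardless of how many columns it has."""
    return len(rows)


def estimate_cost(records_processed: int, price_per_1000: float) -> float:
    """Estimate cost assuming simple proportional pricing per 1,000 records."""
    return records_processed / 1000 * price_per_1000


# Two rows with different column counts still count as two records.
sample_rows = [
    {"source_id": "a1", "email": "ana@example.com"},
    {"source_id": "b2", "first_name": "Bo", "last_name": "Li", "phone": "555-0101"},
]
sample_count = count_records(sample_rows)

# Placeholder rate of 0.25 per 1,000 records (hypothetical, not an AWS price).
estimate = estimate_cost(2_500_000, 0.25)
```

Under the placeholder rate, processing 2,500,000 records comes to 2,500 billing units of 1,000 records each, or 625.0 in total.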

Yes, before you can use the data service provider matching workflow you must have a subscription in place. You can use the public subscriptions listed on AWS Data Exchange (ADX), or purchase a private subscription directly with the data service provider of your choice, and then use Bring Your Own Subscription (BYOS) to ADX.

With AWS Entity Resolution, you have the ability to configure and customize advanced matching techniques to best meet your business needs. AWS Entity Resolution is a highly configurable service that supports up to 15 rules to optimize for target match rates. Also, AWS Entity Resolution is powered by an advanced ML model that analyzes patterns across all records to account for spelling errors and missing information, which traditional rule-based matching engines cannot find.

Security and Data Protection

You can use your own customer managed key (CMK) to customize your encryption settings through the AWS Management Console or API. If you do not provide your own CMK, the service uses a default encryption key to protect the data. The service also supports regionalization: all data is processed in the same AWS Region in which you run rule-based matching. You can also encrypt and hash the data output in Amazon S3 to ensure it is protected before using your resolved data in other applications.

AWS Entity Resolution provides you with encryption and hashing capabilities to increase control over your data during the matching process. As an AWS customer, you are responsible for assessing the risk of each matching workflow, including the risk of reidentification, and conducting your own additional due diligence to ensure compliance with any data privacy laws. If the data you are sharing is sensitive or regulated, we recommend you also use appropriate legal agreements and audit mechanisms to further reduce privacy risks.

You are responsible for conducting your own legal diligence to ensure your compliance with any data privacy laws. AWS Entity Resolution supports GDPR compliance and is covered by the AWS GDPR Data Processing Addendum (AWS GDPR DPA) that incorporates the commitments of AWS as a data processor. The AWS GDPR DPA is available for customers who require this to comply with the GDPR.

Yes, the AWS HIPAA compliance program includes AWS Entity Resolution as a HIPAA-eligible service. If you have an executed Business Associate Agreement (BAA) with AWS, you can use AWS Entity Resolution for workloads that are within the scope of HIPAA. If you don't have a BAA or have other questions about using AWS for your HIPAA-compliant applications, contact us for more information.

To learn more, see the following resources:

AWS HIPAA Compliance page

AWS Cloud Computing in Healthcare page