AWS Entity Resolution Features

AWS Entity Resolution helps you more easily match, link, and enhance related customer, product, business, or healthcare records stored across multiple applications, channels, and data stores.

Set up resolution workflows in minutes

Get started in minutes instead of months, and use entity resolution workflows that are flexible, scalable, and seamlessly connectable to your existing applications.

Improve data accuracy to meet your needs

Use flexible, easy-to-configure machine learning and rule-based techniques to optimize record matching based on your business needs.

Match and enhance your records with data service providers

Translate and enhance your records with datasets and IDs from trusted data service providers in a few clicks to better understand, reach, and engage your customers.

Protect your data by minimizing its movement

AWS Entity Resolution helps you and your organization better protect your data by minimizing its movement, as it reads your records where they already live.

Flexible and customizable data preparation

AWS Entity Resolution reads your data from Amazon Simple Storage Service (Amazon S3) and uses it as input for match processing. You can specify a maximum of 20 data inputs. Each row of a data input table is processed as a record, with a unique identifier serving as the primary key. AWS Entity Resolution can operate on encrypted datasets. You also need to define a schema mapping so that AWS Entity Resolution understands which input fields you want to use in your matching workflow. You can bring your own data schema, or blueprint, from an existing AWS Glue data input, or build a custom schema using an interactive user interface or JSON editor.

By default, data inputs are normalized before matching to improve match processing, for example by removing special characters and extra spaces and converting text to lowercase. You can turn off normalization if your data input has already been normalized. We also provide a GitHub library that you can use to further customize the data normalization process to suit your needs.
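To make the default normalization steps concrete, here is a minimal Python sketch of the three operations named above: lowercasing, removing special characters, and collapsing extra spaces. It is an illustration only, not the service's actual normalization logic or the GitHub library. The function names `normalize_field` and `normalize_record` are hypothetical, and the sketch is suited to free-text fields such as names. Applying it to email addresses would strip the `@` and `.` characters, so real normalization rules presumably differ by field type.

```python
import re
from typing import Dict, Iterable


def normalize_field(value: str) -> str:
    """Lowercase, drop special characters, and collapse extra spaces."""
    lowered = value.lower()
    # Keep word characters (letters, digits, underscore) and whitespace.
    without_specials = re.sub(r"[^\w\s]", "", lowered)
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", without_specials).strip()


def normalize_record(record: Dict[str, str], fields: Iterable[str]) -> Dict[str, str]:
    """Return a copy of the record with only the chosen fields normalized."""
    selected = set(fields)
    return {
        key: normalize_field(value) if key in selected else value
        for key, value in record.items()
    }


if __name__ == "__main__":
    raw = {"id": "A-1", "name": "  John   O'Neil-Smith "}
    print(normalize_record(raw, fields=("name",)))
```

The unique identifier is left untouched here because it serves as the primary key, not as a matching field.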

Configurable entity matching workflows

An entity matching workflow is a sequence of steps you set up to tell AWS Entity Resolution how to match your data input and where to write the consolidated data output. You can set up one or more matching workflows to compare different data inputs and use various matching techniques, such as rule-based, ML-powered, or data service provider matching, without needing entity resolution or ML experience. You can also view the job status of existing matching workflows and metrics, such as resource number, number of records processed, and number of matches found.

  • Ready-to-use rule-based matching: This matching technique includes a set of ready-to-use rules in the AWS Management Console or command line interface (CLI) to find matches based on your input fields. You can customize the rules (such as adding or removing input fields for each rule), delete rules, rearrange the priority of rules, and create new rules. You can also reset the rules to their original configurations. The data output in your S3 bucket contains match groups generated by AWS Entity Resolution using rule-based matching. Each match group has the number of the rule that produced the match associated with it, which helps you understand the fidelity of the match. For example, the rule number can indicate the precision of each match group: the first rule is more precise than the second rule, and so on.

  • Preconfigured ML matching: This matching technique includes a preconfigured ML model to find matches across all of your data inputs, including consumer-based records. The model uses all input fields associated with the name, email address, phone number, address, and date of birth data types. It generates match groups of related records, with a confidence score for each group that explains the quality of the match relative to other match groups. The model takes missing input fields into consideration and analyzes the entire record together to represent an entity. It cannot be customized. The data output in your S3 bucket contains match groups generated by AWS Entity Resolution using ML-based matching, where each match group has an associated confidence score between 0.0 and 1.0 that indicates the precision of the match.

  • Matching records with data service providers: With AWS Entity Resolution, you can match, link, and enhance your records with leading data service providers to better understand, reach, and engage your customers. With this matching workflow, you can enhance your records through column appends, or you can translate customer data into data service provider IDs to meet your business goals, such as delivering more relevant and complete customer experiences. With a few clicks in the AWS Management Console, you can create a matching workflow with data service providers, removing the need to build and maintain custom integrations. You must have a subscription with these data service providers to take advantage of this matching technique. You can either create a subscription using a publicly listed offer in AWS Data Exchange or bring your existing subscription with the provider using the Bring Your Own Subscription (BYOS) offer from AWS Data Exchange.
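The rule-priority idea behind rule-based matching can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how AWS Entity Resolution is implemented: the rules in `RULES`, the `match_records` function, and the `group-N` identifiers are all hypothetical. Rules are tried from most to least precise, records that agree on every field of a rule share a match group, and each record keeps the number of the rule that placed it there.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

Record = Dict[str, str]

# Hypothetical rules, ordered from most precise (rule 1) to least precise.
RULES: List[Tuple[str, ...]] = [
    ("name", "email", "phone"),
    ("name", "email"),
    ("email",),
]


def match_records(
    records: Sequence[Record], rules: Sequence[Tuple[str, ...]] = RULES
) -> List[Tuple[str, Optional[int]]]:
    """Return (match_group, rule_number) per record; rule is None if unmatched."""
    assignments: Dict[int, Tuple[str, Optional[int]]] = {}
    next_group = 1
    for rule_number, fields in enumerate(rules, start=1):
        buckets: Dict[Tuple[str, ...], List[int]] = defaultdict(list)
        for index, record in enumerate(records):
            key = tuple(record.get(field, "") for field in fields)
            if all(key):  # a rule applies only when every field is present
                buckets[key].append(index)
        for members in buckets.values():
            if len(members) < 2:
                continue
            existing = [assignments[i][0] for i in members if i in assignments]
            unassigned = [i for i in members if i not in assignments]
            if not unassigned:
                continue
            if existing:
                group = existing[0]  # join a group formed by an earlier rule
            else:
                group = f"group-{next_group}"
                next_group += 1
            for i in unassigned:
                assignments[i] = (group, rule_number)
    # Records that matched nothing each get their own group.
    for index in range(len(records)):
        if index not in assignments:
            assignments[index] = (f"group-{next_group}", None)
            next_group += 1
    return [assignments[i] for i in range(len(records))]
```

In this sketch, a record with a missing phone number cannot satisfy rule 1 but can still join the same group through rule 2, and the recorded rule number shows that its match is less precise.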

Manual bulk processing and automatic incremental processing

Data processing helps you convert your data inputs into a consolidated data output table in which similar records share a common match ID, generated using your entity matching workflow configurations. Using the API, the AWS Management Console, or the CLI, you can run manual bulk processing on demand, based on your existing extract, transform, and load (ETL) data pipeline. This reprocesses all data for any new matches and updates to existing matches. For rule-based matching scenarios, you can also initiate automatic incremental processing. As soon as new data is available in your S3 bucket, the service reads those new records and compares them against existing records, keeping your matches up to date with any changes in S3 data.
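The difference between the two processing modes can be illustrated with a small Python sketch. It assumes a single hypothetical rule (`MATCH_FIELDS`) and an in-memory index. The `MatchIndex` class and its `match-N` IDs are invented for illustration and do not reflect the service's internals. Bulk processing discards prior state and reprocesses every record, while incremental processing compares each new record against what is already known.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]

# Hypothetical single matching rule: records agree on both fields.
MATCH_FIELDS: Tuple[str, ...] = ("name", "email")


class MatchIndex:
    """Toy in-memory store that maps a rule key to a match ID."""

    def __init__(self) -> None:
        self._ids: Dict[Tuple[str, ...], str] = {}

    def _key(self, record: Record) -> Tuple[str, ...]:
        return tuple(record.get(field, "") for field in MATCH_FIELDS)

    def add(self, record: Record) -> str:
        """Incremental step: reuse a known match ID or mint a new one."""
        key = self._key(record)
        if key not in self._ids:
            self._ids[key] = f"match-{len(self._ids) + 1}"
        return self._ids[key]

    def rebuild(self, records: Iterable[Record]) -> List[str]:
        """Bulk step: discard all state and reprocess every record."""
        self._ids.clear()
        return [self.add(record) for record in records]
```

A typical flow under these assumptions would call `rebuild` once for the full dataset and then call `add` for each record that arrives later.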

Near real-time lookup

The AWS Entity Resolution Get Match ID API lets you synchronously retrieve an existing entity match ID. You can call AWS Entity Resolution with personally identifiable information (PII) attributes acquired through multiple sources and channels. AWS Entity Resolution hashes those attributes for data protection and retrieves the corresponding entity match ID to link and match the customer. For example, suppose you receive a web sign-up with an associated name, email, and address. You can use the AWS Entity Resolution API to find out whether this customer or entity already exists in the matched results stored in your S3 bucket and, if so, retrieve the corresponding entity match ID. Once you have the entity match ID, you can find the transactional information associated with it in your source applications, such as your customer relationship management (CRM) or customer data platform (CDP) systems.
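The hash-then-look-up flow described above can be sketched locally in Python. This is a conceptual illustration, not the Get Match ID API itself. The hashing scheme the service uses is not described here, so SHA-256 over a canonical string is an assumption, and `hash_attributes`, `lookup_match_id`, and the dictionary index are hypothetical names.

```python
import hashlib
from typing import Dict, Optional


def hash_attributes(record: Dict[str, str]) -> str:
    """Hash PII attributes so the lookup never stores or compares raw values.

    Assumption: fields are sorted by name and values are trimmed and
    lowercased before hashing, so equivalent inputs give the same digest.
    """
    canonical = "|".join(
        f"{field}={record[field].strip().lower()}" for field in sorted(record)
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def lookup_match_id(index: Dict[str, str], record: Dict[str, str]) -> Optional[str]:
    """Return the existing match ID for the record, or None if unknown."""
    return index.get(hash_attributes(record))


if __name__ == "__main__":
    known = {"name": "Ana Diaz", "email": "ana@example.com"}
    index = {hash_attributes(known): "match-1"}
    signup = {"email": " ANA@example.com", "name": "ana diaz "}
    print(lookup_match_id(index, signup))
```

With the returned match ID, an application could then look up the customer's transactional history in its CRM or CDP, as described above.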

Data protection and regionalization by design

AWS Entity Resolution offers a default encryption capability that helps you protect your data and provides an encryption key for every data input into the service. Both AWS Entity Resolution and its data encryption capability support regionalization: they operate in the same AWS Region as the data you are using for matching. AWS Entity Resolution also gives you the flexibility to bring data that was previously encrypted server-side and hashed to run rule-based matching workflows. Finally, you can encrypt and hash the data output in Amazon Simple Storage Service (Amazon S3) before using your resolved data in other applications.