AWS Contact Center

Easily remove duplicate customer records using machine learning with Amazon Connect

At the beginning of an interaction, contact center agents often spend time navigating between duplicate customer profiles across CRM, marketing, billing, shipping, and ticketing systems. While the agent looks for the most accurate customer profile, the customer waits on hold, resulting in a poor customer experience. According to a report by Experian, as many as 94% of organizations suspect that their customer and prospect data may be inaccurate. For companies without data quality initiatives in place, duplication rates of 10%–30% are common due to multiple sources of customer data, stale customer records, and a lack of unique customer identifiers.

You can now automatically detect duplicate customer profiles in your Amazon Connect Customer Profiles domain with the identity resolution APIs (in preview). Identity resolution uses machine learning to detect duplicate profiles based on similar names, email addresses, and phone numbers. For example, it can match two or more profiles that contain spelling mistakes such as “John Doe” and “Jhn Doe,” email addresses that differ only in casing, such as “JOHN_DOE@ANYCOMPANY.COM” and its lowercase form, different phone number formats such as “555-010-0000” and “+1-555-010-0000,” or missing key attributes such as Last Name or Account ID. Once these matches are detected, you can merge them into a unified profile with the MergeProfiles API. Amazon Connect Customer Profiles scans and matches the incoming contact with existing profiles to surface the unified profile in the agent application at the moment of contact. This saves agents the time spent finding the right profile for incoming contacts and helps them provide a personalized customer experience.
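To make the email and phone examples above concrete, the sketch below shows why such pairs refer to the same customer once superficial differences are ignored. It is only an illustration: identity resolution uses machine learning, not these rules, and the helper names `normalizeEmail` and `normalizePhone`, the lowercase sample address, and the assumption of North American phone numbers are all hypothetical.

```javascript
// Illustrative only: simple normalization rules, NOT the model used by identity resolution.

// Email addresses that differ only in casing compare equal once lowercased.
function normalizeEmail(email) {
    return email.trim().toLowerCase();
}

// Assumption: North American numbers. Keep digits only and drop a leading country code "1".
function normalizePhone(phone) {
    const digits = phone.replace(/\D/g, "");
    return digits.length === 11 && digits.startsWith("1") ? digits.slice(1) : digits;
}

console.log(normalizeEmail("JOHN_DOE@ANYCOMPANY.COM") === normalizeEmail("john_doe@anycompany.com")); // true
console.log(normalizePhone("555-010-0000") === normalizePhone("+1-555-010-0000")); // true
```

Rules like these cannot catch a spelling mistake such as “Jhn Doe,” which is where a machine learning approach helps.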

Overview of solution

We will continue with an example from AnyCompany, a leading home services provider that offers plumbing, carpentry, and other maintenance services. They use Amazon Connect, an easy-to-use omnichannel cloud contact center, to serve their end customers.
In previous blog posts, we showed how AnyCompany used Amazon Connect Customer Profiles to build a unified profile and enable personalized routing. In this post, we focus on another business problem AnyCompany faces: large amounts of duplicate customer data.
To demonstrate how to solve this problem end to end, we will first ingest sample customer profile data that contains duplicate records into an Amazon Connect Customer Profiles domain, and then enable the identity resolution feature (in preview). Once enabled, the service runs a machine learning job on all the profile records in your domain every Saturday at 12 AM UTC to identify matching profiles. The results can take up to seven days to be generated.
Once the batch job has completed, we can use the GetMatches API to receive a list of matched profile IDs. After inspecting the profiles, we can use the MergeProfiles API to merge these duplicate records.
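Because the matching job starts on a fixed weekly schedule, it can help to know when the next run begins after you enable the feature. The following is a minimal sketch of that date arithmetic; the function name `nextMatchingJobStart` is hypothetical, and the sketch only computes the scheduled start, not when results become available.

```javascript
// Returns the next Saturday at 00:00 UTC on or after `now`
// (the weekly start time of the matching job described above).
function nextMatchingJobStart(now) {
    const SATURDAY = 6; // Date#getUTCDay(): 0 = Sunday ... 6 = Saturday
    // Midnight UTC of the current day.
    const start = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate()));
    let daysAhead = (SATURDAY - start.getUTCDay() + 7) % 7;
    // If it is already Saturday and midnight has passed, the next run is a week away.
    if (daysAhead === 0 && now.getTime() > start.getTime()) {
        daysAhead = 7;
    }
    start.setUTCDate(start.getUTCDate() + daysAhead);
    return start;
}

// Example: a Wednesday morning maps to the following Saturday.
console.log(nextMatchingJobStart(new Date("2021-12-01T10:00:00Z")).toISOString());
// 2021-12-04T00:00:00.000Z
```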



For this walkthrough, you should have the following prerequisites:

  • An AWS account
  • Amazon Connect instance
  • An existing customer profile domain. If you are new to customer profiles, you can learn how to create a customer profile domain here.

Create duplicate profiles

For this blog post, we will ingest a CSV file containing duplicate records into our profiles domain with a predefined object mapping. For more details on this topic, refer to our previous blog post.

  1. Download this CSV file from GitHub.
  2. Download this object mapping from GitHub.
  3. Run the following AWS CLI command to upload the CSV file to an Amazon S3 bucket. Note: run this command in the same directory as the CSV file.
aws s3 cp sample.csv s3://<yourbucketname>/<prefixname>/sample.csv
  4. Run the following AWS CLI command to create a profile object type with the object mapping downloaded in step 2.
aws customer-profiles put-profile-object-type --cli-input-json file://template.json
  5. Add an Amazon S3 integration to your customer profiles domain. For detailed steps, follow the blog post here.

Search profile

Once the data has been ingested, you can search for it using the SearchProfiles API.

aws customer-profiles search-profiles --domain-name <your-domain-name> --key-name "_fullName" --values "Nikki Wolf"

Alternatively, you can use the Customer Profile Widget to search for the profiles. You can navigate to the customer profile widget using the instructions here.
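If you prefer to search from code, the same lookup can be sketched with the AWS SDK for JavaScript (v2). The helper name `searchAllProfiles` is hypothetical, and the client is passed in as a parameter so the function can be exercised with a stub; it assumes the client exposes `searchProfiles(...).promise()` and that results are paginated with `NextToken`.

```javascript
// Collects every profile that matches a search key, following NextToken pagination.
// `client` is assumed to be an AWS SDK for JavaScript v2 CustomerProfiles client
// (or any object with the same searchProfiles(...).promise() shape).
async function searchAllProfiles(client, domainName, keyName, value) {
    const profiles = [];
    let nextToken;
    do {
        const params = { DomainName: domainName, KeyName: keyName, Values: [value] };
        if (nextToken) {
            params.NextToken = nextToken;
        }
        const page = await client.searchProfiles(params).promise();
        profiles.push(...(page.Items || []));
        nextToken = page.NextToken;
    } while (nextToken);
    return profiles;
}

// Usage with the real SDK (requires AWS credentials and the aws-sdk package):
// const AWS = require("aws-sdk");
// searchAllProfiles(new AWS.CustomerProfiles(), "MyDomain", "_fullName", "Nikki Wolf")
//     .then((profiles) => console.log(profiles.map((p) => p.ProfileId)));
```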

Enable identity resolution

Run the following AWS CLI command to enable identity resolution on your domain. Note: update your AWS CLI to the latest version first.

aws customer-profiles update-domain \
--domain-name <YourDomainName> \
--matching '{"Enabled": true}'
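The same setting can be applied from code. The sketch below assumes an AWS SDK for JavaScript (v2) CustomerProfiles client is passed in and that the UpdateDomain response echoes the `Matching` setting; the helper name `enableIdentityResolution` is hypothetical.

```javascript
// Enables identity resolution on a domain and reports whether the
// response confirms that matching is now enabled.
// `client` is assumed to expose updateDomain(...).promise() like the SDK v2 client.
async function enableIdentityResolution(client, domainName) {
    const response = await client.updateDomain({
        DomainName: domainName,
        Matching: { Enabled: true }
    }).promise();
    return Boolean(response.Matching && response.Matching.Enabled);
}

// Usage with the real SDK (requires AWS credentials and the aws-sdk package):
// const AWS = require("aws-sdk");
// enableIdentityResolution(new AWS.CustomerProfiles(), "MyDomain").then(console.log);
```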

Query the matches

The identity resolution service starts a batch job every Saturday at 12 AM UTC to identify matching profiles. After the job has finished detecting matches, we can do the following:

  1. Run the following AWS CLI command to get the list of matching profiles.
aws customer-profiles get-matches --domain-name <YOUR_DOMAIN_NAME>
  2. Run the following AWS CLI command on the profile IDs returned by the GetMatches API to check the matches.
aws customer-profiles search-profiles --domain-name <your-domain-name> --key-name "_profileId" --values "<ProfileID from the GetMatches API>"
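When a domain has many matches, the results may span more than one page. The following sketch gathers all match groups by following `NextToken`; it assumes an AWS SDK for JavaScript (v2) CustomerProfiles client is passed in, and the helper name `getAllMatches` and the page size of 10 are illustrative choices.

```javascript
// Collects all match groups for a domain, following NextToken pagination.
// Each returned element is expected to contain a MatchId and a list of ProfileIds.
// `client` is assumed to expose getMatches(...).promise() like the SDK v2 client.
async function getAllMatches(client, domainName) {
    const matches = [];
    let nextToken;
    do {
        const params = { DomainName: domainName, MaxResults: 10 };
        if (nextToken) {
            params.NextToken = nextToken;
        }
        const page = await client.getMatches(params).promise();
        matches.push(...(page.Matches || []));
        nextToken = page.NextToken;
    } while (nextToken);
    return matches;
}

// Usage with the real SDK (requires AWS credentials and the aws-sdk package):
// const AWS = require("aws-sdk");
// getAllMatches(new AWS.CustomerProfiles(), "MyDomain")
//     .then((matches) => matches.forEach((m) => console.log(m.MatchId, m.ProfileIds)));
```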

Merge the matches

You can define custom business logic to merge the matched profiles identified by the service with the MergeProfiles API. We start by identifying a primary profile, into which the other matched profiles will be merged. Then we specify which values to take from each of the profiles to form our unified profile. Finally, we call the MergeProfiles API, which merges all the specified profiles and links their profile objects, such as contact history, together.

For AnyCompany’s use case, they will automatically merge all profiles with the same full name, email, and phone number. They have decided to define the primary profile as the one with the most recent last-updated timestamp. If there are conflicting values between the matched profiles, they will use the value from the primary profile. If the primary profile is missing a value that exists on a non-primary profile, they will take the value from the non-primary profile.
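The rules in the paragraph above can be sketched as a pure function that needs no AWS calls. This is only one possible reading of AnyCompany's logic: the function name `planMerge`, the `mergeFields` default, and the assumption that each profile stores its timestamp in `Attributes.LastUpdatedTime` are hypothetical.

```javascript
// Fields that must be identical across all matched profiles for an automatic merge.
const MATCH_FIELDS = ["FirstName", "LastName", "EmailAddress", "PhoneNumber"];

// Assumption: each profile stores its last-updated time in Attributes.LastUpdatedTime.
function lastUpdated(profile) {
    return Date.parse(profile.Attributes.LastUpdatedTime);
}

function hasValue(value) {
    return value !== undefined && value !== null && value !== "";
}

// Decides whether a group of matched profiles can be merged automatically and,
// if so, which profile is primary and which profile supplies each field.
function planMerge(profiles, mergeFields = ["AccountNumber", "BusinessName", "BirthDate"]) {
    // The primary profile is the most recently updated one (the first wins a tie).
    const main = profiles.reduce((best, p) => (lastUpdated(p) > lastUpdated(best) ? p : best));
    const others = profiles.filter((p) => p.ProfileId !== main.ProfileId);

    // Automatic merge only when name, email, and phone agree on every profile.
    const exactMatch = profiles.every((p) => MATCH_FIELDS.every((f) => p[f] === profiles[0][f]));

    // Prefer the primary profile's value; fall back to a non-primary profile
    // only when the primary profile has no value for that field.
    const fieldSourceProfileIds = {};
    for (const field of mergeFields) {
        const source = [main, ...others].find((p) => hasValue(p[field]));
        if (source) {
            fieldSourceProfileIds[field] = source.ProfileId;
        }
    }

    return {
        action: exactMatch ? "merge" : "review",
        mainProfileId: main.ProfileId,
        profileIdsToBeMerged: others.map((p) => p.ProfileId),
        fieldSourceProfileIds
    };
}
```

When `action` is `"merge"`, the remaining fields line up with the `MainProfileId`, `ProfileIdsToBeMerged`, and `FieldSourceProfileIds` parameters of the MergeProfiles API; when it is `"review"`, the group would be handed to an agent instead.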

On the other hand, if the matched profiles returned by the GetMatches API do not have exactly the same name, email, and phone number, they will create an Amazon Connect task for an agent to investigate these profiles, and the agent can then initiate a workflow to merge them. The code snippet below is an example JavaScript implementation of this business logic for AnyCompany.

const AWS = require("aws-sdk"); // AWS SDK for JavaScript v2

const customerProfileClient = new AWS.CustomerProfiles();
const connectClient = new AWS.Connect();

// Looks up a single profile by its profile ID.
async function getProfile(profileId) {
    const response = await customerProfileClient.searchProfiles({
        "DomainName": "MyDomain",
        "KeyName": "_profileId",
        "Values": [profileId]
    }).promise();
    return response.Items[0];
}

async function mergeMatchedProfiles() {
    const getMatchesResponse = await customerProfileClient.getMatches({
        "DomainName": "MyDomain",
        "MaxResults": 10
    }).promise();

    const potentialMatches = getMatchesResponse.Matches || [];

    for (const potentialMatch of potentialMatches) {
        const profileIdList = potentialMatch.ProfileIds;
        const firstProfile = await getProfile(profileIdList[0]);

        // Assumes each profile stores its last-updated timestamp in a
        // "LastUpdatedTime" attribute.
        let mostRecentUpdatedTime = Date.parse(firstProfile.Attributes.LastUpdatedTime);
        let mainProfileId = firstProfile.ProfileId;
        let exactMatch = true;

        for (let i = 1; i < profileIdList.length; i++) {
            const currentProfile = await getProfile(profileIdList[i]);

            // The most recently updated profile becomes the primary profile.
            const currentProfileLastUpdatedTime = Date.parse(currentProfile.Attributes.LastUpdatedTime);
            if (currentProfileLastUpdatedTime > mostRecentUpdatedTime) {
                mainProfileId = currentProfile.ProfileId;
                mostRecentUpdatedTime = currentProfileLastUpdatedTime;
            }

            // Any difference in name, email, or phone number requires agent review.
            if (firstProfile.PhoneNumber != currentProfile.PhoneNumber ||
                firstProfile.FirstName != currentProfile.FirstName ||
                firstProfile.LastName != currentProfile.LastName ||
                firstProfile.EmailAddress != currentProfile.EmailAddress) {
                exactMatch = false;
            }
        }

        if (!exactMatch) {
            // Create an Amazon Connect task so an agent can validate the match,
            // and skip the automatic merge for this group.
            await connectClient.startTaskContact({
                "ContactFlowId": "MyContactFlowId",
                "InstanceId": "MyInstanceId",
                "Name": "MergeProfilesTask",
                "Description": "Please validate the following profile IDs: " + JSON.stringify(profileIdList)
            }).promise();
            continue;
        }

        // Merge every other profile in the group into the primary profile.
        const profileIdsToBeMerged = profileIdList.filter((id) => id !== mainProfileId);

        const mergeProfilesResponse = await customerProfileClient.mergeProfiles({
            DomainName: "MyDomain",
            MainProfileId: mainProfileId,
            ProfileIdsToBeMerged: profileIdsToBeMerged
        }).promise();
        console.log(mergeProfilesResponse.Message);
    }
}

mergeMatchedProfiles().catch(console.error);


Cleaning up

To avoid incurring future charges, delete the resources used for this walkthrough, including the Amazon S3 bucket and the customer profile domain.


In this post, we showed you how to get started with the new identity resolution APIs. You can find our code snippet on GitHub here, and you can use the UpdateDomain or CreateDomain API to enable identity resolution in your account now.