OpenJaw Blueprint: Identity Resolution
Product Overview
Blueprint: Identity Resolution helps you build a single customer view of your data by probabilistically linking customer records across data sources. Linking customer information across sources produces more comprehensive customer profiles, which in turn allows more personalised experiences.
The algorithm expects a dataset stored in Amazon S3 that meets the schema requirements set out in the documentation. The results are written to another S3 location specified by the user.
An intuitive GUI lets you configure the algorithm, including the matching threshold for each individual field and the matching rules that decide whether two records belong to the same customer.
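As a rough illustration of how per-field thresholds and matching rules can combine, here is a minimal Python sketch. The field names, threshold values, rule format, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for the example; they are not taken from the product's configuration or schema.

```python
from difflib import SequenceMatcher

# Hypothetical per-field matching thresholds (similarity in the range 0..1).
THRESHOLDS = {"name": 0.85, "email": 0.95, "phone": 1.0}

# Hypothetical matching rules: two records are treated as the same customer
# if every field listed in at least one rule clears its threshold.
RULES = [{"email"}, {"name", "phone"}]


def similarity(a: str, b: str) -> float:
    """Return a fuzzy similarity score between two strings (1.0 = identical)."""
    return SequenceMatcher(None, a, b).ratio()


def field_matches(rec_a: dict, rec_b: dict) -> dict:
    """Report, for each configured field, whether it clears its threshold."""
    return {
        field: similarity(rec_a[field], rec_b[field]) >= threshold
        for field, threshold in THRESHOLDS.items()
    }


def same_customer(rec_a: dict, rec_b: dict) -> bool:
    """Apply the rules: any single rule with all of its fields matching is enough."""
    hits = field_matches(rec_a, rec_b)
    return any(all(hits[field] for field in rule) for rule in RULES)
```

In this sketch, two records with different email addresses can still be linked if both their names and phone numbers match, because the second rule is satisfied on its own.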
The algorithm performs generic data cleaning, uses fuzzy matching to quantify similarities between fields, and applies graph algorithms to link records into customer profiles, each identified by a unique ID called the OpenJaw ID.
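The cleaning, fuzzy matching, and graph linking steps can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the `name` and `email` fields, the 0.85 threshold, the "link if either field is similar" rule, and the union-find approach to connected components are stand-ins chosen for the example, not the product's Spark implementation, and the integer profile IDs are placeholders rather than real OpenJaw IDs.

```python
from difflib import SequenceMatcher


def clean(value: str) -> str:
    """Generic cleaning: lowercase and collapse surrounding/repeated whitespace."""
    return " ".join(value.lower().split())


def similar(a: str, b: str, threshold: float = 0.85) -> bool:
    """Fuzzy comparison of two cleaned strings against a threshold."""
    return SequenceMatcher(None, clean(a), clean(b)).ratio() >= threshold


def link_records(records: list, threshold: float = 0.85) -> list:
    """Return one placeholder profile ID per record.

    Records are nodes; an edge joins two records whose name or email is
    similar (an assumed rule). Connected components become profiles.
    """
    parent = list(range(len(records)))

    def find(i: int) -> int:
        # Follow parent pointers to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if (similar(records[i]["email"], records[j]["email"], threshold)
                    or similar(records[i]["name"], records[j]["name"], threshold)):
                parent[find(j)] = find(i)  # merge the two components

    return [find(i) for i in range(len(records))]
```

Because linking is transitive, a record can join a profile without directly matching every other record in it: if A matches B on name and B matches C on email, all three receive the same ID. The pairwise loop is quadratic and is only suitable for small illustrative inputs.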
OpenJaw's Identity Resolution program is highly scalable, running on Amazon Elastic MapReduce (EMR) with Apache Spark.
Billing is charged at an hourly rate and depends on the type of EC2 instance used, the number of instances in the EMR cluster, and the length of time the program runs.
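To make the billing factors concrete, the following Python sketch estimates a charge under the assumption that the hourly rate applies per instance and that cost scales linearly with instance count and run time. The function name and the rate used in the usage note are hypothetical; actual rates, rounding of partial hours, and any separate AWS infrastructure charges are not covered here.

```python
def estimate_cost(hourly_rate_per_instance: float,
                  instance_count: int,
                  hours_run: float) -> float:
    """Estimate the charge as rate x instances x hours (assumed linear model)."""
    if hourly_rate_per_instance < 0 or instance_count < 0 or hours_run < 0:
        raise ValueError("rate, instance count and hours must be non-negative")
    return hourly_rate_per_instance * instance_count * hours_run
```

For example, with a made-up rate of 0.50 per instance-hour, a 4-instance cluster running for 3 hours would be estimated at `estimate_cost(0.50, 4, 3)`, that is 0.50 × 4 × 3 = 6.00.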
Additional documentation can be requested by email at the support address given in the support information.
Version
Video
Operating System
Linux/Unix, Amazon Linux 2018.03.0
Delivery Methods