Identity Graphs on AWS

Build a customer identity graph for real-time personalization and advertising targeting

What is an identity graph?

An identity graph provides a single unified view of customers and prospects based on their interactions with a product or website across a set of devices and identifiers. An identity graph is used for real-time personalization and advertising targeting for millions of users. This is done by linking multiple types of identifiers to form a consistent, unified view of the customer. An identity graph can also store profile data and easily connect new consumer identifiers to profiles.

Identity graphs can provide a 360° view of customers to understand the customer journey in chronological order or make recommendations to close a deal. An identity graph also helps you build customer data platform (CDP) solutions with an emphasis on privacy regulation compliance. Identity graphs are a key solution for many advertising technology and marketing technology companies, as well as brand and marketing organizations, advertising agencies, holding companies, and web analytics providers.

You can build your identity graph solution using Amazon Neptune, a fast, reliable, fully managed graph database service.

AWS re:Invent 2019: Reimagining advertising analytics & identity resolution at scale

Why do you need an identity graph?

Your customers use multiple devices, browsers, apps, and email addresses to interact with products and ads. An identity graph allows you to establish persistent identifiers that link all related customer devices and identifiers, enabling you to create unified customer profiles that can be used for targeting and personalization.
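As a rough illustration of how linked identifiers collapse into one persistent profile, the sketch below uses a union-find structure in plain Python. It is not how Amazon Neptune stores data; the class name `IdentityResolver`, the identifier strings, and the rule of keeping the lexicographically smallest identifier as the persistent ID are all assumptions made for this example.

```python
from collections import defaultdict


class IdentityResolver:
    """Union-find over identifiers; each connected set is one profile."""

    def __init__(self):
        self._parent = {}

    def _find(self, identifier):
        # Register unseen identifiers as their own root.
        self._parent.setdefault(identifier, identifier)
        root = identifier
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node straight at the root.
        while self._parent[identifier] != root:
            next_identifier = self._parent[identifier]
            self._parent[identifier] = root
            identifier = next_identifier
        return root

    def link(self, a, b):
        """Record that identifiers a and b belong to the same customer."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            # Keep the smaller root so the persistent ID is deterministic
            # (an arbitrary rule chosen for this sketch).
            keep, drop = sorted((root_a, root_b))
            self._parent[drop] = keep

    def profile_id(self, identifier):
        """Return the persistent ID of the profile that owns identifier."""
        return self._find(identifier)

    def profiles(self):
        """Group every known identifier under its persistent ID."""
        groups = defaultdict(set)
        for identifier in list(self._parent):
            groups[self._find(identifier)].add(identifier)
        return dict(groups)


resolver = IdentityResolver()
resolver.link("cookie:abc", "email:ann@example.com")
resolver.link("device:phone-1", "email:ann@example.com")
resolver.link("cookie:xyz", "device:tablet-9")
print(resolver.profiles())
```

Here the cookie, the email address, and the phone end up in one profile because each pair shares a link, while the second cookie and the tablet form a separate profile until some signal connects them.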

Build audiences and segmentation

Your customers view billions of web pages and apps and generate billions of cookies across devices, creating billions of data relationships that hold hidden insights into customer behavior. An identity graph allows you to tap into these data relationships to create audiences based on similar interests, preferences, and purchases.

Analyze customer journeys

Your customers generate many signals of intent such as search queries, product page views, ad clicks, purchases, and loyalty program enrollments. An identity graph allows you to analyze end-to-end customer behavior to gain a 360° view of your customers so you can better understand purchasing patterns and improve marketing attribution.
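Once identifiers are resolved to a profile, a chronological journey is a matter of gathering the signals from every linked identifier and ordering them by time. The sketch below shows that idea in plain Python; the `Event` type, its field names, and the sample data are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    identifier: str      # device, cookie, or email that produced the signal
    kind: str            # for example "search", "ad_click", "purchase"
    timestamp: datetime


def customer_journey(events, profile_identifiers):
    """Return the events of one profile, oldest first."""
    own = [e for e in events if e.identifier in profile_identifiers]
    return sorted(own, key=lambda e: e.timestamp)


events = [
    Event("device:phone-1", "purchase", datetime(2024, 3, 2, 18, 5)),
    Event("cookie:abc", "search", datetime(2024, 3, 1, 9, 30)),
    Event("cookie:xyz", "ad_click", datetime(2024, 3, 1, 12, 0)),
    Event("cookie:abc", "ad_click", datetime(2024, 3, 2, 17, 45)),
]
# Identifiers already resolved to a single customer (assumed input).
profile = {"cookie:abc", "device:phone-1"}

for event in customer_journey(events, profile):
    print(event.timestamp.isoformat(), event.identifier, event.kind)
```

The search and ad click on the browser cookie and the later purchase on the phone appear as one ordered sequence, which is the cross-device view that attribution analysis relies on.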

Leverage customer data with privacy compliance

Managing your customers' personally identifiable information (PII) and non-PII data separately is operationally complex, and some regulations require combining PII and non-PII sources to support requests for information (RFI) and delete requests. An identity graph allows you to store and manage PII and non-PII data together.
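To see why keeping PII and non-PII data under one profile simplifies these requests, consider the minimal in-memory sketch below. The `ProfileStore` class, its method names, and the sample attributes are hypothetical; a real deployment would also need access control, auditing, and the specific handling your regulations require.

```python
class ProfileStore:
    """Keeps PII and non-PII attributes together under one profile ID."""

    def __init__(self):
        self._records = {}  # profile_id -> list of (key, value, is_pii)

    def add(self, profile_id, key, value, is_pii):
        self._records.setdefault(profile_id, []).append((key, value, is_pii))

    def export(self, profile_id):
        """Request for information: return everything held for a profile."""
        return [
            {"key": key, "value": value, "pii": is_pii}
            for key, value, is_pii in self._records.get(profile_id, [])
        ]

    def delete(self, profile_id):
        """Delete request: remove all data and report how many items went."""
        return len(self._records.pop(profile_id, []))


store = ProfileStore()
store.add("profile-1", "email", "ann@example.com", is_pii=True)
store.add("profile-1", "favorite_category", "running shoes", is_pii=False)

print(store.export("profile-1"))   # both PII and non-PII items in one answer
print(store.delete("profile-1"))   # number of items removed
```

Because both kinds of data hang off the same profile, a single lookup answers an RFI and a single removal satisfies a delete request, with no reconciliation between separate stores.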

Customer identity graphs enable identity resolution, scoring, creation of audience segments, and more

Why use a graph database to build an identity graph?

Traditionally, relational databases were used to build most identity graph solutions. However, relational databases are not efficient at storing and querying the relationships between billions of interconnected entities in today’s consumer environment. Because mapping these relationships requires complex SQL queries, relational databases are not preferred for managing connected data for real-time cross-device advertising targeting, personalization, and other customer experience use cases.

Graph databases, which are purpose-built to store and navigate relationships, have emerged as a better-fit data store for identity graph solutions. Graph databases make it easy to model highly connected data, treat relationships as “first class citizens,” have flexible schemas, and provide higher performance for graph query traversals. Using a graph database for an identity graph enables you to link identifiers and update profiles more easily and to query at ultra-low latency, which means faster updates and more accurate, up-to-date profile data for ad targeting, personalization, analytics, and ad attribution.
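The difference shows up in how a multi-hop question is answered. In a relational model, each additional hop between identifiers typically means another self-join on a link table; in a graph model, it means following one more edge from the nodes already reached. The sketch below imitates that traversal with a breadth-first search over an in-memory adjacency map. The function name `linked_identifiers`, the edge list, and the hop limit are assumptions for illustration and do not represent a graph database's internals.

```python
from collections import deque


def linked_identifiers(edges, start, max_hops):
    """Return identifiers reachable from start within max_hops, with distance."""
    # Build an undirected adjacency map from the edge list.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    distance.pop(start)
    return distance


edges = [
    ("cookie:abc", "person:1"),
    ("person:1", "device:phone-1"),
    ("device:phone-1", "ip:10.0.0.7"),
    ("ip:10.0.0.7", "device:tv-3"),
]
print(linked_identifiers(edges, "cookie:abc", max_hops=2))
```

Raising `max_hops` widens the neighborhood without changing the shape of the query, whereas the equivalent SQL would grow by one join per hop.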

Sample graph dataset providing insights about customer identity and behavior  

Using Amazon Neptune to build an identity graph solution

Amazon Neptune is a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. Amazon Neptune is purpose-built for storing billions of relationships and querying the graph with millisecond latency. Amazon Neptune is compatible with open graph APIs and supports the popular graph models Property Graph and W3C's RDF, along with their respective query languages, Apache TinkerPop Gremlin and SPARQL. While graph databases usually require extensive hardware management, provisioning, and manual scaling, Amazon Neptune is a fully managed service, so you no longer have to worry about database management tasks. You can be up and running with an Amazon Neptune graph cluster in a matter of minutes, with a few clicks in the AWS Management Console or with the AWS CLI.

You can use Neptune to build identity graphs for any identity resolution solution, including device and social graphs, personalization and recommendations, and pattern detection. You can then expose your graph data to external systems, such as CRM or advertising systems, using Amazon API Gateway.

Sample microservice architecture using Amazon Neptune for an identity graph

Benefits of Amazon Neptune for identity graphs

Highly scalable and available

With Amazon Neptune, you can scale the compute and memory resources powering your production graph cluster up or down by creating new replica instances of the desired size, or by removing instances. Based on your database usage, your Amazon Neptune storage will automatically grow up to 64 TB, in 10 GB increments, with no impact to database performance. There is no need to provision storage in advance. Amazon Neptune is highly available, with read replicas, point-in-time recovery, continuous backup, and replication across Availability Zones (AZs).


Amazon Neptune reduces the cost of managing your graph database by eliminating the need for hardware and software investments and reducing operational burden. An identity graph built on Amazon Neptune enables you to build a cost-effective, scalable, secure, and highly available customer data platform with your own proprietary business rules, so you can respond to customer signals in real time and inform your advertising and marketing journey orchestration workflows.

Secure, privacy-compliant

Amazon Neptune is highly secure, with support for encryption in transit using HTTPS-encrypted client connections and encryption at rest using customer managed keys in AWS Key Management Service (AWS KMS). Amazon Neptune is in scope for PCI DSS, ISO compliance, and SOC 1, 2, and 3 compliance. You can build identity resolution solutions without jeopardizing regulatory compliance. For more information, read the Neptune user guide.

Customer case study: Zeta Global


“By leveraging [Amazon] Neptune and other AWS services, we are able to achieve a cost-efficient data platform, at scale, in a very short period of time.” 

Sasikala Singamaneni, software engineering manager at Zeta Global.

Getting started

AWS Graph Notebook: Introduction to Identity Graphs

The easiest way to get started with identity graphs on Amazon Neptune is to use the AWS Graph Notebook. With this sample application notebook, you can learn how to create an identity graph application that can resolve unique entities across multiple devices, identify users for targeted promotions, and more. To follow along with the sample application and run interactive queries, you will need to create an Amazon Neptune cluster and a Neptune notebook. This sample application and other examples are pre-loaded into every Neptune notebook.

AWS Solution: Customer Identity Graph with Amazon Neptune

We have created a reference application that demonstrates how you can build a cost-efficient, scalable, secure, and highly available identity graph with your own proprietary business rules. It includes a sample identity graph data model, ingestion scripts to load the data, an AWS CloudFormation template, and Amazon SageMaker notebooks to support common use cases such as cross-device graphs, audience segmentation, and look-alike modeling.

Get started with Amazon Neptune, a fully managed graph database

Amazon Neptune is a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. The core of Amazon Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with millisecond latency. Amazon Neptune supports the popular graph models Property Graph and W3C's RDF, along with their respective query languages, Apache TinkerPop Gremlin and SPARQL, allowing you to easily build queries that efficiently navigate highly connected datasets. Neptune powers graph use cases such as recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security.