AWS for Industries

Solve customer identity fragmentation at scale with AWS Entity Resolution

Enterprises often struggle with fragmented and inconsistent consumer data spread across multiple applications. This can create challenges in achieving trusted, unified views of your customers that power important functions, such as compliance reporting, analytics, and consumer engagement.

AWS Entity Resolution, a service from Amazon Web Services (AWS), helps companies quickly match, link, and enhance related customer records so organizations can move from scattered records to unified customer views. Companies can use flexible, configurable rules or machine learning matching techniques to match and link records based on their business needs.

For example, in the fitness and wellness industry, organizations collect customer data across a wide range of touchpoints, from digital engagement channels and mobile apps to on-site interactions, billing systems, and loyalty programs. Each system captures a fragment of the overall customer journey, often with variations in names, contact details, or other information.

The same individual might appear differently across platforms, with a shortened name in one system, an outdated phone number in another, and a slightly different email address elsewhere. Without a unified view of the individual, these inconsistencies can lead to duplicate records and disjointed customer experiences.

In practice, many organizations try to handle fragmented customer information through master data management (MDM) platforms or by building custom in-house solutions. However, these systems often fall short when faced with today’s scale and complexity. They have limited flexibility in updating rules, high operational costs, and poor adaptability to evolving data sources.

The fitness and wellness industry needs an approach that can keep pace with growing data volumes and frequent data updates. The approach we describe is a scalable, cloud-native solution that can handle millions of records while maintaining auditability and compliance and reducing costs.

By using AWS Entity Resolution, companies can reconcile fragmented records into unified views of their customers. This helps ensure that every interaction, whether it’s an introductory engagement, a subscription renewal, or ongoing participation in wellness programs, is connected to a single, trusted profile. The result is a 360-degree view that supports a smoother customer experience with highly personalized interactions. AWS Partner PricewaterhouseCoopers (PwC) helped design and implement a similar solution for a customer using AWS Entity Resolution along with support from AWS. That solution achieved more than an 80% reduction in operational costs and over a 95% improvement in processing time compared to the previous system. It transformed what was once a complex, resource-intensive workflow into a seamless, cloud-native operation.

We will discuss how organizations in the fitness and wellness industry can build a robust and scalable solution using AWS Entity Resolution. To illustrate this, let’s consider the case of a fitness and wellness organization that relied on a fragmented identity system. This led to the following business challenges:

  • Inconsistent customer experience: Without a proper unified view of their customers, the same person was being targeted multiple times or missed altogether. This led to ineffective personalization, fragmented engagement, and ultimately weakened consumer trust and loyalty.
  • Unreliable analytics and decision-making: Duplicate and incomplete records inflated consumer counts, distorted engagement metrics, and made KPIs unreliable. Business leaders lost confidence in analytics, limiting their ability to make informed decisions.
  • High operational costs: To keep data consistent, they performed full daily loads of all records. This brute-force approach reprocessed duplicates and unchanged data, driving up processing time as well as storage, compute, and licensing costs.
  • Frequent information changes: Frequent updates to details like email, phone, or address overwrote older values. This erased valuable history, altered the source of truth, and forced continuous re-evaluation of consumer identities.
  • Fragmented consumer data across multiple sources: Consumer records were scattered across multiple systems, each with different data quality issues. Business teams didn’t have the flexibility to filter data (for example, by location), reprocess historical records either fully or incrementally, or maintain complete traceability of ingestion runs and match decisions.

To address these challenges, the company sought a 360-degree customer view solution that can:

  • Ingest data from various sources with different schemas and data quality.
  • Deduplicate and normalize the data prior to sending it to the matching engine.
  • Create a golden record that downstream systems can trust for analytics and engagement.
  • Support both full and incremental data processing as business rules evolve.
  • Provide transparency through auditability and traceability, with clear visibility into data ingestion runs, resolution outcomes, and identity merges or splits over time.
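
The normalization and deduplication requirement above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual AWS Glue job: the field names (`first_name`, `last_name`, `email`, `phone`), the default country code, and the specific cleaning rules are all assumptions chosen for the example.

```python
import re
import unicodedata


def normalize_email(email):
    """Trim whitespace and lowercase an email address; return None if blank."""
    if not email:
        return None
    return email.strip().lower() or None


def normalize_phone(phone, default_country_code="1"):
    """Keep digits only; prefix an assumed country code to 10-digit numbers."""
    if not phone:
        return None
    digits = re.sub(r"\D", "", phone)
    if not digits:
        return None
    if len(digits) == 10:
        digits = default_country_code + digits
    return "+" + digits


def normalize_name(name):
    """Strip accents, collapse repeated whitespace, and lowercase a name."""
    if not name:
        return None
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.split()).lower() or None


def normalize_record(record):
    """Return a copy of the record with the matching fields normalized."""
    return {
        **record,
        "first_name": normalize_name(record.get("first_name")),
        "last_name": normalize_name(record.get("last_name")),
        "email": normalize_email(record.get("email")),
        "phone": normalize_phone(record.get("phone")),
    }


def drop_exact_duplicates(records):
    """Normalize records and keep the first of each identical field tuple."""
    seen = set()
    unique = []
    for record in map(normalize_record, records):
        key = (
            record["first_name"],
            record["last_name"],
            record["email"],
            record["phone"],
        )
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Normalizing before matching means that trivially different spellings of the same value (case, spacing, punctuation in phone numbers) collapse to one form, so the matching rules only need to handle genuine differences.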

For our solution, we built a modular pipeline capable of resolving records at scale across diverse data sources. It ingests customer data from multiple systems, each with incomplete or inconsistent details, and applies AWS Entity Resolution rule-based matching. This helps ensure that duplicates are removed and records are unified into a single, trusted view of the customer.
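
To make the "single, trusted view" concrete, here is a minimal sketch of how matched records might be consolidated into one golden record per identity. It assumes each matched record carries a group identifier (called `match_id` here) and an ISO 8601 `updated_at` timestamp; both field names and the "most recent non-empty value wins" rule are illustrative assumptions, not the exact survivorship logic used in the solution.

```python
from datetime import datetime


def build_golden_record(cluster_records, attributes, updated_field="updated_at"):
    """For each attribute, take the most recently updated non-empty value."""
    # Sort newest first so the first non-empty value found is the most recent.
    newest_first = sorted(
        cluster_records,
        key=lambda r: datetime.fromisoformat(r[updated_field]),
        reverse=True,
    )
    golden = {}
    for attribute in attributes:
        golden[attribute] = next(
            (
                r[attribute]
                for r in newest_first
                if r.get(attribute) not in (None, "")
            ),
            None,  # No record in the cluster has a value for this attribute.
        )
    return golden


def build_golden_records(matched_records, attributes, group_field="match_id"):
    """Group matched records by identity and build one golden record each."""
    clusters = {}
    for record in matched_records:
        clusters.setdefault(record[group_field], []).append(record)
    return {
        match_id: build_golden_record(records, attributes)
        for match_id, records in clusters.items()
    }
```

Falling back to older records when the newest value is empty keeps the profile as complete as possible, while still preferring recent data when it exists.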

The solution

Figure 1 – High-level solution diagram

Following is a description of the high-level steps of our solution workflow:

  1. Upstream data capture: The solution collects consumer data from physical locations (such as gyms and studios), digital channels (websites, mobile apps, kiosks), and partner apps. It records data across core systems including membership, leads, trial and guest passes, billing, and retail platforms. These systems stream or batch their data into an Amazon Simple Storage Service (Amazon S3) landing zone, forming the raw input layer for subsequent processing.
  2. Orchestration trigger: Amazon EventBridge triggers AWS Step Functions on a defined schedule or event, initiating the AWS Entity Resolution pipeline.
  3. Pre-processing and standardization: Step Functions invokes an AWS Glue ETL job that ingests and pre-processes incremental source data, applying normalization and cleaning rules to prepare it for matching.
  4. AWS Entity Resolution execution: An AWS Lambda function starts the rule-based matching workflow in AWS Entity Resolution, which reads incremental data from the Amazon S3 landing zone and processes it.
  5. Workflow monitoring: The Lambda function tracks the status of the AWS Entity Resolution job until completion, supporting reliability and fault tolerance.
  6. Match results output: Once complete, AWS Entity Resolution writes the matched and grouped results to an output S3 bucket.
  7. Golden record derivation: An AWS Glue post-processing ETL job consolidates records into a single golden record for each identity cluster, prioritizing the most complete and recent attributes.
  8. Analytical data store: A curated subset of unified golden records is written to Amazon Redshift for fast analytical queries and advanced workloads. Additional AWS Glue ETL jobs can enrich this data with aggregations, segmentation, or KPI calculations to support business-specific analytics.
  9. Downstream activation: The unified golden records are provided to downstream systems, such as analytics and business intelligence tools for accurate KPIs, customer relationship management (CRM) and marketing platforms for personalized campaigns, and customer support systems for a consistent service experience.
  10. Traceability, audit, and security: The entire pipeline is instrumented for governance and compliance. Amazon CloudWatch enables observability, while AWS Glue Data Catalog and Amazon Athena provide lineage and traceability of match decisions. AWS Identity and Access Management (IAM) and AWS Secrets Manager enforce secure access to data and credentials. Together, these controls support overall auditability and safeguard sensitive personally identifiable information (PII) across the pipeline.
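
Steps 4 and 5 above (starting the matching workflow and tracking it to completion) can be sketched as follows. This assumes the boto3 `entityresolution` client and its `start_matching_job` and `get_matching_job` operations. The workflow name, polling interval, and polling limit are hypothetical, and the client is passed in as a parameter so the logic can be exercised without AWS credentials.

```python
import time


def run_matching_job(client, workflow_name, poll_seconds=30, max_polls=120,
                     sleep=time.sleep):
    """Start a matching job and poll until it succeeds, fails, or times out."""
    job_id = client.start_matching_job(workflowName=workflow_name)["jobId"]

    for _ in range(max_polls):
        job = client.get_matching_job(workflowName=workflow_name, jobId=job_id)
        status = job["status"]
        if status == "SUCCEEDED":
            return job_id
        if status == "FAILED":
            # errorDetails may be absent, so fall back to a generic message.
            message = job.get("errorDetails", {}).get(
                "errorMessage", "unknown error"
            )
            raise RuntimeError(f"Matching job {job_id} failed: {message}")
        # Any other status (for example QUEUED or RUNNING): wait and retry.
        sleep(poll_seconds)

    raise TimeoutError(
        f"Matching job {job_id} did not finish after {max_polls} polls"
    )


# Example usage inside a Lambda handler (names are illustrative only):
#
#   import boto3
#
#   def handler(event, context):
#       client = boto3.client("entityresolution")
#       job_id = run_matching_job(client, "customer-matching-workflow")
#       return {"jobId": job_id}
```

Because a single Lambda invocation is limited to 15 minutes, a long-running job may be better served by having Step Functions own the wait loop and call a short status-check function on each iteration; the sketch keeps the loop in one place only for readability.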

By building the described solution, a customer can:

  • Improve consumer engagement: A trusted, unified customer identity drives faster time-to-insight, better customer engagement strategies, and fewer operational inefficiencies.
  • Create a unified consumer view: By designing a flexible data integration framework, the solution connects fragmented data across departments and platforms, enabling a single, trustworthy consumer profile as a key enabler for personalization and compliance.
  • Improve data quality and trust: Inconsistencies in customer data, from duplicate records to incomplete information, are resolved through intelligent normalization and rule-based processing, significantly increasing confidence in downstream analytics and reporting.
  • Build a scalable and future-proof architecture: The pipeline is built to scale with data volume and velocity. It maintains consistent performance even as customer records grow into the millions, without needing major infrastructure overhauls.
  • Build in governance and transparency: Every decision, from data match to golden record creation, can be tracked with detailed audit trails. This helps meet regulatory requirements and empowers stakeholders to assess and evolve the logic based on business needs.

Conclusion

AWS Entity Resolution can help retail organizations improve their customer understanding by seamlessly connecting disparate touchpoints. It merges online browsing, in-store purchases, loyalty programs, and service interactions into unified profiles, enabling personalized experiences. This cohesive view helps eliminate disjointed customer journeys, providing retailers with a way to deliver consistent, contextual messaging that can accelerate conversion while reducing marketing waste.

Similarly, in health and fitness, AWS Entity Resolution unifies fragmented data across digital platforms, in-person visits, wearables, and membership programs. This holistic view can enable hyper-personalized workout recommendations, nutrition guidance, and motivational content aligned with individual goals.

Precise targeting based on usage patterns and preferences, rather than isolated data, boosts membership retention and program participation. By solving identity fragmentation, AWS Entity Resolution helps businesses create seamless cross-channel experiences that help strengthen brand loyalty, while maintaining privacy standards.

To get started, visit the AWS Entity Resolution console or contact an AWS Representative to learn how we can help accelerate your business.

Punit Shah

Punit is a Senior Solutions Architect at Amazon Web Services, where he is focused on helping customers build their data and analytics strategy on the cloud. In his current role, he assists customers in building a strong data foundation layer using AWS services like AWS Entity Resolution and Amazon Connect. He has 15+ years of industry experience building large data lakes.

Adam Hood

Adam Hood is a Partner and AWS Data and AI Leader at PwC US. As a strategic and results-oriented technology leader, Adam specializes in driving enterprise-wide transformation and unlocking business value through the strategic application of digital systems, data, and GenAI/AI/ML. He has guided organizations through complex digital, finance, and ERP modernizations.

Rajat Mathur

Rajat is a Principal Solutions Architect at Amazon Web Services. Rajat is a passionate technologist who enjoys building innovative solutions for AWS customers. His core areas of focus are generative AI, IoT, networking, and serverless computing. In his spare time, Rajat enjoys long drives, traveling, and spending time with family.

Srikrishna Srinivasan

Srikrishna Srinivasan is a Data Engineer with 5+ years of experience in building end-to-end data pipelines and intelligent automation solutions across cloud and on-premises environments. He has worked across AWS, Azure, and Snowflake, leveraging tools and services to design scalable data architectures. He also applies conversational AI to enhance customer engagement and workflow automation.

Yash Munsadwala

Yash Munsadwala is a Manager within PwC’s Cloud, Engineering, Data & AI (CEDA) practice. He specializes in architecting and delivering Data & AI, Web Architecture, and Cloud Modernization initiatives that help enterprises accelerate digital transformation. Yash leverages his expertise in software engineering, data architecture, and AWS-native services to build scalable, secure, and resilient solutions across industries.