AWS for Industries

Building a modern, event-driven application for insurance claims processing – Part 2

In Part 1 of this series, you learned how insurance claims processing makes a good industry use case for event-driven architectures. In Part 2, you dive deeper into the application architecture and learn how each component, or domain, of the insurance claims processing system uses asynchronous events to coordinate communication. You also learn why serverless services are well-suited for event-driven architectures, and how services such as AWS Lambda, Amazon EventBridge, AWS Step Functions, Amazon Simple Queue Service (Amazon SQS), Amazon DynamoDB, Amazon API Gateway, and Amazon Simple Storage Service (Amazon S3) combine into a scalable, fault-tolerant, and extensible insurance claims processing solution. You can explore the sample application code repository and deploy the application in your AWS account.

Architecture overview

AWS Cloud Architecture Overview

This architecture supports the four main functionalities of an insurance claims processing application:

  1. On-boarding a new customer.
  2. Uploading documents (such as a driver’s license and car images).
  3. Filing a new claim.
  4. Uploading images of the damaged car.

The blue-filled boxes represent event producers and consumers. These services listen for specific event types. For example, the customer service listens to Customer.Submitted events. When these services receive events from the event broker, they process events with the corresponding business logic. Once complete, they emit a new event back to the event broker – for example, Customer.Accepted – to which other services may have subscribed.
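The consume-then-emit pattern described above can be sketched with a small in-memory stand-in for the event broker. This is only an illustration of the flow, not how EventBridge works internally; the class, the handler, and the sample payload are hypothetical.

```typescript
// Minimal in-memory stand-in for an event broker (illustration only, not EventBridge).
interface DomainEvent {
  detailType: string;              // for example "Customer.Submitted"
  detail: Record<string, unknown>; // event payload
}

// A consumer processes one event and returns any follow-up events it wants to emit.
type Handler = (event: DomainEvent) => DomainEvent[];

class InMemoryBroker {
  private handlers: Record<string, Handler[]> = {};
  readonly log: string[] = []; // every event type published, in order

  subscribe(detailType: string, handler: Handler): void {
    (this.handlers[detailType] = this.handlers[detailType] || []).push(handler);
  }

  publish(event: DomainEvent): void {
    this.log.push(event.detailType);
    const subscribers = this.handlers[event.detailType] || [];
    for (const handler of subscribers) {
      // Each consumer may emit new events back to the broker.
      for (const next of handler(event)) {
        this.publish(next);
      }
    }
  }
}

// Hypothetical customer service: consumes Customer.Submitted, emits Customer.Accepted.
function customerService(event: DomainEvent): DomainEvent[] {
  return [{ detailType: "Customer.Accepted", detail: event.detail }];
}

const broker = new InMemoryBroker();
broker.subscribe("Customer.Submitted", customerService);
broker.publish({ detailType: "Customer.Submitted", detail: { email: "jane@example.com" } });
```

After the last line runs, `broker.log` holds the submitted event followed by the accepted event. The producer of Customer.Submitted never references the customer service directly.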

Event-driven architectures and serverless services

With the application’s core functionality, domains, and architecture determined, the next question is which AWS services to use to build the application. Although you can build event-driven architectures in various ways, serverless services have a natural affinity with event-driven systems. Serverless services run only when invoked by events. They scale up automatically as the event volume increases, and scale down to zero when an application isn’t in use. Serverless services also have built-in integrations for receiving events from other AWS services, and EventBridge lets you emit your own custom events or receive events from SaaS applications.

To learn more about designing event-driven architectures, watch the re:Invent 2022 keynote by Amazon’s Chief Technology Officer, Dr. Werner Vogels, and explore Serverless Land.

Application walkthrough

This section walks through each component of the application, including APIs, domains, and the flow of asynchronous events.

Setup

To run the sample application, follow the setup steps mentioned in the sample application’s README file.

APIs

Four APIs use the same underlying architecture to pass data between the application’s frontend and backend. For the read API endpoints, which retrieve customer and claim data, API Gateway invokes a Lambda function that queries DynamoDB. The write API endpoints let users submit information during new customer on-boarding or when filing a claim. For these endpoints, API Gateway invokes a Lambda function that puts an event on the EventBridge event bus, where event consumers can subscribe to it.

API Overview
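As a rough sketch of the write path, the function below builds the kind of entry a Lambda function could pass to the EventBridge PutEvents API. The field names (EventBusName, Source, DetailType, Detail) follow the PutEvents request format; the bus name, the source string, and the request body are assumptions made for illustration and may differ from the sample application.

```typescript
// Shape of one entry in an EventBridge PutEvents request.
interface PutEventsEntry {
  EventBusName: string;
  Source: string;
  DetailType: string;
  Detail: string; // JSON-encoded payload
}

// Hypothetical names; the sample application may use different ones.
const BUS_NAME = "claims-processing-bus";
const SIGNUP_SOURCE = "signup.service";

// Turns the body of a signup request into a Customer.Submitted event entry.
function toCustomerSubmittedEntry(requestBody: string): PutEventsEntry {
  const payload: unknown = JSON.parse(requestBody); // throws on malformed JSON
  return {
    EventBusName: BUS_NAME,
    Source: SIGNUP_SOURCE,
    DetailType: "Customer.Submitted",
    Detail: JSON.stringify(payload),
  };
}

const entry = toCustomerSubmittedEntry('{"email":"jane@example.com","make":"Honda"}');
```

A Lambda function would then send `{ Entries: [entry] }` with the AWS SDK. The SDK call is left out so that the sketch has no dependencies.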

Core domains

The domains used in this architecture include customer, document processing, claims, fraud detection, and notifications.

Customer

When a customer onboards, they must provide information about their identity (email, SSN, and more) as well as the make and model of their car. The KYC (Know Your Customer) process uses this information to approve and onboard a customer or deny the on-boarding.

Know your customer signup process for onboarding

  1. The Signup API emits a Customer.Submitted event to the EventBridge custom event bus.
  2. We set up an EventBridge rule to route event payloads of the type Customer.Submitted to the Customer service domain.
  3. Upon successful rule evaluation, EventBridge asynchronously invokes a Step Functions state machine in the Customer service domain.
  4. The state machine orchestrates the signup process and emits Customer.Accepted or Customer.Rejected events back to the event broker based on the success or failure scenario.

The customer signup process uses a Step Functions Express Workflow, as shown in the following. The workflow parses and validates the incoming payload data, and saves the customer and policy information in DynamoDB. Then, it generates pre-signed URLs for buckets to store the driver’s license and car images as the next step for the customer. For both successful and failed validations, the workflow emits events (Customer.Accepted or Customer.Rejected) back to the custom event bus.

Step Functions Express Workflow
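The validation step and the resulting event choice can be sketched as a plain function. In the sample application this logic runs inside the Step Functions workflow; the field names and validation rules below are assumptions for illustration only.

```typescript
// Hypothetical signup payload; the real schema lives in the sample application.
interface SignupRequest {
  email?: string;
  ssn?: string;
  carMake?: string;
  carModel?: string;
}

interface SignupOutcome {
  detailType: "Customer.Accepted" | "Customer.Rejected";
  errors: string[]; // names of the fields that failed validation
}

function validateSignup(req: SignupRequest): SignupOutcome {
  const errors: string[] = [];
  if (!req.email || req.email.indexOf("@") === -1) errors.push("email");
  // Assumed SSN format: NNN-NN-NNNN.
  if (!req.ssn || !/^\d{3}-\d{2}-\d{4}$/.test(req.ssn)) errors.push("ssn");
  if (!req.carMake) errors.push("carMake");
  if (!req.carModel) errors.push("carModel");
  return {
    detailType: errors.length === 0 ? "Customer.Accepted" : "Customer.Rejected",
    errors,
  };
}
```

Both outcomes produce an event, so downstream consumers such as the notifications service can react to a rejection as well as to an acceptance.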

Document processing

Once a customer successfully signs up, their next step is to upload a valid driver’s license image and the image of their car.

Document Processing Workflow

The sample application prompts the customer to upload the driver’s license image to extract additional identity information from the image. We use this information for validation and document fraud determination.

  1. The customer uses the pre-signed URLs provided during the signup process to upload those two images.
  2. Once the images are uploaded to the policy documents S3 bucket, Amazon S3 sends an Object Created event to the default EventBridge event bus.
  3. An EventBridge rule matches the Object Created event type and invokes the document processing Step Functions workflow that is used to classify and analyze the image in the S3 bucket.
  4. Once the document processing is complete, the Step Functions workflow emits a Document.Processed event to the custom event bus with data from the analysis added to the event payload.
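The steps above start from the Object Created event that Amazon S3 sends to EventBridge. The sketch below shows how a consumer could pull the bucket name and object key out of that event. The interface covers only a subset of the event fields, and the bucket name and key are made up for illustration.

```typescript
// Subset of the "Object Created" event that Amazon S3 sends to EventBridge.
interface S3ObjectCreatedEvent {
  source: string;
  "detail-type": string;
  detail: { bucket: { name: string }; object: { key: string } };
}

interface ObjectRef {
  bucket: string;
  key: string;
}

// Returns the uploaded object's location, or undefined for any other event.
function extractObjectRef(event: S3ObjectCreatedEvent): ObjectRef | undefined {
  if (event.source !== "aws.s3" || event["detail-type"] !== "Object Created") {
    return undefined;
  }
  return { bucket: event.detail.bucket.name, key: event.detail.object.key };
}

const sampleUpload: S3ObjectCreatedEvent = {
  source: "aws.s3",
  "detail-type": "Object Created",
  detail: {
    bucket: { name: "policy-documents-bucket" },        // hypothetical bucket name
    object: { key: "customers/123/documents/dl.jpg" },  // hypothetical object key
  },
};
const uploaded = extractObjectRef(sampleUpload);
```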

The document processing Step Functions workflow orchestrates the classification and analysis of the uploaded documents:

Step Functions

  1. The Step Functions Express Workflow extracts the document ID or the object key from the payload.
  2. Then the workflow classifies the documents as either a driver’s license or a car image by using Amazon Rekognition’s DetectLabels API and runs the workflow steps based on the detected label.
  3. For the driver’s license image, Amazon Textract’s AnalyzeID API is used to extract structured data from the image. A Lambda function transforms the extracted information and then emits a Document.Processed event with document type DL back to the custom event bus.
  4. For car images, Amazon Rekognition Custom Labels lets you detect the color of the car (for simplification, this sample application uses mock APIs). Then the workflow emits a Document.Processed event with the document type as Car.

If the application grows to include more document types or further analysis, then the Step Functions workflow is easy to extend with new steps.
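The classification branch can be sketched as a function over detected labels. The label shape (Name, Confidence) mirrors the DetectLabels response, but the specific label names and the confidence threshold here are assumptions rather than values taken from the sample application.

```typescript
// Subset of a label as returned by Amazon Rekognition DetectLabels.
interface DetectedLabel {
  Name: string;
  Confidence: number; // 0 to 100
}

type DocumentType = "DL" | "Car" | "Unknown";

// Hypothetical label names and threshold; tune these against real images.
const LICENSE_LABELS = ["Driving License", "Id Cards"];
const CAR_LABELS = ["Car", "Vehicle"];
const MIN_CONFIDENCE = 80;

function classifyDocument(labels: DetectedLabel[]): DocumentType {
  // Keep only the names of labels detected with enough confidence.
  const confident = labels
    .filter((label) => label.Confidence >= MIN_CONFIDENCE)
    .map((label) => label.Name);
  const hasAny = (wanted: string[]): boolean =>
    wanted.some((name) => confident.indexOf(name) !== -1);
  if (hasAny(LICENSE_LABELS)) return "DL";
  if (hasAny(CAR_LABELS)) return "Car";
  return "Unknown";
}
```

Supporting a new document type in this sketch means adding one more label list and one more branch, which mirrors how the workflow itself would gain a new step.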

Fraud detection

The fraud detection service subscribes to the Document.Processed event to check for document-specific fraud.

In the sample application, the fraud service accesses the Customer and Claims tables to compare data from the processed document with the data that the customer provided during on-boarding. If there’s a mismatch (for example, the driver’s license number in the uploaded image doesn’t match the number submitted by the customer), then the fraud service emits a Fraud.Detected event, with the fraud type included in the event payload, back to the custom event bus. When no document fraud is detected, the service emits a Fraud.Not.Detected event.
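A minimal sketch of this comparison follows. The compared fields, the normalization, and the fraud type value are assumptions for illustration; the sample application's Lambda function may check different attributes.

```typescript
// Hypothetical identity fields compared between the customer record and the document.
interface IdentityFields {
  firstName: string;
  lastName: string;
  driversLicenseNumber: string;
}

interface FraudEvent {
  detailType: "Fraud.Detected" | "Fraud.Not.Detected";
  detail: { fraudType?: string; mismatchedFields?: string[] };
}

// Ignore case and surrounding whitespace when comparing values.
const normalize = (value: string): string => value.trim().toUpperCase();

function checkDocumentFraud(onFile: IdentityFields, extracted: IdentityFields): FraudEvent {
  const fields: (keyof IdentityFields)[] = ["firstName", "lastName", "driversLicenseNumber"];
  const mismatchedFields = fields.filter(
    (field) => normalize(onFile[field]) !== normalize(extracted[field])
  );
  if (mismatchedFields.length > 0) {
    // "DOCUMENT" is a made-up fraud type label.
    return { detailType: "Fraud.Detected", detail: { fraudType: "DOCUMENT", mismatchedFields } };
  }
  return { detailType: "Fraud.Not.Detected", detail: {} };
}
```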

The fraud detection service in the sample application is currently a single Lambda function, but much more complex fraud detection is possible. This service could be another Step Functions workflow, a service running on Amazon ECS on AWS Fargate or Amazon Elastic Kubernetes Service (Amazon EKS), an inference endpoint for an Amazon SageMaker model, or a third-party vendor implementation.

Claims

Now the customer can file an insurance claim. The first step in the claims process is First Notice of Loss (FNOL).

Claims workflow

  1. The customer submits the details of the incident (date of incident, number of parties involved, location of incident, etc.) through the frontend application.
  2. This claim submission invokes the FNOL API, which emits a Claim.Requested event.
  3. To allow the claims service to handle sudden spikes in traffic (for example, home insurance claims filed during extreme weather events or natural catastrophes such as floods, hurricanes, earthquakes, or tornadoes), requests from the FNOL API are sent to an Amazon SQS queue. The queue acts as an event store that buffers events, so that bursts of traffic don’t overwhelm downstream services.
  4. A claims processing Lambda function receives messages from the queue and begins work on the claim request. The function verifies the FNOL information from the payload and, upon successful validation, persists the information in the Claims DynamoDB table. Then, it creates an Amazon S3 pre-signed URL so that the customer can upload images of the damaged car.
  5. The Lambda function emits a Claim.Accepted or Claim.Rejected event back to the custom event bus based on success or failure.
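The queue-driven steps above can be sketched as a batch handler. The record shape is a subset of the SQS event that Lambda passes to a function, and the sketch assumes the message body is the full EventBridge event JSON (the default when a rule targets a queue without an input transformer). The required FNOL fields are hypothetical, and persistence to DynamoDB and pre-signed URL creation are left out.

```typescript
// Subset of the SQS event that Lambda passes to a function.
interface SqsRecord {
  messageId: string;
  body: string;
}
interface SqsEvent {
  Records: SqsRecord[];
}

interface ClaimOutcome {
  messageId: string;
  detailType: "Claim.Accepted" | "Claim.Rejected";
  reason?: string;
}

// Hypothetical required fields for a First Notice of Loss.
const REQUIRED_FNOL_FIELDS = ["policyId", "incidentDate", "location"];

function evaluateClaim(record: SqsRecord): ClaimOutcome {
  let detail: Record<string, unknown>;
  try {
    // Assumption: the body is the EventBridge event, with the claim under "detail".
    const envelope = JSON.parse(record.body);
    detail = envelope.detail || {};
  } catch {
    return { messageId: record.messageId, detailType: "Claim.Rejected", reason: "malformed message" };
  }
  const missing = REQUIRED_FNOL_FIELDS.filter(
    (field) => detail[field] === undefined || detail[field] === ""
  );
  if (missing.length > 0) {
    return { messageId: record.messageId, detailType: "Claim.Rejected", reason: "missing: " + missing.join(", ") };
  }
  return { messageId: record.messageId, detailType: "Claim.Accepted" };
}

// Evaluates every message in the batch independently.
function handleClaimBatch(event: SqsEvent): ClaimOutcome[] {
  return event.Records.map((record) => evaluateClaim(record));
}
```

Because each message is evaluated on its own, one malformed claim in a batch is rejected without blocking the valid claims around it.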

Notifications service

The notifications service sends notifications from the claims processing application to the customer. A notification Lambda function subscribes to all of the event types that help the customer stay informed about their account and claim statuses. The function uses AWS IoT Core to push each event to the user over a WebSocket connection. Besides WebSockets, the notifications service can also use services like Amazon Pinpoint to send email, SMS, or push notifications to the customer when an event occurs.
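One way to picture the notification function is as a mapping from event types to customer-facing messages. The message texts and the topic naming scheme below are invented for illustration, and the actual publish call to AWS IoT Core is omitted.

```typescript
// Hypothetical customer-facing text for the event types the service subscribes to.
const NOTIFICATION_TEXT: Record<string, string> = {
  "Customer.Accepted": "Your account was created.",
  "Customer.Rejected": "We could not create your account.",
  "Claim.Accepted": "Your claim was received.",
  "Claim.Rejected": "Your claim could not be processed.",
  "Fraud.Detected": "We found a problem with an uploaded document.",
};

interface CustomerNotification {
  topic: string;   // hypothetical per-customer topic name
  message: string;
}

// Returns the notification to publish, or undefined for events the customer doesn't need to see.
function toNotification(customerId: string, detailType: string): CustomerNotification | undefined {
  const message = NOTIFICATION_TEXT[detailType];
  if (typeof message !== "string") {
    return undefined;
  }
  return { topic: "customers/" + customerId + "/notifications", message };
}
```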

Notifications workflow

Implementation details

Event flow

We choreograph events between business domains. With EventBridge, this choreography is implemented through EventBridge rules: a rule matches incoming events and sends them to targets for processing. The sample application uses the AWS Cloud Development Kit (AWS CDK) for infrastructure as code (IaC).

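// Assumes AWS CDK v2 module paths.
import { Rule } from "aws-cdk-lib/aws-events";
import { LambdaFunction, SfnStateMachine } from "aws-cdk-lib/aws-events-targets";

// bus, notificationLambdaFunction, and createCustomerStepFunction are defined elsewhere in the stack.
new Rule(this, "CustomerEventsRule", {
  eventBus: bus, // the custom event bus
  ruleName: "CustomerEventsRule",
  // Match events whose detail-type is Customer.Submitted.
  eventPattern: {
    detailType: ["Customer.Submitted"],
  },
  // Every matching event is delivered to both targets.
  targets: [
    new LambdaFunction(notificationLambdaFunction),
    new SfnStateMachine(createCustomerStepFunction),
  ],
});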

You can read the above rule as follows: any event delivered to the bus whose detailType attribute is Customer.Submitted invokes two targets, the notification Lambda function and the Step Functions state machine that creates the customer. This rule definition demonstrates how a producer stays loosely coupled from its consumers: the producer only emits the event, and the rule decides who receives it. The rule definition can also include more conditions.
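To make the matching behavior concrete, here is a deliberately simplified matcher. It handles only exact-value lists on source and detail-type, which is a small subset of what EventBridge event patterns support, and it is not how EventBridge is implemented. Note that the event field is named detail-type, while the CDK property is detailType. The target names are reused from the rule above as plain strings.

```typescript
// Subset of an EventBridge event envelope.
interface BusEvent {
  source: string;
  "detail-type": string;
  detail: Record<string, unknown>;
}

// Simplified pattern: each listed field must equal one of the allowed values.
interface SimplePattern {
  source?: string[];
  "detail-type"?: string[];
}

interface SimpleRule {
  pattern: SimplePattern;
  targets: string[]; // target names only, for illustration
}

function matchesPattern(pattern: SimplePattern, event: BusEvent): boolean {
  const allowedSources = pattern.source;
  const allowedTypes = pattern["detail-type"];
  const sourceOk = !allowedSources || allowedSources.indexOf(event.source) !== -1;
  const typeOk = !allowedTypes || allowedTypes.indexOf(event["detail-type"]) !== -1;
  return sourceOk && typeOk;
}

// Collects the targets of every rule whose pattern matches the event.
function routeEvent(rules: SimpleRule[], event: BusEvent): string[] {
  const targets: string[] = [];
  for (const rule of rules) {
    if (matchesPattern(rule.pattern, event)) {
      for (const target of rule.targets) targets.push(target);
    }
  }
  return targets;
}

const customerEventsRule: SimpleRule = {
  pattern: { "detail-type": ["Customer.Submitted"] },
  targets: ["notificationLambdaFunction", "createCustomerStepFunction"],
};
```

Adding a consumer in this model means adding a target or a rule. The code that produces Customer.Submitted stays unchanged.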

The following videos show the interactions that happen behind the scenes in the sample application, giving better visibility into the event choreography.

Customer signup

Customer Onboarding

Signup image upload (Driver’s License)

Customer signup object upload

Claim requested

Claim Requested

Damaged car image upload

Claims image upload

Extensibility

The event-driven mechanism facilitates an extensible architecture. Events can affect new and existing applications.

Choice of compute and event broker

Although serverless is well-suited for event-driven applications, you aren’t limited to Lambda functions as the choice of compute, or to EventBridge as the choice of event broker. A new vendor service domain can run its business logic on compute services such as Amazon ECS on AWS Fargate or Amazon EKS, as long as the service subscribes to and produces events. The event broker can instead be Amazon Simple Notification Service (Amazon SNS), Amazon Managed Streaming for Apache Kafka (Amazon MSK), or Amazon MQ.

Polyglot

The underlying language of choice for implementing the business logic becomes immaterial as the domains asynchronously rely on events. Different teams responsible for different services can use their language of choice to build, develop, test, and release their features independently.

Reduced fault impact radius

With this architecture, you reduce the impact of a faulty domain while adding new features to the application. For example, a new reporting service domain can start consuming events from, and producing events to, the event broker. When this new service consumes events, neither the producer nor the other consumers need to be aware of its existence. This makes the entire architecture flexible. If there is an issue in the reporting service, then the application degrades gracefully and no other domain’s functionality is affected. Because the reporting service is isolated from other domains, testing and releasing features in the reporting service is also faster. Therefore, you gain agility and can release to market faster.
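The isolation described here can be illustrated with a small delivery loop in which each consumer is invoked independently, so a failure in one does not stop delivery to the others. This is a conceptual sketch only. A managed broker such as EventBridge handles delivery and retries per target for you, and the consumer names are hypothetical.

```typescript
interface Consumer {
  name: string;
  handle: (detailType: string) => void; // may throw
}

interface DeliveryReport {
  delivered: string[];
  failed: string[];
}

// Invokes every consumer independently; one failure never blocks the rest.
function deliverToAll(detailType: string, consumers: Consumer[]): DeliveryReport {
  const report: DeliveryReport = { delivered: [], failed: [] };
  for (const consumer of consumers) {
    try {
      consumer.handle(detailType);
      report.delivered.push(consumer.name);
    } catch {
      report.failed.push(consumer.name);
    }
  }
  return report;
}

const consumers: Consumer[] = [
  { name: "notifications", handle: () => undefined },
  // Hypothetical new reporting service with a bug.
  { name: "reporting", handle: () => { throw new Error("reporting is down"); } },
  { name: "fraud", handle: () => undefined },
];

const report = deliverToAll("Claim.Accepted", consumers);
```

Here the reporting consumer fails while the notifications and fraud consumers still receive the event.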

Third party integrations

Application teams build some components of the architecture internally and rely on third-party providers for others. For example, an insurance organization can offer a rental car to the insured customer while the claim is in progress. A third-party vendor can offer a car rental service that integrates with all rental car companies and provides a negotiated rate for rental cars to the insured customer. With EventBridge API destinations, a rule can target the third-party service’s REST endpoint directly. The third-party provider calls a webhook provided by the insurance organization once its execution completes. Alternatively, a Step Functions workflow can orchestrate calls to third-party APIs, reconcile the results, and emit events back to the event broker.

CRM system integration

Insurance organizations often use Customer Relationship Management (CRM) systems like Salesforce. In an insurance example, a CRM system can own customer and claim data as the system of record, and then integrate with the core insurance business running on AWS. Creating a customer record and filing a new claim (FNOL) are some of the operations that can happen in Salesforce. In those cases, the insured customer uses a mobile app, a web app, or SMS, or calls an agent by phone.

Salesforce emits platform events on customer creation and FNOL, and sends them to a SaaS event bus or partner event bus in EventBridge. Salesforce integrates with the EventBridge event bus for bi-directional communication. This integration enhances the event-driven architecture that is in place. For example, a customer files a claim by talking to an agent. This agent uses Salesforce to capture the case details. The customer can describe their condition and their car damage. AWS artificial intelligence/machine learning (AI/ML) services can analyze the text, images, or speech recorded by the agent and provide sentiment analysis data or detect fraud. EventBridge sends this analyzed data back to Salesforce. The next agent talking to the customer can personalize the experience by knowing the sentiment or context of the previous discussion.

SaaS Event Bus

Clean up

To clean up the application, follow the cleanup steps shown in the sample application.

Conclusion

In this part of the series, we dove deeper into the architecture of insurance claims processing. We saw how distributed systems not only choreograph events between domains but also orchestrate workflows within a domain. Choreography and orchestration work better together. Guidance around zero touch claims processing using AI/ML services is available today. An event-driven architecture makes it easier to bring these components together.

The sample application shows the implementation details and steps to set up, run, and clean up the application. Subsequent parts of this series will dive into each aspect of the application regarding security, observability, operational excellence, performance, and cost optimization. From a development standpoint, the sample application will also grow to highlight aspects such as testing, development best practices, effective use of AWS CDK, reusable patterns, and more.

As mentioned in Part 1, event-driven architecture is a concept and a mental model. This concept applies to all industries, not just the insurance industry. You can build your own business use case on the same event-driven foundation discussed above.

Any discussion of reference architectures in this post is illustrative and for informational purposes only. It is based on the information available at the time of publication. Any steps/recommendations are meant for educational purposes and initial proof of concepts, and not a full-enterprise solution.

Emily Shea

Emily Shea is Head of Application Integration Go-To-Market at AWS. Emily is a builder, leader, and speaker on topics including serverless, event-driven architectures, and application integration.

Vaibhav Jain

Vaibhav Jain is a Cloud Application Architect with AWS Professional Services. He works with customers and AWS partners to architect, design, develop, and implement solutions that result in real business outcomes.

Dhiraj Mahapatro

Dhiraj Mahapatro is a Principal Serverless Specialist Solutions Architect at AWS. Dhiraj specializes in helping enterprise financial services adopt serverless and event-driven architectures to modernize their applications and accelerate their pace of innovation.