How Liberty Mutual built a highly scalable and cost-effective document management solution
With more than 45,000 employees in 29 countries, Liberty Mutual is the sixth largest global property and casualty insurer, and currently ranks 71st on the Fortune 100 list of largest corporations in the US.
The expectations of customers continue to increase as the pace of change accelerates, the nature and magnitude of risk change, and digital and data solutions proliferate. Driving innovation in insurance is a core part of Liberty Mutual’s culture. As such, one of the company’s long-term priorities is to partner and develop winning multi-business strategies that meet the evolving needs of our customers.
In this post, we discuss how we built an innovative, highly scalable and cost-effective document management solution that can expand globally on AWS in pursuit of meeting the needs of our customers.
Why build our own document management solution?
Previously, we were using third-party document management on premises, and continuously saw challenges with management, scalability, reliability, and cost.
To overcome these challenges and follow our strategic imperative to move our legacy document storage/management application to the cloud, we were tasked with:
- Ensuring the solution follows our critical security policies
- Taking an API-first approach
- Prioritizing faster time to market, scalability, and resiliency
- Supporting a growing global user base
- Building an effective disaster recovery plan
We started by evaluating whether to build our own solution or use a software as a service (SaaS) solution to meet our needs. As part of a Liberty-sponsored “innovation sprint”, we were able to build and deliver a working prototype of our own cloud-native solution within 10 days.
We decided to iterate on building our own solution because we wanted to have complete control over our code and extensibility, allowing us to prioritize our feature build-out based on our customers’ needs and timelines.
AWS has been our primary cloud provider, and as we evaluated the capabilities and costs, we found that we could meet most of our document management requirements using native AWS services.
We started with a solution design that was generic enough to be easily customizable for all our internal clients, perhaps even allowing us to expand our customer base within the company. To cover the capabilities of our existing legacy solution, the API offers the following operations:
- Upload – Upload a document and metadata
- Update – Modify the metadata associated with a document
- Search – Return a list of documents that meet specific criteria
- Retrieve – Return a link to a document
- Delete – Delete a document
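To make the shape of these five operations concrete, the following is a minimal in-memory sketch. It is illustrative only and not the production API: the class name `DocumentStore`, the method signatures, the `doc://` link format, and the exact-match search semantics are all assumptions made for this example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    content: bytes
    metadata: dict = field(default_factory=dict)


class DocumentStore:
    """Hypothetical in-memory stand-in for the five API operations."""

    def __init__(self):
        self._docs = {}

    def upload(self, content: bytes, metadata: dict) -> str:
        """Store a document with its metadata and return a new document ID."""
        doc_id = str(uuid.uuid4())
        self._docs[doc_id] = Document(doc_id, content, dict(metadata))
        return doc_id

    def update(self, doc_id: str, metadata: dict) -> None:
        """Modify the metadata associated with an existing document."""
        self._docs[doc_id].metadata.update(metadata)

    def search(self, **criteria) -> list:
        """Return IDs of documents whose metadata matches every criterion."""
        return [
            doc.doc_id
            for doc in self._docs.values()
            if all(doc.metadata.get(k) == v for k, v in criteria.items())
        ]

    def retrieve(self, doc_id: str) -> str:
        """Return a link to the document (assumed format, not a real URL)."""
        if doc_id not in self._docs:
            raise KeyError(doc_id)
        return f"doc://{doc_id}"

    def delete(self, doc_id: str) -> None:
        """Delete the document and its metadata."""
        del self._docs[doc_id]
```

In this sketch, Retrieve returns a link rather than the document bytes, mirroring the operation list above. In a real deployment the link would point at the stored object instead of using a made-up scheme.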
Behind the API, we built a storage layer for documents and metadata. For document storage, we used Amazon Simple Storage Service (Amazon S3), which provides easy-to-use, highly performant, scalable, available, and durable storage for our high volume of documents. Amazon S3 also lets us meet our disaster recovery requirement by automatically copying objects across Regions using cross-Region replication (CRR). We chose Amazon Aurora MySQL-Compatible Edition for metadata storage, which consists of indexes, search criteria, supplementary data, and the S3 URL path, and to support ad hoc queries. Aurora also provides the ability to automatically scale compute and memory up and down based on application demand. We also wanted the high availability and durability of Aurora, which offers 99.99% availability, replicates six copies of our data across three Availability Zones, and continuously backs up our data to Amazon S3.
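As a rough illustration of the cross-Region replication setup described above, the following sketch builds a replication configuration and passes it to the boto3 `put_bucket_replication` call. It is a sketch under stated assumptions, not the configuration used in production: the rule ID, the choice to replicate every object, and the decision not to replicate delete markers are assumptions, and the bucket names and IAM role ARN are placeholders supplied by the caller. Versioning must be enabled on both buckets for replication to work.

```python
def build_replication_config(role_arn: str, destination_bucket_arn: str) -> dict:
    """Build a replication configuration that copies all objects.

    The rule ID and settings below are illustrative assumptions.
    """
    return {
        "Role": role_arn,  # IAM role that Amazon S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "replicate-all-documents",  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches every object
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": destination_bucket_arn},
            }
        ],
    }


def enable_replication(source_bucket: str, role_arn: str,
                       destination_bucket_arn: str) -> None:
    """Apply the configuration to the source bucket."""
    import boto3  # imported here so the builder can be used without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_replication(
        Bucket=source_bucket,
        ReplicationConfiguration=build_replication_config(
            role_arn, destination_bucket_arn
        ),
    )
```

Keeping the configuration builder separate from the API call makes the rule easy to review and unit test without AWS credentials.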
For compute, we use AWS Lambda to save on cost and pay only for the compute we use. In a typical week, traffic ranges from hundreds of thousands of documents uploaded in a batch process to quiet periods with very little activity. Lambda also allows us to meet our scalability needs, is easy to deploy globally, and requires no infrastructure to manage. For managing APIs, we use Amazon API Gateway. When a document is uploaded, updated, or deleted, we send a notification using Amazon Simple Notification Service (Amazon SNS), which is then consumed by our post-processing workflows, such as auditing, content rendition, and notifications.
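The notification step can be sketched as follows. This is a minimal example under stated assumptions rather than the production code: the message fields (`documentId`, `action`, `metadata`), the action names, and the helper names are hypothetical. The only AWS call used is the boto3 SNS client's `publish` method with its `TopicArn`, `Message`, and `MessageAttributes` parameters, and the client is passed in so the function can be exercised without AWS access.

```python
import json

# Assumed action names for the three events that trigger a notification.
VALID_ACTIONS = {"UPLOAD", "UPDATE", "DELETE"}


def build_document_event(action: str, doc_id: str, metadata: dict = None) -> dict:
    """Build the keyword arguments for an SNS publish call."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    body = {"documentId": doc_id, "action": action, "metadata": metadata or {}}
    return {
        "Message": json.dumps(body),
        # A message attribute lets subscribers filter by action
        # using an SNS subscription filter policy.
        "MessageAttributes": {
            "action": {"DataType": "String", "StringValue": action}
        },
    }


def notify(sns_client, topic_arn: str, action: str, doc_id: str,
           metadata: dict = None):
    """Publish a document event to the given topic."""
    event = build_document_event(action, doc_id, metadata)
    return sns_client.publish(TopicArn=topic_arn, **event)
```

Inside a Lambda function, `sns_client` would typically be created once outside the handler, for example with `boto3.client("sns")`, so that it is reused across invocations.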
With security built in at every layer, the following diagram illustrates our high-level architecture:
Document management is critical to Liberty’s business. To make the solution highly resilient, we use a disaster recovery strategy in which automation built with Lambda replicates data from the Aurora database to other Regions. You can achieve the same objective using Aurora Global Database. All the resources in this solution are provisioned and deployed using AWS CloudFormation templates. With minimal configuration changes, we are able to reuse the CloudFormation template and replicate the solution internationally to Asia-Pacific, Western Europe, and Latin America Regions.
Along the way, we learned some valuable lessons:
- Enabling Amazon RDS Performance Insights and slow query logs helped greatly with identifying long-running queries. Performance Insights makes it easy to identify SQL statements that cause bottlenecks such as high CPU utilization and waits.
- Tracing with AWS X-Ray helped us identify intermittent latency issues with AWS Key Management Service (AWS KMS) and Amazon SNS. We were able to mitigate the latency issues by setting up and using VPC endpoints for these services.
- Adhering to best practices for AWS-native API calls can prevent problems when handling large volumes of transactions, e.g., fine-tuning the connection timeout and retry settings.
- To address connection pooling issues when connecting Lambda to Aurora, we decreased the wait timeout and added code to check the connection state and reconnect. You can achieve the same objective more simply by using Amazon RDS Proxy.
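The last lesson, checking the connection state and reconnecting, can be sketched as follows. This is a generic illustration, not the code we deployed: the function names are hypothetical, the liveness check is a simple `SELECT 1`, and the `connect` argument stands for whatever factory creates a connection with your DB-API driver of choice.

```python
# Cached at module level so a warm Lambda execution environment
# can reuse the connection across invocations.
_connection = None


def is_alive(conn) -> bool:
    """Return True if a trivial query succeeds on the connection."""
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT 1")
        cursor.fetchone()
        cursor.close()
        return True
    except Exception:
        # Any failure (closed socket, server-side timeout) means reconnect.
        return False


def get_connection(connect):
    """Return a live connection, reconnecting if the cached one is stale.

    `connect` is a zero-argument callable that opens a new connection.
    """
    global _connection
    if _connection is None or not is_alive(_connection):
        if _connection is not None:
            try:
                _connection.close()  # best effort; it may already be closed
            except Exception:
                pass
        _connection = connect()
    return _connection
```

A handler would call `get_connection(...)` at the start of each invocation instead of holding a connection object directly. The extra round trip for the liveness check is the price of not failing on a connection that the database closed after its wait timeout.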
We’re also exploring the use of optical character recognition (OCR) technology with services like Amazon Textract and Amazon Rekognition to extract key information and enhance data analytics and search capabilities.
This solution serves internal clients globally, storing billions of documents, and handling thousands of transactions per day. We’re continuing to onboard new internal clients and migrate additional documents from legacy storage.
The flexibility of our AWS solution allows for scaling as our customer base increases and opens the door for additional improvements to be added to make our product more robust. We are able to plug and play the services we need out of the box with the broad suite of AWS offerings. This allows us to quickly deliver new features in addition to expanding our solution globally.
Working with AWS continues to be a collaborative effort. We have met regularly with our AWS contacts and received guidance on best practices as well as cost-saving suggestions. When we ran into problems, they were quick to help us find solutions, even making small modifications on the AWS side to improve their product.
We have saved significantly on licensing, storage, and server infrastructure. The annual cost for our runtime services alone will be reduced by over 95% due to our move to serverless Lambda functions.
In addition to the cost savings, hosting our API on AWS has made this solution even more performant and it’s easier for new clients to adopt. As we streamline our onboarding process, this will only continue to improve.
In closing, this is an example of what can be achieved when a talented and innovative development team is given the encouragement and tools to create something new. With AWS native components as the building blocks, the team was able to save Liberty Mutual time and money while simultaneously engineering a highly scalable, expandable, and resilient solution. With simplicity at its heart, the underlying design could be followed by others looking to implement their own document management solutions.
Learn more about how Liberty is reimagining and building experiences that will change the way you think about technology in the cloud.
About the Authors
Sunitha Ajit is a Principal Software Engineer on the Document Management team at Liberty Mutual. She loves to learn and dive deep into new technologies, most recently AWS, and enjoys collaborating with cross-functional teams to find innovative, optimal solutions to complex business problems. Sunitha is an avid reader and enjoys going on cross-country road trips with her family and watching medical and crime shows.
Alison Bridger is a Solutions Engineer on the Document Management team at Liberty Mutual. She enjoys working on software modernization, focusing on engineering best practices. Outside of work, she has many hobbies including surfing, printmaking, and studying Japanese.
Vijay Suryawanshi is an Enterprise Cloud Architect at Liberty Mutual. He is a technology leader who is passionate about driving innovation and has expertise in adoption and enablement of emerging technologies to solve complex business challenges.
Vikas Shah is an Enterprise Solutions Architect at Amazon Web Services. He is a technology enthusiast who enjoys helping customers find innovative solutions to complex business challenges. In his spare time, Vikas enjoys building robots, hiking, and traveling.