Observe.AI Cuts Costs by Over 50% with Machine Learning on AWS
Observe.AI developed and open-sourced the One Load Audit Framework on AWS to optimize machine learning model costs, boost developer efficiency, and scale to meet data growth.
50%
lower costs by fine-tuning instance sizes
10x
higher data loads supported
From one week to hours
Reduced development time
Overview
Observe.AI uses conversation intelligence to uncover insights from live and completed customer interactions, helping companies increase contact center agent performance. The company developed and open-sourced the One Load Audit Framework (OLAF), which integrates with Amazon SageMaker to automatically find bottlenecks and performance problems in machine learning services.
Using OLAF to load-test Amazon SageMaker instances, Observe.AI reduced machine learning costs by over 50 percent, lowered development time from one week to hours, and facilitated on-demand scaling to support a tenfold growth in data load size.
Opportunity | Predicting ML Data Load Sizes for Enhanced Efficiency
Observe.AI optimizes the customer experience through an artificial intelligence (AI)-powered workforce platform. Employing a large language model (LLM) designed for contact centers, Observe.AI enhances contact center agent performance and extracts insights from customer interactions using conversation intelligence. Each month, the platform processes millions of conversations and generates hundreds of inferences per conversation.
As machine learning (ML) adoption continues to grow across industries, testing the performance of customers’ ML services under varying data loads has become increasingly crucial for Observe.AI. Aashraya Sachdeva, staff engineer in machine learning at Observe.AI, says, "While onboarding new customers, we were assessing our ML system's capability to handle a tenfold increase in data load, corresponding to the tenfold rise in conversations processed daily. Our ML engineers and scientists faced challenges in accurately predicting this capability when transitioning models from research to production."
The company sought to deploy a larger ML model in production for enhanced accuracy. Simultaneously, there was a careful effort to manage latency and control costs associated with the implementation. Achieving an optimal return on investment through fine-tuning its infrastructure was key, and the business wanted a solution compatible with its existing Amazon Web Services (AWS) environment.
"We sought a more straightforward method to identify the optimal infrastructure, assess our readiness for increased load, and determine the associated costs for serving code to customers. We also wanted precise insights into the developer time required for implementation," Aashraya explains.
"Through fine-tuning Amazon SageMaker instance sizes with OLAF while maintaining a constant data input load, we optimized costs for our LLM deployment by over 50 percent. This process ensured the best return on investment."
Aashraya Sachdeva
Staff Engineer, Machine Learning at Observe.AI
Solution | Building the One Load Audit Framework on AWS
To address its challenge of predicting ML load sizes, Observe.AI created and open-sourced the One Load Audit Framework (OLAF). Integrated with Amazon SageMaker, a service for building, training, and deploying ML models for any use case, OLAF identifies bottlenecks and performance issues in ML services, offering latency and throughput measurements under both static and dynamic data loads. The framework also seamlessly incorporates ML performance testing into the software development lifecycle, facilitating accurate provisioning and cost savings.
Aashraya explains, "OLAF provides our ML engineers and scientists with a plug-and-play model. They simply input their AWS credentials and the Amazon SageMaker endpoint, and the tool conducts load testing, providing latency numbers and expected errors for a particular model or instance."
Following the initial build, Observe.AI integrated Amazon SageMaker features into OLAF, including multi-container deployment and batch inferencing. "We wanted to understand how these incremental features affect scalability in terms of cost," adds Aashraya. Next, the company incorporated Amazon Simple Queue Service (Amazon SQS), a fully managed message queuing service for microservices, distributed systems, and serverless applications. By downloading Amazon SQS load traces, OLAF users can observe the rate at which ML messages enter the system to predict data load size. Aashraya notes, "This feature assists us in easily testing queue-based array processing systems, which are becoming more prevalent."
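The queue-trace idea can be sketched with a simple polling approach. `ApproximateNumberOfMessages` is a real Amazon SQS queue attribute, but the sampling loop and rate estimate below are illustrative assumptions, not OLAF's implementation.

```python
# Hedged sketch of building a load trace from an SQS queue. The attribute
# name is a real SQS attribute; the sampling loop and helpers are
# illustrative, not OLAF's actual implementation.
import time

def sample_queue_depth(queue_url, samples=5, interval=1.0, region="us-east-1"):
    """Poll ApproximateNumberOfMessages at fixed intervals to build a trace."""
    import boto3  # deferred: only needed when actually calling AWS
    sqs = boto3.client("sqs", region_name=region)
    trace = []
    for _ in range(samples):
        attrs = sqs.get_queue_attributes(
            QueueUrl=queue_url,
            AttributeNames=["ApproximateNumberOfMessages"],
        )
        trace.append(int(attrs["Attributes"]["ApproximateNumberOfMessages"]))
        time.sleep(interval)
    return trace

def arrival_rate(trace, interval=1.0):
    """Estimate net messages per second entering the queue between samples."""
    deltas = [b - a for a, b in zip(trace, trace[1:])]
    return sum(deltas) / (len(deltas) * interval) if deltas else 0.0
```

Feeding the estimated rate into a load test is what lets a team predict whether a queue-based pipeline will keep up when message volume grows.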
Finally, Observe.AI integrated Amazon Simple Notification Service (Amazon SNS), a fully managed service for application-to-application and application-to-person messaging that helps OLAF users replicate specific patterns within Amazon SNS.
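Replaying a recorded message pattern at a target rate might look like the following sketch. It uses the real SNS `publish` call, but the pacing helpers and function names are hypothetical, not OLAF's API.

```python
# Illustrative sketch of replaying a recorded message pattern through an
# SNS topic at a target rate; function names are hypothetical, not OLAF's API.
import time

def publish_offsets(count, messages_per_second):
    """Seconds from start at which each publish should fire for a steady rate."""
    return [i / messages_per_second for i in range(count)]

def replay_pattern(topic_arn, messages, messages_per_second, region="us-east-1"):
    """Publish recorded messages to an SNS topic, pacing them to the target rate."""
    import boto3  # deferred: only needed when actually calling AWS
    sns = boto3.client("sns", region_name=region)
    start = time.perf_counter()
    offsets = publish_offsets(len(messages), messages_per_second)
    for offset, msg in zip(offsets, messages):
        # Sleep until this message's scheduled offset, then publish.
        time.sleep(max(0.0, offset - (time.perf_counter() - start)))
        sns.publish(TopicArn=topic_arn, Message=msg)
```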
Outcome | Optimizing Costs and Boosting Developer Efficiency
Launched in 2022, OLAF by Observe.AI is now actively employed by dozens of ML engineers and researchers for testing and predicting data loads. By using OLAF, Observe.AI has cut LLM costs by conducting load tests on Amazon SageMaker instances, identifying the most suitable configuration aligned with the company's business metrics. Aashraya explains, "Our research team encountered higher costs than anticipated when deploying an LLM, as well as other ML models, with specific latency and throughput requirements into production. However, through fine-tuning Amazon SageMaker instance sizes with OLAF while maintaining a constant data input load, we optimized costs for our ML model deployments by over 50 percent. This process ensured the best return on investment."
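The trade-off behind instance fine-tuning reduces to simple arithmetic: for each candidate instance size, divide its hourly price by its measured throughput, then pick the cheapest option that still meets the required load. The instance types below are real SageMaker instance names, but the prices and throughput numbers are hypothetical placeholders, not actual SageMaker pricing or Observe.AI's measurements.

```python
# Back-of-the-envelope instance selection. All prices and throughput
# numbers below are hypothetical placeholders, not actual SageMaker
# pricing or Observe.AI's measurements.
CANDIDATES = {
    # instance type: (hypothetical $/hour, measured inferences/hour)
    "ml.g5.2xlarge": (1.52, 40_000),
    "ml.g5.xlarge": (1.01, 22_000),
    "ml.g4dn.xlarge": (0.74, 9_000),
}

def cost_per_million(price_per_hour, inferences_per_hour):
    """Dollars per one million inferences on a given instance."""
    return price_per_hour / inferences_per_hour * 1_000_000

def cheapest(candidates, min_throughput):
    """Lowest-cost instance that still meets the required throughput floor."""
    viable = {
        name: cost_per_million(price, rate)
        for name, (price, rate) in candidates.items()
        if rate >= min_throughput
    }
    return min(viable, key=viable.get)
```

Under these placeholder numbers, the larger `ml.g5.2xlarge` is actually cheaper per inference than the smaller sizes at high load, which is the kind of counterintuitive result load testing surfaces.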
Previously, Observe.AI developers had to write multiple scripts and construct numerous pipeline workflows, resulting in a complex array of onboarding data transfers and debugging systems. Aashraya notes, "Because OLAF is tightly integrated with AWS, it now only takes developers a few hours to determine the proper instance for use, a task that used to take one week. As a result, developers can allocate more time to testing data loads and creating new features."
With the integration of OLAF, Observe.AI can scale its services to accommodate a tenfold increase in data load. The company can now conduct stress testing more easily and accurately, providing valuable assistance to customers who have augmented their data loads. Aashraya explains, "If a customer doubles their data load, we now have a clearer understanding of our infrastructure's capacity. Using OLAF and AWS, we can replicate and precisely increase the load by 100 percent, anticipating potential breakpoints or database issues. This not only helps us better prepare our customers for such scenarios but also brings internal cost and development benefits."
Learn More
To learn more, visit aws.amazon.com/ai/machine-learning/.
About Observe.AI
Observe.AI is a solution for boosting contact center performance through live conversation intelligence. Utilizing a robust 30-billion-parameter contact center large language model (LLM) and a generative AI engine, Observe.AI extracts valuable insights from every customer interaction. Trusted by companies, Observe.AI is a valued partner in accelerating positive results across the entire business landscape.
AWS Services Used
Amazon SageMaker
Amazon SageMaker is a fully managed service that brings together a broad set of tools to enable high-performance, low-cost machine learning (ML) for any use case.
Amazon Simple Queue Service
Amazon Simple Queue Service (Amazon SQS) lets you send, store, and receive messages between software components at any volume, without losing messages or requiring other services to be available.
Amazon Simple Notification Service
Amazon Simple Notification Service (Amazon SNS) sends notifications two ways: application-to-application (A2A) and application-to-person (A2P). A2A provides high-throughput, push-based, many-to-many messaging between distributed systems, microservices, and event-driven serverless applications.
Get Started
Organizations of all sizes across all industries are transforming their businesses and delivering on their missions every day using AWS. Contact our experts and start your own AWS journey today.