
2023

Optimizing Costs and Performance for Generative AI Using Amazon SageMaker with Forethought Technologies

Learn how Forethought Technologies, a provider of generative AI solutions for customer service, reduced costs by up to 80 percent using Amazon SageMaker.

80% cost reduction

using Amazon SageMaker Serverless Inference

66% cost reduction

using Amazon SageMaker multi-model endpoints

Improved resource efficiency

and availability

Improved customer response times

and hyperpersonalization

Overview

Forethought Technologies (Forethought), a customer service software provider, wanted to improve its machine learning (ML) costs and availability as it gained new customers. The company was already using Amazon Web Services (AWS) for ML model training and inference and wanted to be increasingly efficient and scalable with its small cloud infrastructure team.

To achieve its goals, Forethought migrated the inference and hosting of ML models to Amazon SageMaker, which is used to build, train, and deploy ML models for virtually any use case with fully managed infrastructure, tools, and workflows. Using Amazon SageMaker, Forethought improved availability and customer response times and reduced its ML costs by up to 80 percent.


Opportunity | Using Amazon SageMaker to Support More Customers at Lower Cost for Forethought

Forethought’s suite of customer service solutions is powered by generative AI, a type of AI that can create new content and ideas, including conversations, stories, images, videos, and music. At the center of Forethought’s product is its SupportGPT technology, which uses large language models and information retrieval systems to power over 30 million customer interactions each year. Through automation, the company reduces the load on customer support teams by assisting users with conversational AI. Many of Forethought’s customers use its product during busy periods, such as holidays or tax season, to handle more customer issues with fewer customer support agents. Forethought offers hyperpersonalized ML models for its customers, often training multiple models per customer to meet individual use cases.

Forethought was founded in 2017 in the United States and initially used multiple cloud providers to host its products, using Amazon SageMaker for training ML models. In its first 2 years, the company built a solution for its ML inference using Amazon Elastic Kubernetes Service (Amazon EKS), a managed Kubernetes service to run Kubernetes on the AWS Cloud and on premises. As the company continued to grow and gain new customers, it wanted to improve the availability of its solution and reduce costs.

To meet its scalability, availability, and cost-optimization needs, Forethought chose to migrate its ML inference to Amazon SageMaker, and the company began using additional features of Amazon SageMaker to improve its products. In this process, Forethought architected its pipeline to benefit from the latency and availability improvements that it could achieve using Amazon SageMaker. “From the Amazon SageMaker team and across the board, for anything that we need, they connect us with the right people so that we can be successful using AWS,” says Jad Chamoun, director of core engineering at Forethought.


“By migrating to Amazon SageMaker multi-model endpoints, we reduced our costs by up to 66% while providing better latency and better response times for customers.”

Jad Chamoun
Director of Core Engineering, Forethought Technologies

Solution | Reducing Costs and Improving Availability Using Amazon SageMaker Inference

Forethought migrated its ML inference from Amazon EKS to Amazon SageMaker multi-model endpoints, a scalable and cost-effective way to deploy large numbers of models. One example of this feature in action in Forethought’s solution is autocompleting the next words in a sentence as a user types. The company uses Amazon SageMaker multi-model endpoints to run multiple ML models on a single inference endpoint, which improves the scalability and efficiency of hardware resources such as GPUs and reduces costs. “Using Amazon SageMaker, we can support customers at a lower cost per customer,” says Chamoun. “By migrating to Amazon SageMaker multi-model endpoints, we reduced our costs by up to 66 percent while providing better latency and better response times for customers.”
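To make the pattern concrete, here is a minimal sketch of the two request shapes a multi-model endpoint involves: a container definition whose `Mode: "MultiModel"` points at an S3 prefix holding many model archives, and an invocation whose `TargetModel` selects one of them per request. The endpoint name, bucket, image URI, and model archive names below are hypothetical placeholders; Forethought’s actual pipeline is not public.

```python
# Sketch of the SageMaker multi-model endpoint request shapes.
# All resource names here are hypothetical placeholders.

def multi_model_container(image_uri: str, model_data_prefix: str) -> dict:
    """Container definition for create_model. Mode='MultiModel' tells
    SageMaker to lazily load any model archive found under the S3 prefix,
    so one endpoint can serve many per-customer models."""
    return {
        "Image": image_uri,
        "ModelDataUrl": model_data_prefix,  # S3 prefix holding many *.tar.gz archives
        "Mode": "MultiModel",
    }

def invoke_request(endpoint_name: str, target_model: str, payload: bytes) -> dict:
    """Keyword arguments for sagemaker-runtime invoke_endpoint. TargetModel
    picks which archive under the prefix handles this request."""
    return {
        "EndpointName": endpoint_name,
        "TargetModel": target_model,  # e.g. one fine-tuned model per customer
        "ContentType": "application/json",
        "Body": payload,
    }

container = multi_model_container(
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/inference:latest",
    "s3://example-bucket/models/",
)
request = invoke_request("autocomplete-mme", "customer-a.tar.gz", b'{"text": "How do I"}')

# With AWS credentials in place, these dicts would be passed to boto3:
#   boto3.client("sagemaker").create_model(ModelName="mme", PrimaryContainer=container,
#                                          ExecutionRoleArn=role_arn)
#   boto3.client("sagemaker-runtime").invoke_endpoint(**request)
```

Because every request names its target archive, adding a new customer model is just uploading another `.tar.gz` under the prefix; no redeployment of the endpoint is needed, which is what lets many models share the same GPU-backed instances.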

Forethought also uses Amazon SageMaker Serverless Inference, a purpose-built inference option for deploying and scaling ML models without configuring or managing the underlying infrastructure. Forethought’s use of Amazon SageMaker Serverless Inference revolves around small models and classifiers that are fine-tuned to each customer’s use case, such as automatically determining the priority of a support ticket. By migrating some of its classifiers to Amazon SageMaker Serverless Inference, Forethought saved around 80 percent on related cloud costs.
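A serverless endpoint is declared at endpoint-config time: the production variant carries a `ServerlessConfig` (memory size and max concurrency) instead of an instance type and count. The sketch below shows that shape; the config name, model name, and sizing are hypothetical, not Forethought’s actual settings.

```python
# Sketch of a SageMaker Serverless Inference endpoint configuration.
# Names and sizing are hypothetical placeholders.

def serverless_variant(model_name: str,
                       memory_mb: int = 2048,
                       max_concurrency: int = 5) -> dict:
    """One ProductionVariant for create_endpoint_config. Supplying
    ServerlessConfig (rather than InstanceType/InitialInstanceCount)
    makes the endpoint serverless: billed per request, with capacity
    managed by SageMaker rather than by the team."""
    return {
        "VariantName": "AllTraffic",
        "ModelName": model_name,
        "ServerlessConfig": {
            "MemorySizeInMB": memory_mb,      # valid values: 1024-6144, 1 GB steps
            "MaxConcurrency": max_concurrency,
        },
    }

config = {
    "EndpointConfigName": "ticket-priority-serverless",
    "ProductionVariants": [serverless_variant("ticket-priority-classifier")],
}

# With AWS credentials in place:
#   boto3.client("sagemaker").create_endpoint_config(**config)
```

This fits the small-classifier workload described above: traffic per customer model is bursty, so paying per invocation rather than for always-on instances is where the roughly 80 percent saving comes from.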

Forethought’s cloud infrastructure team consists of three people. Running and managing all the ML models and Kubernetes clusters was too much overhead for such a small team. Using Amazon SageMaker, the company can scale as much as it wants with the people it has. “We run multiple instances within Amazon SageMaker multi-model endpoints,” says Chamoun. “We are able to share resources more efficiently while providing better availability than we did in the past.”

Using Amazon SageMaker, the Forethought team no longer has to worry about memory exceptions or availability, issues that its three engineers otherwise would have spent considerable time addressing. Because the company set up automated pipelines for its language models using Amazon SageMaker, teams at Forethought and its customers can simply select the data they want to train on and submit it. “Not having to be involved as things are being trained, deployed, and scaled was key for us to work on other things that are more impactful for the company,” says Chamoun. Between Amazon SageMaker multi-model endpoints and Amazon SageMaker Serverless Inference, Forethought now runs over 80 percent of its GPU inference on Amazon SageMaker.

Outcome | Continuing to Provide Hyperpersonalization Using AWS

Forethought is continuing to grow and provide hyperpersonalized ML models for more customers. The company continues to work with AWS to improve its infrastructure and innovate on its product. Forethought is part of the AWS Global Startup Program, an invite-only, go-to-market program supporting mid-to-late-stage startups that have raised institutional funding, achieved product-market fit, and are ready to scale. The company is getting the word out about its product, which is now available on AWS Marketplace.

“Whether it’s our search services, our inference for specific ML models, or chatting with our customer support bots, everything we have uses Amazon SageMaker,” says Chamoun.

About Forethought Technologies

Forethought Technologies is a startup in the United States providing a generative AI suite for customer service that uses machine learning to transform the customer support life cycle. The company powers over 30 million customer interactions a year.

AWS Services Used

Amazon SageMaker

Build, train, and deploy machine learning (ML) models for any use case with fully managed infrastructure, tools, and workflows


AWS Global Startup Program

The AWS Global Startup program is an invite-only, go-to-market program supporting mid-to-late stage startups that have raised institutional funding, achieved product-market fit, and are ready to scale.


