AWS for Industries
How Frontier Communications uses Amazon Bedrock to transform customer insights from call transcripts
Introduction
Frontier Communications is the largest pure-play fiber provider in the United States, with 3.2 million broadband subscribers across 25 states and over 12,000 employees. The Retention & Loyalty group ensures our customers remain delighted by optimizing the customer journey, reducing churn, and improving Lifetime Value (LTV). As data scientists, we use the Amazon Web Services (AWS) environment to analyze customer interaction data and deliver actionable insights. To that end, we’ve been developing Giga-T, a generative AI application on AWS that analyzes call center transcripts.
The challenge
Each month, Frontier call center agents for Retention & Loyalty handle 125,000 calls, talking for 25,000 hours and uttering 250 million words. This was an untapped data resource that could communicate the literal voice of the customer to management in an unprecedented way. Natural Language Processing (NLP) as a means of analyzing unstructured text couldn’t achieve the minimum accuracy thresholds that we wanted. Furthermore, it was both laborious and inflexible to implement. Therefore, we decided to try the emerging technology of large language models (LLMs) to analyze our call transcripts.
We started with one canonical question for every transcript: Why was the customer calling?
We broke call intent into 50 Mutually Exclusive, Collectively Exhaustive (MECE) classifications. Each call intent differed in its combination of root cause, effect on KPIs, and mitigation measure.
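As an illustration of what single-label MECE classification looks like as a prompt, here is a minimal sketch; the label names are hypothetical, not our actual taxonomy:

```python
# Illustrative only: a minimal prompt template for single-label MECE intent
# classification. These label names are hypothetical, not the actual taxonomy.
INTENT_LABELS = [
    "price_increase_complaint",
    "service_outage",
    "move_to_new_address",
    # ...continues to 50 mutually exclusive, collectively exhaustive labels
]

SYSTEM_PROMPT = (
    "You are a call center analyst. Classify the transcript into exactly one "
    "of the following call intents. Respond with only the label.\n"
    + "\n".join(f"- {label}" for label in INTENT_LABELS)
)
```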
Data on this canonical topic would mean that we can answer derivative evergreen questions for the business (two of which are computed in the sketch after this list):
- What’s the relative mix of reasons for cancelling and how does it change over time?
- Are there any temporal spikes for certain cohorts of customers?
- Are some reasons for cancellation more difficult to resolve than others for agents across skill levels?
- Which agents have the highest and lowest persuasion rates for specific cancellation reasons?
- Which combination of agent and non-cancellation intent have the lowest first time call resolution rates?
- What is the kurtosis of the distribution of non-cancellation intent calls across agents?
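Each of these reduces to a query over Giga-T’s classification outputs. A minimal sketch of two of them, assuming a hypothetical output schema (one row per analyzed call) and an illustrative S3 path:

```python
import pandas as pd

# Hypothetical schema for Giga-T's CSV outputs: one row per analyzed call.
# Reading directly from S3 requires the s3fs package; the path is illustrative.
calls = pd.read_csv(
    "s3://giga-t-results/classifications.csv",
    parse_dates=["call_date"],
)

# Relative mix of cancellation reasons by month (first question above).
mix = (
    calls[calls["intent_group"] == "cancellation"]
    .groupby([pd.Grouper(key="call_date", freq="MS"), "call_intent"])
    .size()
    .groupby(level=0, group_keys=False)
    .apply(lambda s: s / s.sum())
)

# Kurtosis of non-cancellation call counts across agents (last question above).
per_agent = (
    calls[calls["intent_group"] != "cancellation"]
    .groupby("agent_id")
    .size()
)
print(per_agent.kurtosis())  # pandas reports excess (Fisher) kurtosis
```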
Self-reported call intents by the agent or customer can be unreliable for any number of reasons. Sometimes agents don’t systematically or correctly record the call intent. Some customers express their call intent to the IVR in overly broad or unintelligible ways. Two agents acting in good faith can interpret the same call intent code in two different ways. The meaning of a call intent code can change over time without documentation. And what do you do when the call intent appears to be one thing at the beginning of the call but turns out to be something else midway through?
Human review of all calls is impractical. It would take 10 months for a human to listen to and determine the call intents for 6,000 transcripts (the daily call volume for Retention & Loyalty agents). Taking a tiny random sample of calls to audit has unacceptably high statistical error rates: both falsely identifying patterns that don’t exist (Type I errors) and missing real patterns that do exist (Type II errors). There’s no point in advocating for operational changes when there’s neither statistical nor clinical significance to the analysis. However, taking a sufficiently large sample size would be both time consuming and expensive.
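To make the sampling math concrete, here is a back-of-the-envelope calculation with illustrative thresholds: pinning down the share of even one of the 50 intent classes to a useful precision requires thousands of hand-audited calls, per question asked.

```python
from math import ceil

from scipy.stats import norm

def sample_size_for_proportion(margin: float, p: float = 0.5, confidence: float = 0.95) -> int:
    """Audited calls needed to estimate a proportion within +/- margin."""
    z = norm.ppf(1 - (1 - confidence) / 2)  # e.g., 1.96 at 95% confidence
    return ceil((z**2 * p * (1 - p)) / margin**2)

# Estimating a 2% intent (1 of 50 classes) to within +/- 0.5% needs roughly
# 3,000 audited calls: about half a day's call volume, reviewed by hand.
print(sample_size_for_proportion(margin=0.005, p=0.02))  # ~3012
```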
Solution
Initial setup for Giga-T
Our data science team benefited from having intrapreneurial upper management who encouraged us to take calculated risks, start small, fail fast, and iterate quickly. Our journey with Giga-T exemplifies this approach, evolving from a simple proof-of-concept to a sophisticated, scalable solution, as shown in Figure 1 below.
We began with a clear setup:
- Amazon Simple Storage Service (Amazon S3) to store call transcript text files as inputs and classification results as Comma Separated Value (CSV) outputs.
- Amazon SageMaker AI instance storage to manage prompt text files for directed acyclic graph (DAG) based prompt chaining.
- Amazon Bedrock to access the Anthropic Claude family of foundation models (FMs).
- Amazon SageMaker Jupyter notebooks (one iteration of this loop is sketched after this list) to:
- Merge a transcript and prompt into the system and user messages
- Set up prompt chaining DAG
- Make API calls to Amazon Bedrock for analysis at each node in the DAG
- Store the results within a CSV in Amazon S3
- Loop to the next transcript
- AWS Cost Explorer to track the API costs incurred by the analysis
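A minimal sketch of a single node in that loop, assuming the boto3 Converse API (the model ID, Region, and function name are illustrative, not our production code):

```python
import boto3

# One DAG node: merge a prompt template and a transcript, then call Claude
# through Amazon Bedrock. Model ID and Region here are illustrative.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def classify_transcript(transcript: str, prompt_template: str, model_id: str) -> str:
    response = bedrock.converse(
        modelId=model_id,
        system=[{"text": prompt_template}],
        messages=[{"role": "user", "content": [{"text": transcript}]}],
        inferenceConfig={"maxTokens": 256, "temperature": 0.0},
    )
    return response["output"]["message"]["content"][0]["text"]
```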
Figure 1: Initial architecture of Giga-T
We used this architecture to start processing transcripts immediately and forecast costs for full production. The skunkworks project yielded insights that were previously impossible to achieve, proving the concept’s value without getting bogged down in complex infrastructure design.
Fuller implementation in 2024
Our phased approach allowed for continuous improvement throughout 2024 and 2025.
January to March 2024: Proof of concept
- Human audited 900 transcripts for 50 MECE call intent classifications
- Achieved 70%+ on both precision and recall rates for each classification in the eval test
- Successfully classified 20,000 calls with Giga-T within AWS
- Produced data visualizations on the derivative evergreen questions about calls, customers, and agents
April to June 2024: Small-scale production
- First generative AI application within the company formally approved for production status
- Started analyzing a random sample of 1,000 Retention & Loyalty calls each day
- Added a prompt for call resolution with 95%+ recall and precision on the eval test
July to December 2024: Full-scale production
- Scaled Giga-T to analyze all 6,000 Retention & Loyalty calls each weekday
- Added prompts to measure 15 other aspects of transcripts (for example measures of agent compliance)
- Set up business intelligence dashboards
- Integrated into call center operations reporting flywheel to provide weekly feedback on agent performance
- Further automated backend processes
The enhanced architecture built upon our initial foundation while adding sophisticated components:
- Amazon S3 continued to serve as the primary storage for raw call transcripts
- Amazon EventBridge was introduced to orchestrate automated workflow triggers
- AWS Step Functions managed the complex orchestration of the analysis pipeline
- AWS Lambda functions and AWS Glue Jobs were implemented for transcript processing and data transformations
- Amazon Bedrock remained core to powering the LLM-based analysis through the Anthropic Claude 3.5 Sonnet v2 model
- Our existing enterprise data platform, including Databricks Unity Catalog for metadata storage and Microsoft Power BI for visualization, was integrated with the AWS architecture
The workflow begins with new transcripts being uploaded to Amazon S3, triggering an EventBridge rule. This initiates a Step Functions workflow that coordinates multiple Lambda functions (one handler is sketched after this list) to:
1. Pre-process and validate the transcripts
2. Run the LLM analysis through Amazon Bedrock using our refined prompts
3. Store results in Databricks Unity Catalog
4. Generate automated reports and visualizations
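A minimal sketch of what the analysis Lambda (step 2) might look like; the event fields and return shape are hypothetical stand-ins for whatever the Step Functions state actually passes:

```python
import boto3

s3 = boto3.client("s3")
bedrock = boto3.client("bedrock-runtime")

def handler(event, context):
    # Fetch the transcript that triggered the workflow (fields are hypothetical).
    obj = s3.get_object(Bucket=event["bucket"], Key=event["key"])
    transcript = obj["Body"].read().decode("utf-8")

    # Run one refined prompt against the transcript through Amazon Bedrock.
    response = bedrock.converse(
        modelId=event["model_id"],
        system=[{"text": event["prompt"]}],
        messages=[{"role": "user", "content": [{"text": transcript}]}],
    )
    result = response["output"]["message"]["content"][0]["text"]

    # Step Functions passes this output to the next state (storage, reporting).
    return {"key": event["key"], "analysis": result}
```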
Figure 2: Serverless architecture for Giga-T
This serverless architecture, shown in the preceding figure, enabled automatic scaling to handle varying call volumes while maintaining consistent performance. Further monitoring and logging through Amazon CloudWatch provided operational visibility and helped maintain quality control.
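For example, each pipeline step can publish custom metrics for dashboards and alarms; the namespace and metric name below are hypothetical:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical custom metric: transcripts processed in one pipeline run.
cloudwatch.put_metric_data(
    Namespace="GigaT",
    MetricData=[
        {"MetricName": "TranscriptsProcessed", "Value": 6000.0, "Unit": "Count"},
    ],
)
```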
Progress in 2025
In 2025, Giga-T continued to improve with more back-end automation, more data collected, more insights revealed, more reports generated, and more impact on call center KPIs. Throughout this process, the AWS environment has proven its ability to effortlessly scale up while meeting Service Level Agreements (SLAs) and other enterprise-grade requirements. We trained 14 other prompt engineers in the company on how to use Amazon S3, Amazon Bedrock, and SageMaker so that they could extend Giga-T to other call center operations in Sales, Business-to-Business (B2B), Tech Support, and Customer Service. Overall, Frontier call centers handle 500,000 calls, involving 1 billion uttered words, each month.
Beyond infrastructure scalability, stability, and flexibility, the people at AWS truly partnered with us and played a critical role in empowering our data scientists to develop Giga-T autonomously. Two Experience-Based Accelerator (EBA) workshops they conducted helped us get past the initial barriers of grasping the threshold concepts of developing generative AI applications within AWS. The first focused on hitting escape velocity: using Amazon Bedrock for the first time to analyze a transcript. The second resulted in our first successful automation of a backend Giga-T process using Step Functions and Lambda functions. Furthermore, semi-monthly check-ins with our client executive and solutions architect got us “unstuck” several times and taught us new techniques (for example, requesting increases in backend rate limits, cross-Region inference, batch inference, prompt caching, and application inference profiles).
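As one example of those techniques, adopting cross-Region inference is as simple as calling Amazon Bedrock with a geo-prefixed inference profile ID instead of a single-Region model ID; a minimal sketch:

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# The "us." prefix is a cross-Region inference profile: Bedrock routes the
# request across US Regions for higher throughput than one Region allows.
response = bedrock.converse(
    modelId="us.anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": [{"text": "Why was the customer calling?"}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```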
Results
Giga-T helps us understand more deeply and more quickly what is going on. Whereas it would take a person 10 months to listen to and analyze 6,000 calls, Giga-T can analyze those same calls within six hours. That’s a 416x faster analysis with a 98% reduction in cost per transcript analyzed. The insights and subsequent operational adjustments generated by this new flow of data helped achieve a record high save rate of 83% in 2024 within the Retention & Loyalty group.
We’ve used a reporting flywheel between Giga-T and call center operations and observed average compliance rates on new policies in one case increase linearly from 20% to 80% within weeks across all centers.
We’ve used a blend of generative AI and traditional statistical techniques to identify what the best agents say and ask that the worst agents don’t. We did this by using Giga-T in Amazon Bedrock to parse transcripts using linguistic speech act theory. Then, we processed those results with k-modes clustering and logistic regressions in SageMaker.
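A minimal sketch of that second stage, assuming hypothetical speech-act flags per call and the open-source kmodes package (illustrative data, not our production features):

```python
import pandas as pd
from kmodes.kmodes import KModes  # pip install kmodes
from sklearn.linear_model import LogisticRegression

# Hypothetical features: one row per call, categorical speech-act flags that
# Giga-T extracted from the transcript. Labels and data are illustrative.
acts = pd.DataFrame({
    "acknowledged_frustration": ["yes", "no", "yes", "yes", "no", "no"],
    "offered_alternative":      ["yes", "no", "no",  "yes", "no", "yes"],
    "asked_open_question":      ["no",  "no", "yes", "yes", "no", "no"],
})
saved = [1, 0, 0, 1, 0, 1]  # did the agent save the customer?

# Cluster calls by their categorical speech-act profiles.
km = KModes(n_clusters=2, init="Huang", n_init=5, random_state=42)
clusters = km.fit_predict(acts)

# Relate individual speech acts to save outcomes.
X = (acts == "yes").astype(int)
model = LogisticRegression().fit(X, saved)
print(dict(zip(acts.columns, model.coef_[0])))
```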
Clearer visibility into the full semantic context of the call means that we can measure agent performance in a way that’s both more fair for the agent and more aligned financially for the business. For example, not every call received by an Inside Sales agent is a viable, sellable opportunity, and a non-trivial percentage of their calls aren’t even about sales. Sales conversion should use sellable opportunity calls as the denominator, not all calls. But you need generative AI capabilities such as Giga-T to properly identify which calls are sellable opportunities and which aren’t.
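A toy illustration of the denominator effect, with made-up numbers:

```python
import pandas as pd

# Made-up data: conversion looks very different once non-sellable calls are
# excluded from the denominator.
calls = pd.DataFrame({
    "sellable_opportunity": [True, True, True, False, False],
    "sale_made":            [True, True, False, False, False],
})

naive = calls["sale_made"].mean()                                    # 2/5 = 40%
fair = calls.loc[calls["sellable_opportunity"], "sale_made"].mean()  # 2/3 = 67%
print(f"all calls: {naive:.0%}, sellable opportunities only: {fair:.0%}")
```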
In one impromptu analysis after Hurricane Helene, we ran a voice-of-the-customer analysis to surface the most common complaints and questions, then made real-time adjustments to call center policy to better address customer needs in that time of crisis.
Key learnings
We have three pieces of advice for other data scientists on their generative AI journey:
Start-up mentality: Generative AI is an emerging technology that is rapidly evolving and no one really knows exactly when it will slow down or where it will end up. There’s no standard textbook on best practices. You can’t hire someone with decades of experience implementing practical generative AI applications in an enterprise environment because they don’t exist. The only way to really learn in this case is to just try and see what happens. Don’t be afraid to start small, fail fast, iterate quickly, and turn on a dime. Having intrapreneurial upper management like we did was invaluable.
Human-in-the-loop approach: Project success for generative AI needs more human participation when developing inputs and evaluating outputs than you might think. Context is crucial, so get your hands dirty in prompt engineering best practices. LLMs don’t have the offline context of what things mean when stakeholders discuss the results on a Zoom call or how their answers will be used downstream to affect processes and people. The ideal prompt engineer is an excellent written communicator with strong domain expertise. LLMs can be fickle intelligences and seemingly trivial changes in wording and structure within the prompt can dramatically change the accuracy of the outputs. Furthermore, the more the downstream consequences of generative AI affect real people’s lives, the more important it becomes to conduct evaluation tests based on human-determined sources of truth with statistical measures (for example precision, recall, and Cohen’s Kappa).
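For instance, once a human-audited ground truth set exists, the statistical checks are straightforward; a minimal sketch with illustrative labels:

```python
from sklearn.metrics import cohen_kappa_score, precision_score, recall_score

# Illustrative eval: human-audited intents vs. the LLM's predictions.
truth = ["price", "outage", "move", "price", "outage", "price"]
pred  = ["price", "outage", "price", "price", "outage", "move"]

print(precision_score(truth, pred, labels=["price"], average=None, zero_division=0))
print(recall_score(truth, pred, labels=["price"], average=None, zero_division=0))
print(cohen_kappa_score(truth, pred))  # chance-corrected agreement
```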
Partnering with AWS domain experts: The AWS environment is epic in scope, sublime in its potential yet intimidating in its labyrinthine detail. Your AWS solutions architects see the full range of complications that arise across all their clients. They know the best-in-class solutions and the latest information on AWS strengths, limitations, and forthcoming improvements. Collaborating with them in EBA workshops and regular check-ins will yield incredible dividends in getting your generative AI applications on AWS to move the needle within your company.
Getting started
If you’re interested in building a similar solution for your organization, then here are some resources to help you get started:
Learn more about Amazon Bedrock and available FMs:
Explore the AWS services we used:
- Amazon Bedrock
- AWS Step Functions
- AWS Lambda
- Amazon S3
- AWS Glue
- Amazon EventBridge
- Amazon CloudWatch
- Amazon SageMaker
Engage with AWS experts:
- Learn about the AWS Experience-Based Accelerator (EBA) program
- Contact your AWS account team to discuss your use case
- Join the AWS Builder Community to connect with other builders
Access sample resources:
About Frontier Communications
Frontier Communications (NASDAQ: FYBR) is a Fortune 500 telecommunications provider offering internet, TV, and phone services across the United States. To learn more, visit Frontier.com.
