Overview
For streaming services with millions of global subscribers, even one minute of downtime can mean lost revenue and eroded trust. Warner Bros. Discovery (WBD), a leading global media and entertainment company, needed to modernize its observability infrastructure to support the rapid growth of its streaming audience. By building a custom observability solution on Amazon Web Services (AWS), WBD unified its monitoring capabilities into a single source of truth, delivering consistent performance and reliability at massive scale.
About Warner Bros. Discovery
Warner Bros. Discovery (WBD) delivers storytelling across streaming services, television networks, and film studios. With brands including HBO, Discovery Channel, Warner Bros., and CNN, WBD reaches audiences in over 220 countries and territories.
Opportunity | Using AWS to create an observability foundation for WBD
In 2022, Discovery and WarnerMedia (including Warner Bros., HBO, and Turner Broadcasting) merged to form WBD. The legacy companies’ streaming services had evolved with different technology approaches, although each already ran on AWS. Internally, WBD developed BOLT (Best of Legacy Technology) to establish operational excellence across its global streaming properties.
Observability, the practice of understanding system health and troubleshooting issues, had also evolved differently at each legacy brand. Engineers on different teams used a variety of third-party tools that were not integrated with one another. When issues arose, teams had to check multiple systems to piece together the full picture.
To provide a seamless experience for millions of subscribers, WBD required a unified approach that could support both scale and speed. The system would be responsible for processing billions of metrics and logs while remaining cost-effective and operationally efficient.
Solution | Creating a single source of truth for monitoring
Working with the AWS team, WBD set out to create a solution that would serve as a single source of truth for all engineering teams. At the core of the metrics infrastructure, WBD deployed Amazon Managed Service for Prometheus, which provides highly available, secure, and managed monitoring for containerized systems. This service acts as the primary engine for collecting, storing, and querying performance data from containerized applications—including streaming quality metrics, backend service health, and infrastructure usage.
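As a rough illustration of what querying such a workspace involves, the sketch below builds the URL for a Prometheus instant query against an Amazon Managed Service for Prometheus workspace. The workspace ID, the metric name, and the PromQL expression are hypothetical and not taken from WBD's setup. A real request must also be signed with AWS Signature Version 4, which is omitted here.

```python
from urllib.parse import urlencode


def amp_query_url(region: str, workspace_id: str, promql: str) -> str:
    """Build an instant-query URL for a managed Prometheus workspace.

    The workspace exposes the standard Prometheus HTTP API under
    /api/v1/query. The returned URL is unsigned; SigV4 signing is
    required before sending it and is not shown in this sketch.
    """
    base = f"https://aps-workspaces.{region}.amazonaws.com/workspaces/{workspace_id}"
    return f"{base}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # Hypothetical workspace ID and metric name, for illustration only.
    print(
        amp_query_url(
            "us-east-1",
            "ws-example",
            "sum(rate(http_requests_total[5m])) by (service)",
        )
    )
```

Keeping URL construction separate from signing and sending makes the query logic easy to test without network access.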
WBD designed an architecture that spanned North America, Europe, and Asia Pacific, with three AWS Regions per continent and three Availability Zones per Region for redundancy and resilience. Metrics queries needed to span all Regions simultaneously so that the company could get a complete picture of the platform’s global health and quickly diagnose issues. To aggregate data and present a unified view, WBD used promxy, an open source aggregating proxy for Prometheus, as the query layer, together with the open source dashboarding tool Grafana.
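The idea of fanning a query out to every Region and presenting one merged answer can be sketched as follows. This is a simplified illustration of the concept, not promxy's actual implementation: it takes instant-vector results in the Prometheus HTTP API shape (`metric` labels plus a `[timestamp, value]` pair), tags each series with the Region it came from, and keeps the newest sample when two sources report the same label set. The Region names and label values in the example are assumptions.

```python
from typing import Dict, List


def merge_instant_vectors(per_region: Dict[str, List[dict]]) -> List[dict]:
    """Merge per-Region instant-vector results into one deduplicated list."""
    merged: Dict[tuple, dict] = {}
    for region, series_list in per_region.items():
        for series in series_list:
            labels = dict(series["metric"])
            # Add a region label only if the series does not already carry one.
            labels.setdefault("region", region)
            key = tuple(sorted(labels.items()))
            ts, value = series["value"]
            current = merged.get(key)
            # On identical label sets, keep the sample with the newest timestamp.
            if current is None or ts > current["value"][0]:
                merged[key] = {"metric": labels, "value": [ts, value]}
    # Sort by label set so the output order is deterministic.
    return [merged[key] for key in sorted(merged)]


if __name__ == "__main__":
    results = merge_instant_vectors(
        {
            "us-east-1": [{"metric": {"__name__": "up", "job": "api"}, "value": [100, "1"]}],
            "eu-west-1": [{"metric": {"__name__": "up", "job": "api"}, "value": [101, "0"]}],
        }
    )
    for series in results:
        print(series["metric"]["region"], series["value"][1])
```

Adding the Region as a label keeps otherwise identical series distinguishable, so a dashboard can show global health and still narrow down to one Region.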
For log management, WBD created a log aggregation pipeline using Amazon Data Firehose, which reliably loads near real-time streams into data lakes, warehouses, and analytics services. WBD uses this service to stream log data from microservices and client devices into Amazon OpenSearch Service, a managed service that lets developers run and scale OpenSearch clusters. Like the metrics infrastructure, the logging system uses a Multi-AZ and multi-Region deployment. The system also includes an aggregation layer that is powered by the cross-cluster search capability in Amazon OpenSearch Service to consolidate data for searching and analysis.
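Cross-cluster search addresses remote indexes as `connection-alias:index-pattern`, so a single request can cover several clusters. The sketch below builds the request path and query body for such a search. The connection aliases, the index pattern, and the `message` and `@timestamp` field names are assumptions for illustration, not details of WBD's deployment, and sending the request (with authentication) is left out.

```python
import json
from typing import Dict, List, Tuple


def cross_cluster_search(
    aliases: List[str], index_pattern: str, message: str, minutes: int = 15
) -> Tuple[str, Dict]:
    """Return (path, body) for a search spanning several remote clusters."""
    # Each target has the form "<connection alias>:<index pattern>".
    targets = ",".join(f"{alias}:{index_pattern}" for alias in aliases)
    body = {
        "size": 50,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "must": [{"match": {"message": message}}],
                # Restrict the search to the most recent time window.
                "filter": [{"range": {"@timestamp": {"gte": f"now-{minutes}m"}}}],
            }
        },
    }
    return f"/{targets}/_search", body


if __name__ == "__main__":
    # Hypothetical connection aliases and index pattern.
    path, body = cross_cluster_search(["us-logs", "eu-logs"], "app-logs-*", "playback error")
    print("GET", path)
    print(json.dumps(body, indent=2))
```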
WBD also uses Amazon CloudWatch, a service for observing and optimizing workloads at nearly any scale, to collect metrics from AWS services and export them to Amazon Managed Service for Prometheus. As a result, consolidated dashboards display AWS service metrics and custom application metrics side by side. To make that infrastructure accessible to hundreds of engineers, WBD implemented a self-service model based on infrastructure-as-code principles: engineers define dashboards and alerts in code repositories. WBD also built a custom cost-attribution system that tracks usage by individual business service, helping teams optimize their observability footprint.
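The case study does not describe how the cost-attribution system works. One simple approach, shown here purely as an assumption, is to split a shared observability bill across business services in proportion to the number of metric samples each one ingests. The service names and figures are made up for the example.

```python
from typing import Dict


def attribute_cost(samples_by_service: Dict[str, int], total_cost: float) -> Dict[str, float]:
    """Split total_cost across services in proportion to ingested samples."""
    total_samples = sum(samples_by_service.values())
    if total_samples == 0:
        # Nothing was ingested, so there is no usage to attribute.
        return {service: 0.0 for service in samples_by_service}
    return {
        service: round(total_cost * count / total_samples, 2)
        for service, count in samples_by_service.items()
    }


if __name__ == "__main__":
    # Hypothetical usage figures and monthly cost.
    usage = {"playback": 600, "search": 300, "billing": 100}
    for service, cost in attribute_cost(usage, 1000.0).items():
        print(f"{service}: {cost:.2f}")
```

Because each share is rounded to two decimals, the attributed amounts can differ from the total by a few cents; a production system would need an explicit rule for that remainder.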
Outcome | Achieving operational excellence by using AWS
By consolidating its observability infrastructure on AWS, WBD transformed how its engineering teams monitor and maintain streaming services at a global scale. The unified solution reduced the number of observability systems by 50 percent. Engineers can now troubleshoot issues faster without switching between different tools or collecting data from multiple sources.
“Using AWS, we’ve improved our cost efficiency and operational excellence significantly,” says Hans Robert, director of observability for the direct-to-consumer function at WBD. “We have 30 percent more customers than we had when we launched HBO Max, and we can seamlessly support them all.”
The multi-Region deployment provides the required resilience and scale to handle massive traffic spikes during major events, such as show premieres and live sports broadcasts, without performance trade-offs. WBD is also exploring the use of AI to help tech teams automate root-cause analysis, further reducing the manual effort to sift through observability data.
“In observability, you get woken up at all hours—not because something is wrong with the business-facing system but because your observability solution is too complicated to manage,” says Robert. “Now, that complexity is gone.”