AWS Partner Network (APN) Blog
How PFT CLEAR Vision Cloud Leverages Amazon EKS for Optimized and Sustainable AI Processing
By Prathap Simha, Sr. Technology Architect – Prime Focus Technologies
By Aditya Jha, Product Leader and VP – Prime Focus Technologies
By S. Jayasundar, Sr. Partner Solution Architect – AWS
By Shad Hashmi, M&E Partner Lead, APJ – AWS
Prime Focus Technologies (PFT) is an AWS Partner and creator of the Supply Chain Automation platform CLEAR for media and entertainment (M&E) customers.
PFT’s artificial intelligence (AI) module, CLEAR Vision Cloud, enables streaming platforms, studios, and broadcasters to automatically process, manage, and monetize their content supply chain using cloud-based AI and machine learning (ML) workflows and media services.
CLEAR generates over 1.2 million AI actions annually, accelerating speed to market, enhancing accuracy, and increasing overall supply chain efficiencies. It manages 40PB of content and processes more than 1.5 million hours of content annually for leading media studios, broadcast, over-the-top (OTT), and sports agencies.
In this post, PFT and Amazon Web Services (AWS) discuss how AI/ML workflows are deployed at scale to meet service-level agreements (SLAs) for different use cases across the media supply chain.
The solution leverages a blend of AI/ML models deployed using Amazon Elastic Kubernetes Service (Amazon EKS) with a zero-node design. This model results in a >90% increase in compute efficiency and processor utilization, while maintaining customer SLA-mandated delivery timelines.
Content Processing Challenges
Content is being consumed across geographic boundaries in higher volumes than ever before, and is being delivered to platforms such as connected TVs, direct-to-consumer applications, and traditional linear television, each with its own mode of monetization.
This leads to increasing demand for media supply chain solutions for segmentation, assisted promo-creation, audio/visual (AV) conformance, localization (subtitle creation), foreign-language mastering, and compliance, amongst many other use cases.
Solutions that address these use cases require that each customer is correctly matched to an optimal AI/ML model, and regular updates are essential as models learn and are optimized. This is a scaling, deployment, and operational challenge, as each AI/ML model has different compute requirements and requires its own virtual machine deployed across multiple nodes.
Many media AI/ML services are bound to a graphics processing unit (GPU); others run on CPU and are memory-intensive. Run times vary widely, from two minutes to multiple hours.
Customer demand for content processing is not evenly distributed over time, leading to processing peaks. However, SLAs often demand predictable processing times irrespective of content arrival volumes. For example, a 60-minute asset requiring segmentation must be processed and delivered within three minutes with 95%+ accuracy in frame-level detection.
This requires solutions to scale up during content bursts. A strategy based on over-provisioning a fleet of servers, at peak or even forecast average capacity, is sub-optimal and not environmentally sustainable. Based on PFT’s data analysis, this can lead to under-utilization in almost all scenarios with a commercially unviable per-minute processing cost for media, and hence a corresponding lower adoption rate by enterprise media customers.
It is critical that AI/ML solutions are continually redeployed as they are enhanced, that they are efficiently matched to customer requirements, and that processing engines are provisioned and scaled optimally to manage the cost per minute of media and turnaround times.
The CLEAR Solution
PFT’s CLEAR Vision Cloud AI platform is purpose-built for the M&E industry. The core application runs on AWS and solves for media use cases across the supply chain.
CLEAR Vision Cloud integrates best-in-class AI engines like Amazon Rekognition’s celebrity recognition feature. It has 40+ native PFT proprietary AI models, which also run on AWS, and a unique machine wisdom layer that is focused on harnessing the best quality data. Machine wisdom synthesizes data generated by multiple AI engines to provide contextual and actionable information.
PFT leveraged Amazon EKS, with inputs from AWS Solutions Architects, to deploy a zero-node design that delivers optimal processing cost-per-minute with a deterministic delivery window.
Zero-Node Design on Amazon EKS
All of PFT’s AI/ML workflows are containerized and operate as microservices. They are deployed as separate Kubernetes batch jobs with associated node pool profiles. These batch job nodes are labelled inside an Amazon EKS cluster, but the node pools are not instantiated. This allows for managed scale-out on load using a stateless architecture.
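As a rough illustration of this arrangement, the sketch below builds a Kubernetes batch Job manifest that starts with a parallelism of zero and targets a labelled node pool. It is not PFT's actual configuration: the service name, image, `node-pool` label key, and resource requests are all hypothetical placeholders.

```python
import json


def build_zero_node_job(service: str, image: str, node_pool: str) -> dict:
    """Return a batch/v1 Job manifest that starts with no pods running.

    All names and values here are hypothetical and for illustration only.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"{service}-worker"},
        "spec": {
            # Zero parallelism: no pods are created, so no node is needed yet.
            "parallelism": 0,
            "template": {
                "spec": {
                    # Pods may only land on nodes carrying this label
                    # (assumed label key; the node pool starts empty).
                    "nodeSelector": {"node-pool": node_pool},
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": service,
                            "image": image,
                            # Assumed compute profile for this microservice.
                            "resources": {
                                "requests": {"cpu": "4", "memory": "16Gi"}
                            },
                        }
                    ],
                }
            },
        },
    }


manifest = build_zero_node_job(
    "segmentation", "example.com/segmentation:latest", "gpu-segmentation"
)
print(json.dumps(manifest, indent=2))
```

Because the Job is defined with zero parallelism, it costs nothing until something raises that value, which is the role the job scaler plays.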
Figure 1 – PFT zero-node deployment architecture.
A single-node custom job scaler is deployed in a multi-zone availability model. It’s configured with kubectl and runs as a pod (the smallest deployable unit in Kubernetes, made up of one or more containers) inside the same Amazon EKS cluster.
The job scaler is a stateless service which fetches jobs from the database queue. This stateless job scaler replaced two dedicated, under-utilized nodes that catered to incoming requests. The stateless scaler also automatically manages peak traffic where, previously, manual scale-out and scale-back intervention was required.
The stateless scaler regularly polls the database for the number of jobs waiting on an AI/ML microservice. Based on the job count from the database, the job scaler increases the parallelism of the batch job for the particular microservice. A pod is spun up in pending state with no corresponding active node.
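The polling step might look something like the sketch below. It is a minimal illustration under stated assumptions, not PFT's scaler: the job naming scheme (`<service>-worker`), the `max_parallelism` cap, and the shape of the queue data are hypothetical. The commands are only built and printed, not executed; `kubectl patch` with a merge patch is one way such a change could be applied.

```python
import json
from typing import Dict, List


def desired_parallelism(open_jobs: int, max_parallelism: int) -> int:
    """Map a queue depth to a Job parallelism, clamped to [0, max_parallelism].

    Assumption: open_jobs includes jobs still being processed, so that
    active workers are kept until their work is finished.
    """
    return max(0, min(open_jobs, max_parallelism))


def kubectl_patch_command(job_name: str, parallelism: int) -> List[str]:
    """Build (but do not run) a kubectl command that sets spec.parallelism."""
    patch = json.dumps({"spec": {"parallelism": parallelism}})
    return ["kubectl", "patch", "job", job_name, "--type", "merge", "-p", patch]


def reconcile(
    queue_depths: Dict[str, int],
    current: Dict[str, int],
    max_parallelism: int,
) -> List[List[str]]:
    """Return the patch commands needed to match parallelism to queue depth."""
    commands = []
    for service, open_jobs in queue_depths.items():
        target = desired_parallelism(open_jobs, max_parallelism)
        if target != current.get(service, 0):
            # Hypothetical naming scheme: one Job per microservice.
            commands.append(kubectl_patch_command(f"{service}-worker", target))
    return commands


# Example: three segmentation jobs arrive; the subtitle queue has drained.
commands = reconcile(
    queue_depths={"segmentation": 3, "subtitles": 0},
    current={"segmentation": 0, "subtitles": 2},
    max_parallelism=10,
)
for cmd in commands:
    print(" ".join(cmd))
```

Keeping the decision logic in a pure function means the scaler itself holds no state: each poll derives the target parallelism from the database alone.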
The Kubernetes auto-scaler kicks in and provisions a node for the batch job pod. Amazon EKS ensures the node pools are spun up on the fly based on the compute requirements specified in the configuration. The batch job then pulls a specific record from the database, independent of the job scaler, and locks the record to avoid a race condition.
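The post does not name the database or the locking mechanism, so the following is only one possible way to claim a record without a race: a conditional update that succeeds for exactly one worker. The sketch uses Python's built-in `sqlite3` module purely so it is self-contained; the `jobs` table, its columns, and the worker IDs are hypothetical.

```python
import sqlite3
from typing import Optional


def claim_next_job(
    conn: sqlite3.Connection, service: str, worker_id: str
) -> Optional[int]:
    """Claim one waiting job for `service`; return its id, or None if empty."""
    while True:
        row = conn.execute(
            "SELECT id FROM jobs WHERE service = ? AND status = 'waiting' "
            "ORDER BY id LIMIT 1",
            (service,),
        ).fetchone()
        if row is None:
            return None  # Queue is empty for this microservice.
        # Conditional update: it only matches if the row is still 'waiting',
        # so two workers that selected the same row cannot both claim it.
        cur = conn.execute(
            "UPDATE jobs SET status = 'locked', locked_by = ? "
            "WHERE id = ? AND status = 'waiting'",
            (worker_id, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
        # Another worker won the race; loop and try the next waiting job.


# Hypothetical schema and data, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs ("
    "id INTEGER PRIMARY KEY, service TEXT, status TEXT, locked_by TEXT)"
)
conn.executemany(
    "INSERT INTO jobs (service, status) VALUES (?, 'waiting')",
    [("segmentation",), ("segmentation",)],
)
conn.commit()

first = claim_next_job(conn, "segmentation", "pod-a")
second = claim_next_job(conn, "segmentation", "pod-b")
third = claim_next_job(conn, "segmentation", "pod-c")
print(first, second, third)  # 1 2 None
```

Each worker pod claims its own record directly, so the job scaler never needs to assign work or track which pod holds which job.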
Once the job is complete, the job scaler reduces the parallelism back (potentially to zero) as determined by the job queue. There is currently less than 10% overhead for scale-in on job completion, due to the wait time of 15 minutes configured in Amazon EKS. This wait time is being continually monitored and will be optimized for better efficiency.
Conclusion
PFT’s CLEAR Vision Cloud AI platform provides an ultra-efficient deployment and scaling mechanism for a portfolio of AI/ML models based on customer workloads.
The platform matches customer delivery volume with compute infrastructure, exceeds customer delivery expectations, and delivers a product at a commercially viable processing cost per minute of media. This allows enterprise media customers to deliver globally and monetize their content on multiple platforms.
PFT has seen compute utilization of over 90% using Amazon EKS while reducing job overhead to <10% with zero-node deployments, leading to these benefits:
- Running AI workloads faster, at optimized costs even when utilizing large instance sizes.
- Just-in-time scaling out and in to support content bursts and optimizing compute utilization.
- Significantly reduced development, deployment, and support overhead.
- Optimal compute utilization, leading to sustainable operations and a lower carbon footprint by reducing idle time.
To learn more, check out PFT’s offerings on AWS Marketplace.
Prime Focus Technologies – AWS Partner Spotlight
Prime Focus Technologies (PFT) is an AWS Partner that built and operates the enterprise resource planning software CLEAR for media and entertainment customers.