Unitary Scales AI Moderation to 26 Million Videos Daily with Amazon EKS
Discover how Unitary scaled their AI moderation solution, managing over 1,000 nodes while reducing costs by 50–70% with Amazon EKS and Karpenter.
Key Outcomes

50%–70% reduction in overall costs over 18 months
80% reduction in container boot times for fast response
100 million users served effortlessly across digital platforms
1,000+ nodes managed by a 3-person team at peak traffic

Overview
Software startup Unitary exists to make the internet safer, helping improve lives and create a fairer world. Built on Amazon Web Services (AWS), the company uses artificial intelligence (AI), specifically machine learning (ML), to help its customers with online content moderation. With the daily surge of online content requiring it to process millions of images and videos, Unitary needed a scalable solution to manage growing demand.
On AWS, Unitary helps platforms with over 100 million combined users combat toxic content while minimizing the impact on human moderators. To meet this scale, Unitary chose to run its ML inference workloads on Amazon Elastic Kubernetes Service (Amazon EKS), a managed service for running Kubernetes on AWS.

About Unitary
Based in the United Kingdom, software company Unitary aims to make the internet safer. It builds solutions that blend human expertise and artificial intelligence for moderating user-generated content quickly, accurately, and cost-efficiently.

As a startup, we wanted to rapidly scale our product, so we needed to build a fast, flexible, and highly scalable platform. On AWS, we did just that.
Sasha Haco
CEO, Unitary

Architecture Diagram
Unitary's customers submit content for assessment by generating an API request, which is received by a microservice running on Amazon EKS and fronted by a load balancer. The microservice assigns a job ID to the content and confirms to the customer that the request will be processed asynchronously.

The job is then published to an Amazon SNS topic, which sends the message to a subscribed SQS queue. This queue holds details of all the content awaiting classification. This architecture enables error handling, and the queue depth is used as the basis for autoscaling the services running on EKS.

The media processing microservice retrieves a job from this queue and assesses the type of content that has been submitted. If required, it breaks videos into frames and extracts the audio stream. The relevant inference microservices, which run Unitary's machine learning models, are then invoked on the different modalities of the content: the image/video frame (Triton inference server), any text extracted from the visuals (OCR server), and the video audio stream (audio inference server), as appropriate.

The media processing microservice then aggregates the outputs from the inference microservices, classifies the content against customer content moderation policies, and summarizes the results. This summary includes a safety score or risk level for each policy category. The results summary is added to another SQS queue, via an SNS topic, to be sent back to the customer via a webhook. The results are also persisted into a Postgres database by the results writer microservice, acting as a system of record for the content classification outcomes.
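The ingest and autoscaling steps above can be sketched in Python with boto3. This is an illustrative sketch, not Unitary's actual code: the function names, message fields, and topic/queue identifiers are assumptions, while the `publish` and `get_queue_attributes` calls are standard boto3 APIs for SNS and SQS.

```python
import json
import uuid


def build_job_message(content_url: str, content_type: str) -> dict:
    """Assign a job ID and build the message the ingest microservice
    would publish for asynchronous processing (fields are illustrative)."""
    return {
        "job_id": str(uuid.uuid4()),
        "content_url": content_url,
        "content_type": content_type,  # e.g. "image" or "video"
    }


def publish_job(sns_client, topic_arn: str, message: dict) -> str:
    """Publish the job to the SNS topic; the subscribed SQS queue holds
    it until a media processing worker picks it up. Returns the job ID
    so the API can immediately acknowledge the customer's request."""
    sns_client.publish(TopicArn=topic_arn, Message=json.dumps(message))
    return message["job_id"]


def queue_depth(sqs_client, queue_url: str) -> int:
    """Read the approximate number of waiting jobs -- the metric the
    case study describes as the basis for autoscaling on EKS."""
    attrs = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    return int(attrs["Attributes"]["ApproximateNumberOfMessages"])
```

In production, `sns_client` and `sqs_client` would be created with `boto3.client("sns")` and `boto3.client("sqs")`, and the queue-depth metric would be fed to an autoscaler rather than polled by hand.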