Snap Inc. Uses Amazon EC2 G4 Instances to Deliver Bitmoji TV to Millions
In 2018, Snap Inc. (Snap), known for its Snapchat messaging app, had an intriguing new idea: create a series of animated videos starring each user’s Bitmoji, the personalized cartoon avatar that is Snapchat’s signature feature. Each week, Bitmoji TV would debut new episodes consisting of silly, professionally scripted and animated 3- to 8-minute videos in which Bitmojis of users and their friends were the stars, doing everything from fighting off zombies to competing in a low-gravity “Moonlympics.”
The technical challenge, however, was formidable: how would Snap seamlessly render each video incorporating users’ unique avatars without slowing down the experience and turning viewers away? Snap had initially planned to run these workloads on Amazon Web Services (AWS) using Amazon Elastic Compute Cloud (Amazon EC2) G3 Instances, but it ended up crushing this challenge with next-generation Amazon EC2 G4 Instances.
“With Amazon EC2 G4 Instances versus Amazon EC2 G3 Instances, we were getting a 50 percent boost for a 10 percent higher cost.”
Developing an Architecture for Personalized TV Shows
Since Snap acquired Bitstrips—later rebranded as Bitmoji—in 2016, the cartoon avatars have become a distinctive feature of the app. On average, more than 70 percent of Snapchat's daily active users have their Bitmojis linked to their Snapchat accounts. Snap began tinkering with the concept of user-personalized content with Bitmoji Stories, and in 2019 the company gave the green light to Bitmoji TV, tasking engineers and animators with overcoming the associated compute obstacles.
Snap’s initial testing of Bitmoji TV found that every second of video playtime required about 1 second of processing. That meant a wait of roughly 3 minutes for a 3-minute video, a delay that would drive most users elsewhere. Snap developers needed an efficient, cost-effective way to generate real-time graphics quickly and consistently. “The initial architecture was all CPU rendering and encoding,” says Snap software engineer Brad Kotsopolous. “It would have been expensive to stand up that many instances, so we were exploring some GPU acceleration technologies for our rendering system.”
Snap engineers worked with graphics processing company NVIDIA to develop a unique rendering system. “We were doing both the frame rendering and the video encoding on the GPU,” says Kotsopolous. “We found that it was really promising—at least a 10 times performance boost over the previous CPU implementation.” Bitmoji had run on the AWS Cloud since before being acquired by Snap, so the team was leaning toward Amazon EC2 G3 Instances, which deliver a powerful combination of CPU, host memory, and GPU capacity. But when AWS released the next-generation Amazon EC2 G4 Instances—which provide the latest generation NVIDIA T4 GPUs—Snap knew it wanted to experiment with the new technology.
Finding the Sweet Spot with Amazon EC2 G4 Instances
The key to rendering Bitmoji TV’s unique videos in real time was finding a balance between power and price. “We benchmarked pretty much all of the Amazon EC2 G4 Instance sizes,” says Chirag Gada, also a software engineer at Snap. “So starting from Amazon EC2 G4 large and then up to XL, to 4XL, even 8XL. We were trying to find the sweet spot on the maximum throughput.” Snap found that sweet spot at 4XL.
Snap also worked with AWS to arrive at the right instance configuration and set up the framework required to render user-specific videos. The avatars were all stored in Amazon DynamoDB, a key-value and document database that delivers single-digit millisecond performance at any scale, and the video frames were stored in Amazon Simple Storage Service (Amazon S3), an object storage service that offers industry-leading scalability, data availability, security, and performance. “For every episode and for every segment of that episode,” says Kotsopolous, “we needed to pull data from Amazon DynamoDB portraying what a particular user looks like and from Amazon S3 looking at how to position that user and that particular frame; then we put it all together and passed it to the renderer. We pushed them into Amazon S3, and then we served them to the user through Amazon CloudFront.” Amazon CloudFront is a fast content delivery network service that securely delivers data, videos, applications, and APIs to customers globally with low latency and high transfer speeds.
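The data flow Kotsopolous describes can be sketched in a few lines of Python. This is a minimal illustration, not Snap's implementation: the store classes, the key formats, and the render_segment function are hypothetical stand-ins for Amazon DynamoDB, Amazon S3, and the GPU renderer, kept in memory so the example runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class AvatarStore:
    """Hypothetical in-memory stand-in for the Amazon DynamoDB avatar table."""
    items: dict = field(default_factory=dict)

    def get_avatar(self, user_id: str) -> dict:
        return self.items[user_id]


@dataclass
class ObjectStore:
    """Hypothetical in-memory stand-in for an Amazon S3 bucket."""
    objects: dict = field(default_factory=dict)

    def get(self, key: str):
        return self.objects[key]

    def put(self, key: str, body) -> None:
        self.objects[key] = body


def render_segment(avatar: dict, layout: list) -> bytes:
    """Placeholder for the GPU render-and-encode step (assumed, not Snap's renderer)."""
    frames = [f"{avatar['style']}@{pose}" for pose in layout]
    return "|".join(frames).encode()


def personalize_segment(user_id, episode, segment, avatars, bucket) -> str:
    # 1. What does this user look like? (Amazon DynamoDB in the article)
    avatar = avatars.get_avatar(user_id)
    # 2. How is the avatar positioned in each frame? (Amazon S3 in the article)
    layout = bucket.get(f"layouts/{episode}/{segment}")
    # 3. Combine both and hand them to the renderer.
    video = render_segment(avatar, layout)
    # 4. Store the result where a CDN such as Amazon CloudFront can serve it.
    out_key = f"rendered/{episode}/{segment}/{user_id}.mp4"
    bucket.put(out_key, video)
    return out_key


# Usage with made-up sample data (all identifiers are illustrative).
avatars = AvatarStore({"user-1": {"style": "green-hair"}})
bucket = ObjectStore({"layouts/ep1/seg1": ["wave", "jump"]})
key = personalize_segment("user-1", "ep1", "seg1", avatars, bucket)
```

Passing the stores in as arguments keeps the rendering step independent of any particular storage client, so a real deployment could swap in actual service clients without changing the flow.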
Amazon EC2 G4 Instances delivered a more than tenfold performance improvement over CPU-based solutions, enough to meet the massive demand of new video releases reaching millions of unique users per day. In fact, with Amazon EC2 G4 Instances, 1 second of video now renders in only about 100 ms.
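To put those rates in perspective, the arithmetic below applies them to a 3-minute episode. The two rates come from the figures quoted in this article (about 1 second of processing per second of video on CPU, about 100 ms on Amazon EC2 G4 Instances); the function name and the assumption that render time scales linearly with video length are illustrative.

```python
def render_seconds(video_seconds: float, processing_per_video_second: float) -> float:
    """Estimated render time, assuming cost scales linearly with video length."""
    return video_seconds * processing_per_video_second


episode = 3 * 60  # a 3-minute episode, in seconds

cpu_time = render_seconds(episode, 1.0)  # ~1 s of processing per second of video
gpu_time = render_seconds(episode, 0.1)  # ~100 ms of processing per second of video

print(f"CPU: {cpu_time:.0f} s, GPU: {gpu_time:.0f} s, speedup: {cpu_time / gpu_time:.0f}x")
```

Under these assumptions, a wait of about 3 minutes shrinks to about 18 seconds per personalized episode.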
Proving the Benefits of Amazon EC2 G4 Instances
With Amazon EC2 G4 Instances implemented, Snap saw peak usage of 1,500–3,500 GPUs when a new episode of Bitmoji TV aired each Saturday morning, with a floor of around 100 GPUs at low-traffic times during the week. Handling this kind of workload on premises would have been prohibitively expensive, notes Kotsopolous: “These GPUs themselves are over $1,000. If we had to buy all the hardware, it would be $3 million.” Amazon EC2 Instances enabled Snap to optimize costs by paying only for what it used, spinning instances up and down as needed, and the Amazon EC2 G4 Instances in particular turned out to provide the right balance of cost and performance.
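The on-premises comparison comes down to provisioning for peak. The sketch below uses the figures quoted in this section (a peak of up to 3,500 GPUs, a floor of about 100, and a price of at least $1,000 per GPU). Treating $1,000 as the unit price is a simplifying assumption, and the result is a lower bound that ignores servers, power, and operations.

```python
PEAK_GPUS = 3_500        # upper end of the Saturday-morning peak
FLOOR_GPUS = 100         # approximate low-traffic floor
GPU_UNIT_PRICE = 1_000   # USD; the article says "over $1,000", so this is a lower bound

# Owning the hardware means buying enough GPUs to cover the peak.
upfront_cost = PEAK_GPUS * GPU_UNIT_PRICE

# At the weekly floor, most of that purchased capacity would sit idle.
idle_fraction_at_floor = 1 - FLOOR_GPUS / PEAK_GPUS

print(f"Upfront GPU cost: ${upfront_cost:,}")
print(f"Idle at floor: {idle_fraction_at_floor:.0%}")
```

The upfront figure lands in the same range as the roughly $3 million Kotsopolous cites, and nearly all of that capacity would be unused outside the weekly peak.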
“With Amazon EC2 G4 Instances versus Amazon EC2 G3 Instances,” says Kotsopolous, “we were able to get a 50 percent boost in how much traffic we could handle at the same time. The instances themselves cost a bit more. But we were getting a 50 percent boost for a 10 percent higher cost”—a cost savings of 36 percent. This boost in traffic capacity, together with a 45 percent decline in latency, meant that Bitmoji TV viewers were much less likely to see the dreaded spinning wheel.
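The price-performance trade-off in the quote can be worked through directly. The sketch below normalizes Amazon EC2 G3 cost and capacity to 1.0 and applies only the two rounded figures from the quote (50 percent more traffic, 10 percent higher cost); the variable names are illustrative. On those two numbers alone, cost per unit of traffic falls by roughly 27 percent, so the 36 percent savings cited above presumably reflects unrounded measurements or factors not detailed in this article.

```python
g3_cost, g3_capacity = 1.0, 1.0     # baseline, normalized
g4_cost = g3_cost * 1.10            # 10 percent higher instance cost
g4_capacity = g3_capacity * 1.50    # 50 percent more traffic per instance

# Cost to serve one unit of traffic on each instance family.
g3_cost_per_unit = g3_cost / g3_capacity
g4_cost_per_unit = g4_cost / g4_capacity

reduction = 1 - g4_cost_per_unit / g3_cost_per_unit
print(f"Cost per unit of traffic falls by {reduction:.0%}")
```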
Making Users the Stars of the Show
With Amazon EC2 G4 Instances, Snap was able to create a fun, bold kind of personalized entertainment that rendered quickly enough for millions to enjoy simultaneously while keeping compute costs down. The tenth and final episode of Bitmoji TV season one aired on April 4, 2020, leaving Snap excited to build on its success. “Bitmoji TV tweets seem to be overwhelmingly positive,” says Kotsopolous. “A lot of people are talking about how they enjoy it.” After all, who doesn’t want to be the star of their own show?
About Snap Inc.
Founded in 2011, Snap Inc. is a social media company that empowers people to live in the moment. Its flagship app, Snapchat, has over 200 million global users. The company acquired Bitmoji in 2016, integrating its trademark cartoon avatars into the Snapchat platform.
Benefits of AWS
· Boosted traffic capacity by 50%
· Enabled a 45% reduction in latency
· Enabled easy workload scalability from 100 instances to 3,500
AWS Services Used
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.
Amazon EC2 G4 Instances deliver the industry’s most cost-effective and versatile GPU instance for deploying machine learning models in production and graphics-intensive applications.
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale. It's a fully managed, multiregion, multimaster, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications.