Codeway Saves 48% on Compute Costs for Generative AI Using Amazon EC2 G5 Instances
Learn how Codeway optimized price performance for its generative AI application, Wonder, using NVIDIA GPU-powered Amazon EC2 G5 Instances.
With over 140 million users in more than 160 countries, Codeway has made a significant impact in the world of mobile applications and games through the power of generative artificial intelligence. As its user base grew, Codeway sought to improve the scalability, elasticity, and cost efficiency of the workloads that underpin this powerful technology.
After receiving recommendations from Amazon Web Services (AWS), Codeway chose to adopt Amazon Elastic Compute Cloud (Amazon EC2) G5 Instances powered by NVIDIA A10G Tensor Core GPUs, high-performance GPU-based instances for machine learning and graphics-intensive applications, to power its image-generation app, Wonder. By optimizing Wonder’s infrastructure on AWS, Codeway has maintained optimal performance, reduced costs compared with its previous compute strategy, and scaled effectively to help millions of content creators bring their ideas to life.
Opportunity | Scaling Compute for Generative AI while Lowering Costs for Codeway
Based in Istanbul, Turkey, Codeway develops mobile applications and games powered by cutting-edge technologies, particularly generative AI. Its Wonder application turns words into digital images; users enter words or sentences, and Wonder transforms those inputs into artwork by running PyTorch-based Stable Diffusion image-generation models on AWS. Depending on their subscription, users can then download a high-quality or low-quality version of the image.
Wonder has been downloaded by more than 28.3 million users, so Codeway strives to get the most out of its compute and GPU resources. Wonder’s infrastructure is distributed across various cloud providers in multiple regions. For artificial intelligence (AI) inference workloads, Codeway was using NVIDIA A100 Tensor Core GPUs hosted on one of these providers. However, it encountered GPU capacity issues that affected performance.
“These workloads require very GPU-intensive hardware. We’re also adding millions of users every month, so our demand for GPUs will only increase,” says Ugur Arpaci, lead DevOps engineer at Codeway. “As we transition from managing hundreds of GPUs to thousands, we wanted to optimize for cost and performance and find a good strategy for scalability.”
Amazon EC2 offers a broad and deep compute portfolio, with over 600 instances and a choice of the latest processor, storage, networking, operating system, and purchase model options to help customers best match the needs of their workloads. While Codeway was searching for ways to optimize its compute, it discovered an ideal solution: Amazon EC2 G5 Instances powered by NVIDIA A10G Tensor Core GPUs. Although similar GPUs were available from other cloud providers, those providers could not match the availability and scalability of AWS.
“The AWS team suggested that we could meet our price-performance goals by adopting Amazon EC2 G5 Instances powered by NVIDIA A10G Tensor Core GPUs,” says Arpaci. “We started to test this out, and we saw good results.”
“On AWS, we can segment our workloads to provide better performance for our users.”
Ugur Arpaci, Lead DevOps Engineer, Codeway
Solution | Running PyTorch-Based Stable Diffusion Models for Wonder on AWS within 3.5 Months
After analyzing the price performance of Amazon EC2 G5 Instances, Codeway worked closely alongside the AWS team to complete the onboarding process. “We were always in contact with the experts at AWS,” says Arpaci. “We followed their guidance and then performed tests and calculated costs on our side. For certain models, we realized that we could gain the most benefits by deploying our application on Amazon EC2 G5 Instances. We then shared our results and established a very positive feedback loop.”
The onboarding process was quick and seamless, and within 3.5 months, Codeway was running production workloads for Wonder on AWS. It now uses Amazon EC2 G5 Instances with A10G GPUs to deploy nearly all the AI inference workloads for the free version of Wonder. To generate full high-definition images for paid subscribers, Codeway uses the more powerful A100 GPUs, which generate higher-quality content in a shorter amount of time. By using A10Gs and A100s, the company can adhere to all its service-level agreements for output times.
“We knew that the A10Gs were less powerful than the A100s, but some workloads don’t require as much GPU performance,” says Arpaci. “Now, we can off-load a lot of these workloads from our more powerful GPUs, which are reserved for premium user features, such as high-quality image generation.”
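In a Kubernetes setup like the one Codeway describes, this kind of workload segmentation is typically expressed with node selectors that pin a deployment to a specific instance type. The sketch below is illustrative only; the names, label values, replica count, and container image are hypothetical, not Codeway’s actual configuration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wonder-free-tier-inference   # hypothetical name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: wonder-free-tier-inference
  template:
    metadata:
      labels:
        app: wonder-free-tier-inference
    spec:
      nodeSelector:
        # Schedule free-tier inference only onto G5 (A10G) nodes
        node.kubernetes.io/instance-type: g5.xlarge
      containers:
        - name: inference
          image: example.com/wonder-inference:latest   # illustrative image
          resources:
            limits:
              nvidia.com/gpu: 1   # one A10G GPU per pod
```

With a manifest along these lines, free-tier image generation lands on A10G-backed nodes, leaving A100 capacity for premium workloads.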
To further enhance cost efficiency and performance, Codeway has adopted clusters on Amazon Elastic Kubernetes Service (Amazon EKS)—a managed service to run Kubernetes in the AWS Cloud and on-premises data centers—to dynamically spin Amazon EC2 G5 Instances up and down as required. A custom automatic-scaling solution has been deployed on each Amazon EKS cluster, which intelligently requests additional instances when demand rises.
To manage instances, Codeway relies on Karpenter, an open-source node-provisioning solution. Karpenter determines and provisions the appropriate instance types based on Codeway’s needs. “Karpenter actually selects the required number of instances for us and deploys them, and then we deploy the required workload on top of that,” says Arpaci. “The entire process is automated, which simplifies a lot of factors from an operational perspective.”
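Karpenter’s provisioning behavior is driven by a NodePool resource. A generic sketch of how G5 capacity might be constrained follows; it uses the `karpenter.sh/v1` API, and all names, limits, and values are illustrative assumptions rather than Codeway’s actual configuration:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-inference
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        # Restrict provisioning to G5 (NVIDIA A10G) instances
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["g5"]
        # Allow Spot capacity, falling back to On-Demand
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
  # Cap the total GPU capacity this pool may provision
  limits:
    nvidia.com/gpu: 64
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
```

Given a pool like this, Karpenter launches G5 nodes when pods are unschedulable and consolidates or terminates them as demand drains, matching the fully automated provisioning Arpaci describes.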
Outcome | Reducing Compute Costs by 48% to Effectively Scale Generative AI
The adoption of A10G GPUs featured in Amazon EC2 G5 Instances has been instrumental in Codeway’s journey toward a more cost-efficient, robust, and scalable architecture. The company can effectively scale to meet spikes and dips in usage, responding to the demands of users around the world. Now, millions of Wonder users enjoy an enhanced experience with applications and games.
“With Amazon EC2 G5 Instances powered by NVIDIA A10G Tensor Core GPUs, we can process a large subset of our AI inference workloads,” says Arpaci. “By using A10G GPU accelerators on AWS, we can segment our workloads to provide better performance for our users.”
On AWS, Codeway maintains high performance and availability at an optimal cost. By rightsizing Amazon EC2 G5 Instances and taking advantage of Amazon EC2 Spot Instances, which run fault-tolerant workloads at discounts of up to 90 percent compared with On-Demand prices, the company reduced its compute costs by 48 percent compared to running all its workloads on A100 GPUs. Wonder’s free version aims to convert users into paid subscribers; by lowering compute costs for the free offering, Codeway can acquire more subscribers for the same spend.
Looking forward, Codeway will use AWS services to remain at the forefront of generative AI. It plans to deepen its engagement with AWS in the future and adopt new services to power other components of its infrastructure. For example, Codeway is evaluating several AWS services, such as AWS Batch—a service that facilitates batch processing, machine learning model training, and analysis at scale—to standardize its AI training workloads.
On AWS, Codeway has made big advances toward successfully productizing generative AI. Thanks to this transformative journey, its adaptable and resilient AI framework is ready to support its growing user base.
Headquartered in Istanbul, Turkey, Codeway launches mobile applications powered by generative artificial intelligence and other cutting-edge technologies. Since 2020, over 140 million users in more than 160 countries have downloaded its applications.
AWS Services Used
Amazon Elastic Compute Cloud (Amazon EC2) offers the broadest and deepest compute platform, with over 700 instances and choice of the latest processor, storage, networking, operating system, and purchase model to help you best match the needs of your workload.
Amazon EC2 G5 Instances
Amazon EC2 G5 instances are the latest generation of NVIDIA GPU-based instances that can be used for a wide range of graphics-intensive and machine learning use cases.
Amazon Elastic Kubernetes Service (Amazon EKS) is a managed Kubernetes service to run Kubernetes in the AWS Cloud and on-premises data centers.
AWS Batch lets developers, scientists, and engineers efficiently run hundreds of thousands of batch and ML computing jobs while optimizing compute resources, so you can focus on analyzing results and solving problems.
Organizations of all sizes across all industries are transforming their businesses and delivering on their missions every day using AWS. Contact our experts and start your own AWS journey today.