
PyTorch on AWS Customers

NEC

NEC is developing a range of AI technologies, including biometric identification, image recognition, video analysis, language modeling (generative AI), and optimal planning and control. NEC also ranks 10th worldwide in the number of papers accepted at international machine learning conferences, reflecting its strong R&D capabilities. For these advanced AI technology developments, researchers train models using PyTorch on NEC's largest AI supercomputer in Japan.

NEC uses a dedicated AWS environment located within NEC to run inference on a diverse set of PyTorch AI models trained on the AI supercomputer. Specifically, we utilize Amazon SageMaker and Triton Inference Server to perform secure and on-demand AI inference. This allows us to operate and evaluate many PyTorch AI models in a secure, low-cost, and flexible environment.

Kitano Takatoshi, Director, Global Innovation Strategy Department, Deep Learning Platform Development Group


Aillis

Aillis is a medical company that focuses on the development, manufacturing, and distribution of pharmaceuticals, medical devices, and regenerative medicine products in Japan and internationally. They are using Artificial Intelligence (AI) technology to develop medical devices for medical institutions and doctors.

Our first product, nodoca™, is an AI camera that uses a diagnostic AI model to predict influenza from pharyngeal images and other clinical information. The AI camera is connected to Amazon EC2 P3 instances via Wi-Fi and will expand the capabilities of telemedicine. Recently, we obtained approval to market this as a medical device in Japan. nodoca will be launched in December 2022 and is reimbursable through the national health insurance program. Our model is a large-scale ensemble AI powered by dozens of models. Initially, we faced problems with inference time and reliability on edge devices. Using PyTorch on AWS, we achieved ten times faster inference than with edge devices, even accounting for the network communication time. We are also constructing a secure database of over 500,000 pharyngeal images on Amazon S3 to continuously improve our AI model. We thank AWS and the PyTorch community for their unwavering support in constructing a high-performance, reliable, and secure PyTorch model server and database for medical use.

Wataru Takahashi, AI & Patent Engineer


SB C&S Corp.

We inherited the IT distribution business, which is where it all began for the SoftBank Group, and continue to swiftly generate new business models in response to changes in the market environment. For corporate clients, we provide product solutions by leveraging advanced technologies, including cloud computing and artificial intelligence, through the largest sales network in the country.

AIMINA is an AI platform service that enables non-specialist users to easily learn, build, and test various ML models. By running our PyTorch-based BERT (Transformer) models supported on AIMINA, used for building chatbots, on AWS Inferentia-based Amazon EC2 Inf1 instances, we immediately saw a 2x improvement in throughput and a 3x reduction in cost compared to Amazon EC2 G4dn instances. Compiling models for Inferentia using the AWS Neuron SDK was quick and easy, and we look forward to migrating our other models running on GPU-based instances to Inf1 instances in the near future.

Keisuke Watanabe, Director at SB C&S Corp.


Stability AI

By training our large-scale PyTorch-based ML models on Amazon EC2 P4d UltraClusters, leveraging Amazon EKS, AWS EFA, and Amazon FSx for Lustre, we were able to take advantage of cloud scale for distributed training of our diffusion and transformer models. The ability to scale our training using cloud services accelerated our time to train these large models from months to weeks to release them to the open source community. AWS has been a trusted partner and worked with us in lockstep to unblock technical issues and provided a seamless deployment experience. We plan to continue to train our state-of-the-art ML models on AWS.

Emad Mostaque, Founder, Stability AI


Iambic Therapeutics

Iambic Therapeutics is a technology-driven biotech startup that is disrupting the therapeutics landscape with its cutting-edge AI-driven drug discovery platform.

We execute design-make-test cycles on thousands of compounds per week, using PyTorch-based deep learning models running on AWS to generate molecular structures, perform property inferences, and fine-tune models on the resulting bioassay data. We built a cloud-scale inference pipeline using Amazon EKS, Karpenter, and KEDA to elastically scale our GPU-powered Amazon EC2 compute instances based on custom metrics. This enables us to scale both model inference and fine-tuning, so that we can rapidly explore chemical space, discover differentiated drugs, and deliver them to the clinic with unprecedented speed.

Fred Manby, CTO, Iambic Therapeutics


Sestina Bio

Sestina Bio, an Inscripta company, is committed to creating a cleaner, healthier, and more sustainable world through biomanufacturing. As a global leader in genome engineering, our innovations are designed to unlock the full potential of the bioeconomy.

We used Amazon EC2 G5 instances to run inference with a large PyTorch model for protein sequence prediction. AWS offered agility and a strong partnership that helped us implement an architecture that quickly scaled up our workflow. Using this architecture, we were able to boost performance and operate on orders of magnitude more protein sequences.

Matt Biggs, PhD, Senior Computational Biologist, Sestina Bio


Ampersand

Ampersand chose AWS Batch to run complex machine learning (ML) workloads to provide television advertisers with aggregated viewership insights and predictions for over 40 million households.

Before engaging AWS, we couldn't run a quarter of the workloads that we wanted in parallel. Using AWS Batch, we solved many of the issues we were facing, and running parallel tasks became much simpler. On AWS Batch, we scaled a cluster to support 50,000 workloads in less than 1 hour.

Jeffrey Enos, Senior Machine Learning Engineer, Ampersand


Rad AI, Inc.

Rad AI, Inc. empowers healthcare professionals with machine learning and artificial intelligence to improve the quality of patient care.

By migrating to Amazon EC2 P4d instances, we improved our real-time inference speeds by 60%.

Ali Demirci, Senior Software Engineer, Rad AI


AI21 Labs

AI21 develops large-scale language models focused on semantics and context and delivers artificial intelligence–based writing assistance through its flagship product, Wordtune.

Amazon EC2 P4d instances offer 400-Gbps high-performance networking on EFA. The GPU-to-GPU networking speed directly impacts the ability to scale efficiently and remain cost effective when scaling to hundreds of GPUs.

Opher Lieber, Technical Lead for Jurassic, AI21 Labs


Amazon Advertising

Amazon Advertising helps businesses of all sizes connect with customers at every stage of their shopping journey. Millions of ads, including text and images, are moderated, classified, and served for the optimal customer experience every day.

For our text ad processing, we deploy PyTorch-based BERT models globally on AWS Inferentia-based Inf1 instances. By moving to Inferentia from GPUs, we were able to lower our cost by 69% with comparable performance. Compiling and testing our models for AWS Inferentia took less than 3 weeks. Using Amazon SageMaker to deploy our models to Inf1 instances ensured our deployment was scalable and easy to manage. When I first analyzed the compiled models, the performance with AWS Inferentia was so impressive that I actually had to re-run the benchmarks to make sure they were correct! Going forward, we plan to migrate our image ad processing models to Inferentia. We have already benchmarked 30% lower latency and 71% cost savings over comparable GPU-based instances for these models.

Yashal Kanungo, Applied Scientist, Amazon Advertising


Autodesk

Autodesk achieved 4.9 times higher throughput than their GPU-based instances for their PyTorch-based NLP models, as well as cost reductions of up to 45 percent, by using AWS Inferentia-based Amazon EC2 Inf1 instances.


Toyota Research Institute - Advanced Development

Toyota Research Institute Advanced Development, Inc. (TRI-AD) is applying artificial intelligence to help Toyota produce cars in the future that are safer, more accessible and more environmentally friendly. Using PyTorch on Amazon EC2 P3 instances, TRI-AD reduced ML model training time from days to hours.

We continuously optimize and improve our computer vision models, which are critical to TRI-AD’s mission of achieving safe mobility for all with autonomous driving. Our models are trained with PyTorch on AWS, but until now PyTorch lacked a model serving framework. As a result, we spent significant engineering effort in creating and maintaining software for deploying PyTorch models to our fleet of vehicles and cloud servers. With TorchServe, we now have a performant and lightweight model server that is officially supported and maintained by AWS and the PyTorch community.

Yusuke Yachide, Lead of ML Tools, TRI-AD


Matroid

Matroid, maker of computer vision software that detects objects and events in video footage, develops a rapidly growing number of machine learning models using PyTorch on AWS and in on-premises environments. The models were previously deployed using a custom model server that required converting the models to a different format, which was time-consuming and burdensome. TorchServe allows Matroid to simplify model deployment using a single servable file that also serves as the single source of truth and is easy to share and manage.


Pinterest

Pinterest has 3 billion images and 18 billion associations connecting those images. The company has developed PyTorch deep learning models to contextualize these images and deliver a personalized user experience. Pinterest uses Amazon EC2 P3 instances to speed up model training and deliver low latency inference for an interactive user experience.


Hyperconnect

Hyperconnect uses AI-based image classification on its video communication app to recognize the current environment wherein a user is situated.

We reduced our ML model training time from more than a week to less than a day by migrating from on-premises workstations to multiple Amazon EC2 P3 instances using Horovod. In addition, we chose PyTorch as our machine learning framework in order to leverage the libraries available in the open source community, enabling quick iteration on model development.

