AWS Machine Learning Blog
Category: Amazon EC2
Enable pod-based GPU metrics in Amazon CloudWatch
This post details how to set up container-based GPU metrics and provides an example of collecting these metrics from EKS pods.
Maximize Stable Diffusion performance and lower inference costs with AWS Inferentia2
Generative AI models have been experiencing rapid growth in recent months due to its impressive capabilities in creating realistic text, images, code, and audio. Among these models, Stable Diffusion models stand out for their unique strength in creating high-quality images based on text prompts. Stable Diffusion can generate a wide variety of high-quality images, including […]
Accelerate PyTorch with DeepSpeed to train large language models with Intel Habana Gaudi-based DL1 EC2 instances
Training large language models (LLMs) with billions of parameters can be challenging. In addition to designing the model architecture, researchers need to set up state-of-the-art training techniques for distributed training like mixed precision support, gradient accumulation, and checkpointing. With large models, the training setup is even more challenging because the available memory in a single […]
Optimized PyTorch 2.0 inference with AWS Graviton processors
New generations of CPUs offer a significant performance improvement in machine learning (ML) inference due to specialized built-in instructions. Combined with their flexibility, high speed of development, and low operating cost, these general-purpose processors offer an alternative to other existing hardware solutions. AWS, Arm, Meta and others helped optimize the performance of PyTorch 2.0 inference […]
Modulate makes voice chat safer while reducing infrastructure costs by a factor of 5 with Amazon EC2 G5g instances
This is a guest post by Carter Huffman, CTO and Co-founder at Modulate. Modulate is a Boston-based startup on a mission to build richer, safer, more inclusive online gaming experiences for everyone. We’re a team of world-class audio experts, gamers, allies, and futurists who are eager to build a better online world and make voice […]
Achieve four times higher ML inference throughput at three times lower cost per inference with Amazon EC2 G5 instances for NLP and CV PyTorch models
Amazon Elastic Compute Cloud (Amazon EC2) G5 instances are the first and only instances in the cloud to feature NVIDIA A10G Tensor Core GPUs, which you can use for a wide range of graphics-intensive and machine learning (ML) use cases. With G5 instances, ML customers get high performance and a cost-efficient infrastructure to train and […]
How Amazon Search reduced ML inference costs by 85% with AWS Inferentia
Amazon’s product search engine indexes billions of products, serves hundreds of millions of customers worldwide, and is one of the most heavily used services in the world. The Amazon Search team develops machine learning (ML) technology that powers the Amazon.com search engine and helps customers search effortlessly. To deliver a great customer experience and operate […]
Run AlphaFold v2.0 on Amazon EC2
After the article in Nature about the open-source of AlphaFold v2.0 on GitHub by DeepMind, many in the scientific and research community have wanted to try out DeepMind’s AlphaFold implementation firsthand. With compute resources through Amazon Elastic Compute Cloud (Amazon EC2) with Nvidia GPU, you can quickly get AlphaFold running and try it out yourself. […]
Achieve 12x higher throughput and lowest latency for PyTorch Natural Language Processing applications out-of-the-box on AWS Inferentia
AWS customers like Snap, Alexa, and Autodesk have been using AWS Inferentia to achieve the highest performance and lowest cost on a wide variety of machine learning (ML) deployments. Natural language processing (NLP) models are growing in popularity for real-time and offline batched use cases. Our customers deploy these models in many applications like support […]
AWS and NVIDIA to bring Arm-based Graviton2 instances with GPUs to the cloud
AWS continues to innovate on behalf of our customers. We’re working with NVIDIA to bring an Arm processor-based, NVIDIA GPU accelerated Amazon Elastic Compute Cloud (Amazon EC2) instance to the cloud in the second half of 2021. This instance will feature the Arm-based AWS Graviton2 processor, which was built from the ground up by AWS […]