Posted On: Jul 26, 2023

Today, AWS announces the general availability of Amazon Elastic Compute Cloud (Amazon EC2) P5 instances, powered by the latest NVIDIA H100 Tensor Core GPUs. These instances deliver the highest performance in Amazon EC2 for deep learning and high performance computing (HPC) applications. They help you accelerate your time to solution by up to 6x and lower the cost to train ML models by up to 40% compared to previous-generation GPU-based instances.

You can use Amazon EC2 P5 instances for training and deploying increasingly complex large language models (LLMs) and diffusion models powering the most demanding generative AI applications. This includes question answering, code generation, video and image generation, speech recognition, and more. You can also use P5 instances to deploy demanding HPC applications at scale in pharmaceutical discovery, seismic analysis, weather forecasting, and financial modeling.

To deliver these performance improvements and cost savings, P5 instances pair NVIDIA H100 Tensor Core GPUs with 2x higher CPU performance, 2x higher system memory, and 4x higher local storage as compared to previous-generation GPU-based instances. They provide market-leading scale-out capabilities for distributed training and tightly coupled HPC workloads with up to 3,200 Gbps of networking using second-generation Elastic Fabric Adapter (EFA) technology. To address customer needs for large scale at low latency, P5 instances are deployed in Amazon EC2 UltraClusters. These provide petabit-scale nonblocking interconnect across up to 20,000 H100 GPUs, delivering up to 20 exaflops of aggregate compute capability.
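These published specifications can be checked programmatically. The following is a minimal sketch using boto3's describe_instance_types API to print the GPU, vCPU, memory, and network details of p5.48xlarge; it assumes AWS credentials and a default Region are already configured and that P5 instances are offered in that Region.

```python
# Minimal sketch: query p5.48xlarge specifications with boto3.
# Assumes AWS credentials and a default Region are configured,
# and that P5 instances are offered in that Region.
import boto3

ec2 = boto3.client("ec2")

resp = ec2.describe_instance_types(InstanceTypes=["p5.48xlarge"])
info = resp["InstanceTypes"][0]

for gpu in info["GpuInfo"]["Gpus"]:
    print(f"GPU: {gpu['Count']}x {gpu['Manufacturer']} {gpu['Name']}")
print(f"vCPUs: {info['VCpuInfo']['DefaultVCpus']}")
print(f"Memory: {info['MemoryInfo']['SizeInMiB'] // 1024} GiB")
print(f"Network: {info['NetworkInfo']['NetworkPerformance']}")
print(f"EFA supported: {info['NetworkInfo']['EfaSupported']}")
```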

P5 instances are now available in the US East (N. Virginia) and US West (Oregon) AWS Regions in the p5.48xlarge size.
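As an illustration of launching a single P5 instance in one of these Regions, the sketch below calls boto3's run_instances with placeholder values; the AMI ID, key pair, and subnet are hypothetical and would need to be replaced, and sufficient P5 capacity or quota in your account is assumed.

```python
# Minimal sketch: launch one p5.48xlarge instance with boto3.
# The AMI ID, key pair, and subnet below are placeholders (hypothetical);
# replace them with real values and ensure your account has P5 quota.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # or "us-west-2"

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",      # placeholder, e.g. a Deep Learning AMI
    InstanceType="p5.48xlarge",
    KeyName="my-key-pair",                # placeholder key pair name
    SubnetId="subnet-0123456789abcdef0",  # placeholder subnet
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```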

To learn more about P5 instances, see Amazon EC2 P5 Instances.