Amazon EC2 P5e instances are generally available via EC2 Capacity Blocks

Posted on: Sep 9, 2024

Today, AWS announces the general availability of Amazon Elastic Compute Cloud (Amazon EC2) P5e instances, powered by the latest NVIDIA H200 Tensor Core GPUs. Available via EC2 Capacity Blocks, these instances deliver the highest performance in Amazon EC2 for deep learning and generative AI inference.

You can use Amazon EC2 P5e instances for training and deploying increasingly complex large language models (LLMs) and diffusion models powering the most demanding generative AI applications. You can also use P5e instances to deploy demanding HPC applications at scale in pharmaceutical discovery, seismic analysis, weather forecasting, and financial modeling.

P5e instances feature 8 NVIDIA H200 GPUs, which provide 1.7x the GPU memory size and 1.5x the GPU memory bandwidth of the H100 GPUs in P5 instances. They deliver market-leading scale-out capabilities for distributed training and tightly coupled HPC workloads, with up to 3,200 Gbps of networking using second-generation Elastic Fabric Adapter (EFA) technology. To address customer needs for large scale at low latency, P5e instances are deployed in Amazon EC2 UltraClusters. A sketch of confirming these specifications programmatically follows below.
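If you want to verify an instance type's GPU and networking characteristics from your own tooling, the EC2 DescribeInstanceTypes API exposes them. The following is a minimal sketch using boto3; the region choice is illustrative, and the exact values returned for p5e.48xlarge come from the API, not from this post.

```python
# Sketch: inspecting p5e.48xlarge specs via the EC2 DescribeInstanceTypes API.
# Assumes boto3 is installed and AWS credentials are configured.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-2")  # US East (Ohio), illustrative

resp = ec2.describe_instance_types(InstanceTypes=["p5e.48xlarge"])
itype = resp["InstanceTypes"][0]

# GPU details: count, model, and per-GPU / total memory.
gpu_info = itype["GpuInfo"]
for gpu in gpu_info["Gpus"]:
    print(f'{gpu["Count"]}x {gpu["Manufacturer"]} {gpu["Name"]}, '
          f'{gpu["MemoryInfo"]["SizeInMiB"]} MiB each')
print("Total GPU memory (MiB):", gpu_info["TotalGpuMemoryInMiB"])

# Networking details: aggregate bandwidth and EFA support.
net = itype["NetworkInfo"]
print("Network performance:", net["NetworkPerformance"])
print("EFA supported:", net["EfaSupported"])
```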

P5e instances are now available in the US East (Ohio) AWS Region in the p5e.48xlarge size through EC2 Capacity Blocks for ML.
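Because P5e capacity is obtained through EC2 Capacity Blocks for ML, reserving it means finding an offering and purchasing it ahead of your start date. Below is a minimal sketch using the boto3 DescribeCapacityBlockOfferings and PurchaseCapacityBlock calls; the instance count, duration, and region are illustrative, and the printed fields follow the API response shape rather than anything specific to this announcement.

```python
# Sketch: reserving p5e.48xlarge capacity through EC2 Capacity Blocks for ML.
# The instance count and 24-hour duration below are illustrative values.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-2")  # US East (Ohio)

# Find available Capacity Block offerings for the desired shape and duration.
offerings = ec2.describe_capacity_block_offerings(
    InstanceType="p5e.48xlarge",
    InstanceCount=1,
    CapacityDurationHours=24,
)["CapacityBlockOfferings"]

if offerings:
    offering = offerings[0]
    print("Offering:", offering["CapacityBlockOfferingId"],
          offering["StartDate"], offering["UpfrontFee"])

    # Purchase the block; instances can then be launched into the resulting
    # capacity reservation once its start time arrives.
    purchase = ec2.purchase_capacity_block(
        CapacityBlockOfferingId=offering["CapacityBlockOfferingId"],
        InstancePlatform="Linux/UNIX",
    )
    print("Reservation:",
          purchase["CapacityReservation"]["CapacityReservationId"])
```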

To learn more about P5e instances, see Amazon EC2 P5e Instances.