Amazon EC2 P6e UltraServers and P6 instances

The highest GPU performance for AI training and inference

Why Amazon EC2 P6e UltraServers and P6 instances?

Amazon Elastic Compute Cloud (Amazon EC2) P6e UltraServers, accelerated by NVIDIA GB200 NVL72, offer the highest GPU performance in Amazon EC2. P6e-GB200 features over 20x the compute and over 11x the memory under NVIDIA NVLink™ compared to P5en instances. These UltraServers are ideal for the most compute- and memory-intensive AI workloads, such as training and deploying frontier models at the multi-trillion-parameter scale.

Amazon EC2 P6 instances, accelerated by NVIDIA Blackwell and Blackwell Ultra GPUs, are an ideal option for medium-to-large-scale training and inference applications. P6-B200 instances offer up to 2x the performance of P5en instances for AI training and inference, while P6-B300 instances deliver still higher performance for large-scale AI training and inference. These instances are well suited for sophisticated models such as mixture of experts (MoE) and reasoning models with trillions of parameters.

P6e UltraServers and P6 instances enable faster training for next-generation AI models and improve performance for real-time inference in production. You can use P6e UltraServers and P6 instances to train frontier foundation models (FMs) such as MoE and reasoning models and deploy them in generative and agentic AI applications such as content generation, enterprise copilots, and deep research agents.

Benefits

P6e UltraServers

With P6e-GB200 UltraServers, customers can access up to 72 Blackwell GPUs within one NVLink domain to use 360 petaflops of FP8 compute (without sparsity) and 13.4 TB of total high-bandwidth memory (HBM3e). P6e-GB200 UltraServers provide up to 130 TB/s of low-latency NVLink connectivity between GPUs and up to 28.8 Tbps of total Elastic Fabric Adapter (EFAv4) networking for AI training and inference. This UltraServer architecture enables a step-change improvement in compute and memory, with up to 20x the GPU TFLOPS, 11x the GPU memory, and 15x the aggregate GPU memory bandwidth under NVLink compared to P5en.
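
These aggregates decompose cleanly into per-GPU figures. A quick sanity check in Python, using only the numbers quoted above (the per-GPU breakdown is derived arithmetic, not a published spec):

```python
# Back-of-the-envelope check of the P6e-GB200 UltraServer figures above.
gpus = 72

total_fp8_pflops = 360      # FP8 compute, without sparsity
total_hbm3e_tb = 13.4       # total high-bandwidth memory (HBM3e)
efa_per_gpu_gbps = 400      # EFAv4 networking per GPU

print(total_fp8_pflops / gpus)         # -> 5.0 PFLOPS FP8 per GPU
print(total_hbm3e_tb / gpus * 1000)    # -> ~186 GB HBM3e per GPU
print(gpus * efa_per_gpu_gbps / 1000)  # -> 28.8 Tbps aggregate EFAv4
```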

P6 instances

P6-B300 instances provide 8x NVIDIA Blackwell Ultra GPUs with 2.1 TB of high-bandwidth GPU memory, 6.4 Tbps of EFA networking, 300 Gbps of dedicated ENA throughput, and 4 TB of system memory. P6-B300 instances deliver 2x the networking bandwidth, 1.5x the GPU memory size, and 1.5x the GPU TFLOPS (at FP4, without sparsity) compared to P6-B200 instances. These improvements make P6-B300 instances well suited for large-scale ML training and inference.

P6-B200 instances provide 8x NVIDIA Blackwell GPUs with 1,440 GB of high-bandwidth GPU memory, 5th Generation Intel Xeon Scalable processors (Emerald Rapids), 2 TiB of system memory, up to 14.4 TB/s of total bidirectional NVLink bandwidth, and 30 TB of local NVMe storage. These instances feature up to 2.25x the GPU TFLOPS, 1.27x the GPU memory size, and 1.6x the GPU memory bandwidth compared to P5en instances.

 

P6e UltraServers and P6 instances are powered by the AWS Nitro System with specialized hardware and firmware designed to enforce restrictions so that no one, including anyone at AWS, can access your sensitive AI workloads and data. The Nitro System, which handles networking, storage, and other I/O functions, can deploy firmware updates, bug fixes, and optimizations while it remains operational. This increases stability and reduces downtime, which is critical to meeting training timelines and running AI applications in production.

To enable efficient distributed training, P6e UltraServers and P6 instances use fourth-generation Elastic Fabric Adapter (EFAv4) networking. EFAv4 uses the Scalable Reliable Datagram (SRD) protocol to intelligently route traffic across multiple network paths, maintaining smooth operation even during congestion or failures.
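
In practice, a distributed training job picks up EFA through the Libfabric provider without application changes. Below is a minimal PyTorch sketch, assuming a job launched with torchrun on EFA-enabled instances; the environment variables follow AWS's published EFA guidance and may already be set on DLAMI-based images:

```python
import os

import torch
import torch.distributed as dist

# Route NCCL traffic over EFA via the Libfabric "efa" provider.
# These settings follow AWS's published EFA guidance; on DLAMI-based
# images they may already be the defaults.
os.environ.setdefault("FI_PROVIDER", "efa")
os.environ.setdefault("FI_EFA_USE_DEVICE_RDMA", "1")  # GPUDirect RDMA

# Rank, world size, and rendezvous info are injected by the launcher
# (e.g., torchrun), so init_process_group can read them from the env.
dist.init_process_group(backend="nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

# A single all-reduce exercises the EFA/SRD fabric described above.
x = torch.ones(1024, device="cuda")
dist.all_reduce(x)

dist.destroy_process_group()
```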

P6e UltraServers and P6 instances are deployed in Amazon EC2 UltraClusters, which enable scaling up to tens of thousands of GPUs within a petabit-scale nonblocking network.

Features

Each NVIDIA Blackwell GPU in P6-B200 instances features a second-generation Transformer Engine and supports new precision formats such as FP4. It also supports fifth-generation NVLink, a faster, wider interconnect delivering up to 1.8 TB/s of bandwidth per GPU.

The Grace Blackwell Superchip, a key component of P6e-GB200, connects two high-performance NVIDIA Blackwell GPUs and an NVIDIA Grace CPU using the NVIDIA NVLink-C2C interconnect. Each Superchip delivers 10 petaflops of FP8 compute (without sparsity) and up to 372 GB of HBM3e. With the superchip architecture, two GPUs and one CPU are co-located within one compute module, increasing bandwidth between GPU and CPU by an order of magnitude compared to current-generation P5en instances.

The NVIDIA Blackwell Ultra GPUs powering P6-B300 instances deliver a 2x increase in network bandwidth, a 1.5x increase in GPU memory, and up to 1.5x the FP4 compute (without sparsity, in effective TFLOPS) compared to P6-B200 instances.

P6e UltraServers and P6 instances provide 400 Gbps of EFAv4 networking per GPU, for a total of 28.8 Tbps per P6e-GB200 UltraServer and 3.2 Tbps per P6-B200 instance.

P6-B300 instances offer 6.4 Tbps of networking bandwidth, 2x that of P6-B200 instances, enabled by PCIe Gen6, and are designed for large-scale distributed deep learning model training.

P6e UltraServers and P6 instances support Amazon FSx for Lustre file systems so you can access data at the hundreds of GB/s of throughput and millions of IOPS required for large-scale AI training and inference. P6e UltraServers support up to 405 TB of local NVMe SSD storage, while P6 instances support up to 30 TB of local NVMe SSD storage for fast access to large datasets. You can also use Amazon Simple Storage Service (Amazon S3) for virtually unlimited, cost-effective storage.
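
As an illustration, you might provision a persistent FSx for Lustre file system next to your training instances with boto3. This is a minimal sketch: the subnet ID is a placeholder, and the capacity and throughput values are examples to validate against the FSx documentation:

```python
import boto3

fsx = boto3.client("fsx")

# A minimal sketch of a persistent Lustre file system for training data.
# The subnet ID is a placeholder; StorageCapacity and
# PerUnitStorageThroughput must be a valid FSx for Lustre combination.
response = fsx.create_file_system(
    FileSystemType="LUSTRE",
    StorageCapacity=12000,  # GiB
    SubnetIds=["subnet-0123456789abcdef0"],
    LustreConfiguration={
        "DeploymentType": "PERSISTENT_2",
        "PerUnitStorageThroughput": 500,  # MB/s per TiB of storage
    },
    Tags=[{"Key": "workload", "Value": "p6-training"}],
)
print(response["FileSystem"]["FileSystemId"])
```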

Product Details

Instance types

| Instance Size | Blackwell GPUs | GPU memory (GB) | vCPUs | System memory (GiB) | Instance storage (TB) | Network bandwidth (Tbps) | EBS bandwidth (Gbps) | Available in EC2 UltraServers |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| p6-b300.48xlarge | 8 (Blackwell Ultra) | 2,144 HBM3e | 192 | 4,096 | 8 x 3.84 | 6.4 | 100 | No |
| p6-b200.48xlarge | 8 | 1,432 HBM3e | 192 | 2,048 | 8 x 3.84 | 3.2 | 100 | No |
| p6e-gb200.36xlarge | 4 | 740 HBM3e | 144 | 960 | 3 x 7.5 | 3.2 | 60 | Yes* |

*P6e-GB200 instances are only available in UltraServers

UltraServer types

| UltraServer Size | Blackwell GPUs | GPU memory (GB) | vCPUs | System memory (GiB) | UltraServer storage (TB) | Aggregate EFA bandwidth (Gbps) | EBS bandwidth (Gbps) | Available in EC2 UltraServers |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| u-p6e-gb200x72 | 72 | 13,320 | 2,592 | 17,280 | 405 | 28,800 | 1,080 | Yes |
| u-p6e-gb200x36 | 36 | 6,660 | 1,296 | 8,640 | 202.5 | 14,400 | 540 | Yes |

Getting started with ML use cases

Amazon SageMaker is a fully managed service for building, training, and deploying ML models. With Amazon SageMaker HyperPod, you can more easily scale to tens, hundreds, or thousands of GPUs to train a model quickly at any scale without worrying about setting up and managing resilient training clusters. (P6e-GB200 support coming soon)
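
For example, a HyperPod cluster is created through the SageMaker CreateCluster API. A hedged sketch with boto3 follows; the ml.p6-b200.48xlarge type name, IAM role ARN, and lifecycle-script S3 location are assumptions to verify against the SageMaker documentation:

```python
import boto3

sm = boto3.client("sagemaker")

# A minimal HyperPod cluster sketch. The instance type string, role ARN,
# and lifecycle-script location below are illustrative placeholders.
sm.create_cluster(
    ClusterName="p6-training-cluster",
    InstanceGroups=[
        {
            "InstanceGroupName": "gpu-workers",
            "InstanceType": "ml.p6-b200.48xlarge",  # assumed type name
            "InstanceCount": 2,
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://my-bucket/lifecycle-scripts/",
                "OnCreate": "on_create.sh",
            },
            "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodRole",
        }
    ],
)
```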

AWS Deep Learning AMIs (DLAMI) provides ML practitioners and researchers with the infrastructure and tools to accelerate DL in the cloud, at any scale. AWS Deep Learning Containers are Docker images preinstalled with DL frameworks to streamline the deployment of custom ML environments by letting you skip the complicated process of building and optimizing your environments from scratch.
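
As a sketch, launching a P6-B200 instance from a DLAMI with boto3 looks like the following; the AMI ID, key pair, and subnet are placeholders (resolve the current DLAMI ID for your Region from the DLAMI release notes or SSM Parameter Store):

```python
import boto3

ec2 = boto3.client("ec2")

# ImageId, KeyName, and SubnetId below are illustrative placeholders.
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # current DLAMI for your Region
    InstanceType="p6-b200.48xlarge",
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",
    SubnetId="subnet-0123456789abcdef0",
)
print(resp["Instances"][0]["InstanceId"])
```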

If you prefer to manage your own containerized workloads through container orchestration services, you can deploy P6e-GB200 UltraServers and P6-B200 instances with Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Container Service (Amazon ECS).
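
For instance, with an existing EKS cluster you could add a managed node group of P6-B200 instances; in this sketch the cluster name, subnet, node role, and GPU AMI type are assumptions to verify for your environment:

```python
import boto3

eks = boto3.client("eks")

# A minimal managed node group sketch; names, subnet, role, and the
# GPU-enabled amiType are placeholders to verify for your setup.
eks.create_nodegroup(
    clusterName="my-training-cluster",
    nodegroupName="p6-b200-workers",
    scalingConfig={"minSize": 0, "maxSize": 4, "desiredSize": 2},
    subnets=["subnet-0123456789abcdef0"],
    instanceTypes=["p6-b200.48xlarge"],
    amiType="AL2023_x86_64_NVIDIA",  # GPU EKS AMI type (verify)
    nodeRole="arn:aws:iam::123456789012:role/EKSNodeRole",
)
```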

P6e UltraServers will also be available through NVIDIA DGX Cloud, a fully managed environment with NVIDIA's complete AI software stack. With NVIDIA DGX Cloud, you get NVIDIA's latest optimizations, benchmarking recipes, and technical expertise.
