Amazon EC2 G6e Instances
Most cost-efficient GPU-based instances for AI inference and spatial computing workloads
Why Amazon EC2 G6e Instances?
Amazon EC2 G6e instances, powered by NVIDIA L40S Tensor Core GPUs, are the most cost-efficient GPU instances for deploying generative AI models and the highest-performance GPU instances for spatial computing workloads. They offer 2x the GPU memory (48 GB per GPU) and 2.9x faster GPU memory bandwidth compared to G6 instances, and deliver up to 2.5x better performance than G5 instances.
Customers can use G6e instances to deploy large language models (LLMs) with up to 13B parameters and diffusion models for generating images, video, and audio. Additionally, G6e instances unlock customers’ ability to create larger, more immersive 3D simulations and digital twins for spatial computing workloads using NVIDIA Omniverse.
G6e instances feature up to 8 NVIDIA L40S Tensor Core GPUs with 384 GB of total GPU memory (48 GB of memory per GPU) and third-generation AMD EPYC processors. They also support up to 192 vCPUs, up to 400 Gbps of network bandwidth, up to 1.536 TB of system memory, and up to 7.6 TB of local NVMe SSD storage.
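The memory headroom for inference can be estimated with simple arithmetic: FP16 or BF16 weights take two bytes per parameter, so a 13B-parameter model occupies roughly 26 GB and fits on a single 48 GB L40S with room left for the KV cache and activations. A minimal sketch of this sizing math (the two-bytes-per-parameter figure assumes FP16/BF16 weights; real deployments also budget memory for the KV cache and runtime overhead):

```python
def weights_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate model weight footprint in GB (FP16/BF16 = 2 bytes/param)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

GPU_MEMORY_GB = 48            # per L40S GPU on G6e
INSTANCE_MEMORY_GB = 8 * 48   # g6e.48xlarge total: 384 GB

model_13b = weights_gb(13)    # ~26 GB of weights
print(f"13B model in FP16: {model_13b:.0f} GB of weights; "
      f"{'fits' if model_13b < GPU_MEMORY_GB else 'does not fit'} on one 48 GB GPU")
```

The same arithmetic shows why quantization stretches the limit further: at 1 byte per parameter (INT8/FP8), the full 384 GB of a g6e.48xlarge can hold far larger models sharded across its eight GPUs.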
Benefits
High performance and cost-efficiency for AI inference
G6e instances offer up to 1.2x the GPU memory of P4d instances, enabling customers to deploy LLMs and diffusion models while saving up to 20% in cost. Powered by L40S GPUs featuring fourth-generation Tensor Cores, they are a highly performant and cost-efficient solution for customers who want to use NVIDIA libraries such as TensorRT, CUDA, and cuDNN to run their ML applications.
Cost-efficient training for moderately complex AI models
G6e instances offer the same networking bandwidth (400 Gbps) and up to 1.2x the GPU memory of P4d instances, making them well suited for cost-efficient single-node fine-tuning or training of smaller models.
Highest performance for spatial computing workloads
With 384 GB of total GPU memory paired with NVIDIA’s third-generation ray tracing cores, G6e instances deliver the highest performance for spatial computing workloads. Their 400 Gbps of networking bandwidth improves real-time performance for spatial computing applications fed with real-world inputs. These instances unlock new opportunities for customers to create more expansive 3D digital twins for workloads like factory planning, robot simulation, and network optimization.
Maximized resource efficiency
G6e instances are built on the AWS Nitro System, a combination of dedicated hardware and a lightweight hypervisor that delivers practically all of the compute and memory resources of the host hardware to your instances for better overall performance and security. With G6e instances, the Nitro System provisions the GPUs in pass-through mode, providing performance comparable to bare metal.
Features
NVIDIA L40S Tensor Core GPU
G6e instances feature NVIDIA L40S Tensor Core GPUs that combine powerful AI compute with best-in-class graphics and media acceleration. Each instance features up to 8 L40S Tensor Core GPUs that come with 48 GB of memory per GPU, fourth-generation NVIDIA Tensor Cores, third-generation NVIDIA RT cores, and DLSS 3.0 technology.
NVIDIA drivers and libraries
G6e instances offer NVIDIA RTX Enterprise and gaming drivers to customers at no additional cost. NVIDIA RTX Enterprise drivers can be used to provide high-quality virtual workstations for a wide range of graphics-intensive workloads. NVIDIA gaming drivers provide unparalleled graphics and compute support for game development. G6e instances also support the CUDA, cuDNN, NVENC, TensorRT, cuBLAS, and OpenCL libraries, as well as the DirectX 11/12, Vulkan 1.3, and OpenGL 4.6 graphics APIs.
High performance networking and storage
G6e instances come with up to 400 Gbps of networking throughput, enabling them to support the low-latency needs of machine learning inference and graphics-intensive applications. 48 GB of memory per GPU, up to 1.536 TB of system memory, and up to 7.6 TB of local NVMe SSD storage enable local storage of large models and datasets for high-performance machine learning training and inference. G6e instances can also store large video files locally, resulting in increased graphics performance and the ability to render larger, more complex video files.
Built on AWS Nitro System
G6e instances are built on the AWS Nitro System, which is a rich collection of building blocks that offloads many of the traditional virtualization functions to dedicated hardware and software to deliver high performance, high availability, and high security while also reducing virtualization overhead.
Product details
| Instance Size | GPUs | GPU Memory (GB) | vCPUs | Memory (GiB) | Storage (GB) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| g6e.xlarge | 1 | 48 | 4 | 32 | 250 | Up to 20 | Up to 5 |
| g6e.2xlarge | 1 | 48 | 8 | 64 | 450 | Up to 20 | Up to 5 |
| g6e.4xlarge | 1 | 48 | 16 | 128 | 600 | 20 | 8 |
| g6e.8xlarge | 1 | 48 | 32 | 256 | 900 | 25 | 16 |
| g6e.16xlarge | 1 | 48 | 64 | 512 | 1900 | 35 | 20 |
| g6e.12xlarge | 4 | 192 | 48 | 384 | 3800 | 100 | 20 |
| g6e.24xlarge | 4 | 192 | 96 | 768 | 3800 | 200 | 30 |
| g6e.48xlarge | 8 | 384 | 192 | 1536 | 7600 | 400 | 60 |
Customer and Partner testimonials
Here is an example of how customers and partners have achieved their business goals with Amazon EC2 G6e instances.
Leonardo.AI
Leonardo.AI offers a production suite for content creation that leverages generative AI technologies.
"Leonardo.Ai has over 20 million users and generates over 4.5 million new images daily. Since launching, users have generated more than 2 billion images and trained more than 400 thousand custom generative AI models on our platform. The Amazon EC2 G6e instances are ~25% more affordable than existing P4d instances for comparable performance for image generation inference. The higher performance enables us to unlock creativity and accelerate content production which in turn provides a better quality experience to our growing user base."
—Peter Runham, CTO, Leonardo.AI

Getting started with G6e instances
Using DLAMI or Deep Learning Containers
AWS Deep Learning AMIs (DLAMIs) provide ML practitioners and researchers with the infrastructure and tools to accelerate deep learning in the cloud, at any scale. AWS Deep Learning Containers are Docker images preinstalled with deep learning frameworks that streamline the deployment of custom ML environments by letting you skip the complicated process of building and optimizing your environments from scratch.
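As a sketch of what launching a G6e instance with a DLAMI looks like in code, the snippet below builds the parameter dict for boto3's EC2 `run_instances` call. The AMI ID is a placeholder (look up the current DLAMI ID for your Region before launching), and the 200 GB volume size and device name are illustrative assumptions, not requirements:

```python
def g6e_launch_params(ami_id: str, instance_type: str = "g6e.xlarge") -> dict:
    """Build a parameter dict for a boto3 EC2 run_instances call (sketch only)."""
    return {
        "ImageId": ami_id,              # a DLAMI ID for your Region (placeholder)
        "InstanceType": instance_type,  # any G6e size, e.g. g6e.xlarge
        "MinCount": 1,
        "MaxCount": 1,
        # Extra gp3 root-volume space for model weights and datasets; the
        # device name varies by AMI, so check your AMI's root device first.
        "BlockDeviceMappings": [
            {"DeviceName": "/dev/xvda",
             "Ebs": {"VolumeSize": 200, "VolumeType": "gp3"}}
        ],
    }

# To actually launch (requires AWS credentials and a real AMI ID):
# import boto3
# boto3.client("ec2").run_instances(**g6e_launch_params("ami-xxxxxxxxxxxxxxxxx"))
```

Keeping the parameters in a plain dict like this makes it easy to swap in a larger instance size (for example, `g6e.12xlarge` for a 4-GPU deployment) without changing the launch code.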
Using Amazon EKS or Amazon ECS
If you prefer to manage your own containerized workloads through container orchestration services, you can deploy G6e instances with Amazon EKS or Amazon ECS.
Using AMIs for graphics workloads
You can use various Amazon Machine Images (AMIs) offered by AWS and NVIDIA that come with the NVIDIA drivers preinstalled.