AWS News Blog
Now Available – EC2 Instances (G4) with NVIDIA T4 Tensor Core GPUs
The NVIDIA-powered G4 instances that I promised you earlier this year are now available, and you can start using them today in nine AWS Regions and six sizes! You can use them for machine learning training & inferencing, video transcoding, game streaming, and remote graphics workstation applications.
The instances are equipped with up to four NVIDIA T4 Tensor Core GPUs, each with 320 Turing Tensor cores, 2,560 CUDA cores, and 16 GB of memory. The T4 GPUs are ideal for machine learning inferencing, computer vision, video processing, and real-time speech & natural language processing. The T4 GPUs also offer RT cores for efficient, hardware-powered ray tracing. The NVIDIA Quadro Virtual Workstation (Quadro vWS) is available in AWS Marketplace. It supports real-time ray-traced rendering and can speed creative workflows often found in media & entertainment, architecture, and oil & gas applications.
G4 instances are powered by AWS-custom Second Generation Intel® Xeon® Scalable (Cascade Lake) processors with up to 64 vCPUs, and are built on the AWS Nitro System. Nitro's local NVMe storage building block provides direct access to up to 1.8 TB of fast, local NVMe storage. Nitro's network building block delivers high-speed ENA networking. The Intel AVX-512 Deep Learning Boost feature extends AVX-512 with a new set of Vector Neural Network Instructions (VNNI for short). These instructions accelerate the low-precision multiply & add operations found in the inner loop of many inferencing algorithms.
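To make the VNNI idea concrete: the key instruction fuses a group of unsigned-8-bit × signed-8-bit multiplies and their accumulation into a 32-bit sum into a single operation per lane. Here is a minimal pure-Python sketch of what one 32-bit lane computes (an illustration of the arithmetic, not actual intrinsics):

```python
def vnni_lane(acc, a_bytes, b_bytes):
    """Emulate one 32-bit lane of a VNNI multiply-accumulate:
    four unsigned 8-bit values times four signed 8-bit values,
    with the products added into a 32-bit accumulator."""
    assert len(a_bytes) == len(b_bytes) == 4
    for a, b in zip(a_bytes, b_bytes):
        assert 0 <= a <= 255      # unsigned 8-bit activations
        assert -128 <= b <= 127   # signed 8-bit weights
        acc += a * b
    return acc

# One lane of an INT8 dot product, the inner loop of quantized inference:
total = vnni_lane(0, [10, 20, 30, 40], [1, -2, 3, -4])  # → -100
```

In hardware this whole loop is a single instruction per lane, which is where the inference speedup for low-precision models comes from.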
Here are the instance sizes:

| Instance Name | NVIDIA T4 Tensor Core GPUs | vCPUs | RAM | Local Storage | EBS Bandwidth | Network Bandwidth |
|---|---|---|---|---|---|---|
| g4dn.xlarge | 1 | 4 | 16 GiB | 1 x 125 GB | Up to 3.5 Gbps | Up to 25 Gbps |
| g4dn.2xlarge | 1 | 8 | 32 GiB | 1 x 225 GB | Up to 3.5 Gbps | Up to 25 Gbps |
| g4dn.4xlarge | 1 | 16 | 64 GiB | 1 x 225 GB | Up to 3.5 Gbps | Up to 25 Gbps |
| g4dn.8xlarge | 1 | 32 | 128 GiB | 1 x 900 GB | 7 Gbps | 50 Gbps |
| g4dn.12xlarge | 4 | 48 | 192 GiB | 1 x 900 GB | 7 Gbps | 50 Gbps |
| g4dn.16xlarge | 1 | 64 | 256 GiB | 1 x 900 GB | 7 Gbps | 50 Gbps |
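If you are deciding between sizes, a simple rule is to take the smallest single-GPU size whose vCPU and memory figures cover your workload. A hypothetical helper (the function name is my own; the figures come from the table above):

```python
# vCPUs and RAM (GiB) for the single-GPU g4dn sizes, from the table above.
G4DN_SIZES = [
    ("g4dn.xlarge", 4, 16),
    ("g4dn.2xlarge", 8, 32),
    ("g4dn.4xlarge", 16, 64),
    ("g4dn.8xlarge", 32, 128),
    ("g4dn.16xlarge", 64, 256),
]

def smallest_g4dn(min_vcpus, min_ram_gib):
    """Return the smallest single-GPU G4 size meeting both minimums,
    or None if nothing qualifies. (Hypothetical helper, not an AWS API.)"""
    for name, vcpus, ram in G4DN_SIZES:
        if vcpus >= min_vcpus and ram >= min_ram_gib:
            return name
    return None

print(smallest_g4dn(8, 48))  # → g4dn.4xlarge
```

Note that g4dn.12xlarge is the odd one out: choose it when you need multiple GPUs on one instance rather than more vCPUs per GPU.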
We are also working on a bare metal instance that will be available in the coming months:

| Instance Name | NVIDIA T4 Tensor Core GPUs | vCPUs | RAM | Local Storage | EBS Bandwidth | Network Bandwidth |
|---|---|---|---|---|---|---|
| g4dn.metal | 8 | 96 | 384 GiB | 2 x 900 GB | 14 Gbps | 100 Gbps |
If you want to run graphics workloads on G4 instances, be sure to use the latest version of the NVIDIA AMIs (available in AWS Marketplace) so that you have access to the requisite GRID and Graphics drivers, along with an NVIDIA Quadro Workstation image that contains the latest optimizations and patches. Here’s where you can find them:
- NVIDIA Gaming – Windows Server 2016
- NVIDIA Gaming – Windows Server 2019
- NVIDIA Gaming – Ubuntu 18.04
The newest AWS Deep Learning AMIs include support for G4 instances. The team that produces the AMIs benchmarked a g3.16xlarge instance against a g4dn.12xlarge instance and shared the results with me. Here are some highlights:
- MXNet Inference (resnet50v2, forward pass without MMS) – 2.03 times faster.
- MXNet Inference (with MMS) – 1.45 times faster.
- MXNet Training (resnet50_v1b, 1 GPU) – 2.19 times faster.
- TensorFlow Inference (resnet50v1.5, forward pass) – 2.00 times faster.
- TensorFlow Inference with TensorFlow Serving (resnet50v2) – 1.72 times faster.
- TensorFlow Training (resnet50_v1.5) – 2.00 times faster.
The benchmarks used FP32 numeric precision; you can expect an even larger boost if you use mixed precision (FP16) or low precision (INT8).
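The low-precision path works by quantizing FP32 tensors to INT8 before inference. A minimal sketch of symmetric per-tensor quantization — illustrative only, not the implementation used by any particular framework:

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map FP32 values into
    [-127, 127] using a single scale factor. Returns (int8_values, scale)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 values from the INT8 representation."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)   # q = [50, -127, 2, 100]
```

The multiplies and adds then run on the INT8 values (where the T4's Tensor Cores and VNNI are fastest), and only the final results are rescaled back to floating point.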
You can launch G4 instances today in the US East (N. Virginia), US East (Ohio), US West (Oregon), US West (N. California), Europe (Frankfurt), Europe (Ireland), Europe (London), Asia Pacific (Seoul), and Asia Pacific (Tokyo) Regions, in Amazon SageMaker, and (as of October 1, 2019) Amazon EKS clusters.
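Launching a G4 instance from the SDK works like any other instance type — only the `InstanceType` value changes. A hedged boto3 sketch (the AMI ID and key pair name are placeholders; the API call is commented out so the snippet runs without AWS credentials):

```python
# Request parameters for a single g4dn.xlarge. The ImageId below is a
# placeholder — substitute a Deep Learning AMI ID for your Region.
params = {
    "ImageId": "ami-0123456789abcdef0",  # placeholder AMI ID
    "InstanceType": "g4dn.xlarge",
    "MinCount": 1,
    "MaxCount": 1,
    "KeyName": "my-key-pair",            # placeholder key pair name
}

# With credentials configured, the actual launch would be:
# import boto3
# ec2 = boto3.client("ec2", region_name="us-east-1")
# response = ec2.run_instances(**params)
```

The same parameter shape works for any of the sizes in the tables above; for g4dn.12xlarge all four GPUs appear on the single instance.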
— Jeff;