Amazon EC2 P3 Instances
FASTER MACHINE LEARNING TRAINING
EC2 P3 instances can cut model training times to a few hours or minutes, enabling data scientists to iterate faster, train more models, and build a competitive edge into their applications.
SOLVE BIG PROBLEMS QUICKLY
Gain new insights and increase the speed of research by running high performance computing in the cloud. Scale out your infrastructure with the flexibility to change resources easily and as often as your workload demands.
INTEGRATION WITH AWS MACHINE LEARNING SERVICES
Amazon SageMaker, a fully managed machine learning platform for quickly building, training, and deploying machine learning models, and the AWS Deep Learning Amazon Machine Images (AMIs) make it easier to get started with training and inference on EC2 P3 instances.
LOW COST AND GLOBAL AVAILABILITY
EC2 P3 instances can be purchased as On-Demand Instances, Reserved Instances, Spot Instances, and Dedicated Hosts. They are available in 18 Availability Zones across 8 AWS Regions, so customers have the flexibility to train and deploy their models wherever their data is stored.
Amazon EC2 P3 Instances and SageMaker
The Fastest Way to Train and Run Machine Learning Models
SageMaker is a fully managed service for building, training, and deploying machine learning models. Used together with EC2 P3 instances, it lets customers scale to tens, hundreds, or thousands of GPUs to train a model quickly, without having to set up clusters and data pipelines.
Amazon SageMaker makes it easy to build machine learning models and get them ready for training by providing everything you need to quickly connect to your training data, and to select and optimize the best algorithm and framework for your application. Amazon SageMaker includes hosted Jupyter notebooks that make it easy to explore and visualize your training data stored in Amazon S3.
You can begin training your model with a single click in the console or with a simple API call. Amazon SageMaker is pre-configured with the latest versions of TensorFlow and Apache MXNet, with CUDA 9 library support for maximum performance with NVIDIA GPUs. In addition, hyperparameter optimization (HPO) can automatically tune your model by intelligently adjusting different combinations of model parameters to quickly arrive at the most accurate predictions the model is capable of producing.
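The API call mentioned above can be sketched with the AWS SDK for Python (boto3), whose SageMaker client provides `create_training_job`. This is a minimal sketch under stated assumptions, not a definitive setup: the job name, container image URI, IAM role ARN, and S3 paths are placeholders, and the helper names `build_training_request` and `submit` are hypothetical. It requests one `ml.p3.2xlarge` instance, the SageMaker name for the smallest P3 size.

```python
def build_training_request(job_name, image_uri, role_arn, train_s3_uri,
                           output_s3_uri, instance_type="ml.p3.2xlarge",
                           instance_count=1):
    """Assemble keyword arguments for the SageMaker CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # training container image
            "TrainingInputMode": "File",     # copy data to the instance first
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 50,            # assumed size, adjust to the dataset
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def submit(request):
    """Send the request; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the request can be built without it
    return boto3.client("sagemaker").create_training_job(**request)


# All values below are placeholders, not real resources.
request = build_training_request(
    job_name="p3-example-job",
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/<image>:<tag>",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
)
print(request["ResourceConfig"])
```

Building the request separately from submitting it lets you inspect or log the configuration before any billable resources are started. Raising `instance_count` or choosing `ml.p3.8xlarge` or `ml.p3.16xlarge` is how a job would be scaled to more GPUs.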
After training, you can one-click deploy your model onto auto-scaling EC2 instances across multiple availability zones. Once in production, SageMaker manages the compute infrastructure on your behalf to perform health checks, apply security patches, and conduct other routine maintenance, all with built-in Amazon CloudWatch monitoring and logging.
Amazon EC2 P3 Instances and Deep Learning AMIs
Pre-configured development environments to quickly start building deep learning applications
For developers with more customized requirements, the AWS Deep Learning AMIs are an alternative to Amazon SageMaker. They provide machine learning practitioners and researchers with the infrastructure and tools to accelerate deep learning in the cloud, at any scale. You can quickly launch Amazon EC2 P3 instances pre-installed with popular deep learning frameworks such as TensorFlow, PyTorch, Apache MXNet, Microsoft Cognitive Toolkit, Caffe, Caffe2, Theano, Torch, Chainer, Gluon, and Keras to train sophisticated, custom AI models, experiment with new algorithms, or learn new skills and techniques.
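Launching a P3 instance from a Deep Learning AMI can be sketched with the boto3 EC2 client's `run_instances` call. This is a sketch under stated assumptions: the AMI ID and key pair name are placeholders (the actual Deep Learning AMI ID differs per Region and release), and `build_launch_request` and `launch` are hypothetical helper names.

```python
P3_SIZES = ("p3.2xlarge", "p3.8xlarge", "p3.16xlarge")


def build_launch_request(ami_id, key_name, instance_type="p3.2xlarge"):
    """Assemble keyword arguments for the EC2 RunInstances API."""
    if instance_type not in P3_SIZES:
        raise ValueError(f"{instance_type!r} is not a P3 instance size")
    return {
        "ImageId": ami_id,           # Deep Learning AMI ID for your Region
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "KeyName": key_name,         # existing EC2 key pair for SSH access
    }


def launch(request, region="us-east-1"):
    """Start the instance; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the request can be built without it
    return boto3.client("ec2", region_name=region).run_instances(**request)


# Placeholder values, not real resources.
request = build_launch_request("ami-xxxxxxxxxxxxxxxxx", "example-key-pair")
print(request)
```

Validating the instance size before the call catches typos locally instead of through an API error.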
| Instance Size | GPUs (Tesla V100) | GPU Peer to Peer | GPU Memory (GB) | vCPUs | Memory (GB) | Network Bandwidth | EBS Bandwidth | On-Demand Price/hr* | 1-yr Reserved Instance Effective Hourly* | 3-yr Reserved Instance Effective Hourly* |
|---|---|---|---|---|---|---|---|---|---|---|
| p3.2xlarge | 1 | N/A | 16 | 8 | 61 | Up to 10 Gbps | 1.5 Gbps | | | |
| p3.8xlarge | 4 | NVLink | 64 | 32 | 244 | 10 Gbps | 7 Gbps | | | |
| p3.16xlarge | 8 | NVLink | 128 | 64 | 488 | 25 Gbps | 14 Gbps | | | |
*Prices shown are for Linux/Unix in US East (Northern Virginia) AWS Region. For full pricing details, see the Amazon EC2 pricing page.
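The specifications in the table can be captured as data to help choose a size. The sketch below, using the hypothetical names `P3Size` and `smallest_fit`, returns the smallest instance that meets a GPU count and total GPU memory requirement. Prices are left out because they are not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class P3Size:
    name: str
    gpus: int
    gpu_memory_gb: int   # total across all GPUs (16 GB per V100)
    vcpus: int
    memory_gb: int


# Values copied from the table, ordered smallest to largest.
P3_TABLE = [
    P3Size("p3.2xlarge", 1, 16, 8, 61),
    P3Size("p3.8xlarge", 4, 64, 32, 244),
    P3Size("p3.16xlarge", 8, 128, 64, 488),
]


def smallest_fit(min_gpus=1, min_gpu_memory_gb=0):
    """Return the smallest P3 size meeting both requirements, or None."""
    for size in P3_TABLE:
        if size.gpus >= min_gpus and size.gpu_memory_gb >= min_gpu_memory_gb:
            return size
    return None


choice = smallest_fit(min_gpu_memory_gb=40)
print(choice.name if choice else "no single P3 instance fits")
```

Note that GPU memory is a per-instance total spread across separate 16 GB devices, so a workload needing 40 GB must be split across GPUs; it does not see one 64 GB pool.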
P3 instances are available in AWS US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), Asia Pacific (Seoul), Asia Pacific (Tokyo), AWS GovCloud (US) and China (Beijing) Regions. Customers can purchase P3 instances as On-Demand Instances, Reserved Instances, Spot Instances, and Dedicated Hosts.
Powerful NVIDIA GPUs
Amazon EC2 P3 instances are powered by up to 8 of the latest-generation NVIDIA Tesla V100 GPUs. Based on NVIDIA’s latest Volta architecture, each Tesla V100 GPU provides 125 TFLOPS of mixed-precision performance, 15.7 TFLOPS of single precision (FP32) performance and 7.8 TFLOPS of double precision (FP64) performance.
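As a rough worked example, the per-GPU figures above can be multiplied by the GPU count of each instance size. This is a sketch of theoretical peak throughput only: it assumes perfect linear scaling, which real workloads rarely reach, and the names `V100_PEAK_TFLOPS` and `peak_tflops` are illustrative.

```python
# Per-GPU peak figures quoted above for the Tesla V100.
V100_PEAK_TFLOPS = {"mixed": 125.0, "fp32": 15.7, "fp64": 7.8}

# GPU counts per P3 instance size.
GPUS_PER_SIZE = {"p3.2xlarge": 1, "p3.8xlarge": 4, "p3.16xlarge": 8}


def peak_tflops(instance_size):
    """Theoretical aggregate peak TFLOPS per precision for one instance."""
    gpus = GPUS_PER_SIZE[instance_size]
    return {precision: gpus * tflops
            for precision, tflops in V100_PEAK_TFLOPS.items()}


for size in GPUS_PER_SIZE:
    print(size, peak_tflops(size))
```

Under this assumption, a p3.16xlarge with 8 GPUs has a theoretical mixed-precision peak of 8 × 125 = 1,000 TFLOPS, or one petaFLOPS.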
NVLINK GPU-to-GPU Communication
Systems requiring multiple GPUs are becoming common in a variety of industries as developers rely on more parallelism in applications. A single NVIDIA Tesla® V100 GPU supports up to six NVLink GPU-to-GPU connections with a total bandwidth of 300 GB/sec, 10X the bandwidth of PCIe Gen 3.
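The bandwidth figures above imply a per-link rate and a PCIe baseline, which the short calculation below makes explicit. It is back-of-the-envelope arithmetic using only the quoted numbers: these are peak aggregate figures, the model ignores latency and protocol overhead, and `transfer_seconds` is a hypothetical helper.

```python
NVLINK_TOTAL_GB_S = 300.0   # total NVLink bandwidth per V100, as quoted
NVLINK_LINKS = 6            # NVLink connections per V100, as quoted
PCIE_RATIO = 10.0           # "10X the bandwidth of PCIe Gen 3"

per_link_gb_s = NVLINK_TOTAL_GB_S / NVLINK_LINKS      # bandwidth per link
implied_pcie_gb_s = NVLINK_TOTAL_GB_S / PCIE_RATIO    # implied PCIe figure


def transfer_seconds(size_gb, bandwidth_gb_s):
    """Idealized time to move size_gb at a sustained bandwidth."""
    return size_gb / bandwidth_gb_s


# Moving 16 GB, the memory of one V100, at each peak rate.
print(f"per link: {per_link_gb_s} GB/s, implied PCIe: {implied_pcie_gb_s} GB/s")
print(f"NVLink: {transfer_seconds(16, NVLINK_TOTAL_GB_S):.3f} s")
print(f"PCIe:   {transfer_seconds(16, implied_pcie_gb_s):.3f} s")
```

Because the 300 GB/sec total is spread over six links, a transfer between one pair of GPUs uses only the links that connect them, so it will see a fraction of the total.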
Support for All Major Machine Learning Frameworks
Amazon EC2 P3 instances support all major machine learning frameworks including TensorFlow, PyTorch, Apache MXNet, Caffe, Caffe2, Microsoft Cognitive Toolkit (CNTK), Chainer, Theano, and Keras.
Airbnb’s community marketplace provides access to millions of unique accommodations and local experiences in more than 65,000 cities and 191 countries. David Benson at Airbnb said,
“At Airbnb, we’re using machine learning to optimize search recommendations and improve dynamic pricing guidance for hosts, both of which translate to increased booking conversions. These use-cases are highly specific to our industry and require machine learning models that use several different types of data sources, such as guest and host preferences, listing location and condition, seasonality, and price.
With Amazon EC2 P3 instances, we have the ability to run training workloads faster, enabling us to iterate more, build better machine learning models and reduce cost.”
Western Digital is an industry-leading provider of storage technologies and solutions that enable people to create, leverage, experience and preserve data. David Hinz, Senior Director Cloud and Data Center Operations, said,
"Our engineering and product development teams use high performance computing to run tens of thousands of simulations for all areas needed to deliver new hard disk drive (HDD) and solid state storage solutions. The simulations include materials sciences, heat flows, magnetics and data transfer simulations to improve disk drive and storage solution performance and quality.
Based upon early testing, the new P3 instances can allow engineering teams to run GPU-accelerated modeling and simulations at least three times faster than currently deployed GPU solutions. We are looking forward to using the P3 instances in production as a cost-effective and performant way to provide HPC solutions to our engineering teams.”
Salesforce is a cloud-based customer relationship management (CRM) software solution for sales, service, marketing, collaboration, analytics, and building custom apps and mobile apps.
"With Salesforce Einstein Vision, developers of all skill levels can harness the power of image recognition by training their own deep learning models to enrich sales leads, automate service case resolution and optimize marketing campaigns. With Amazon EC2 P3 instances we will have access to the latest GPU technology, enabling us to train our deep learning models much faster so we can continue to maintain—and raise—the high bar we set for customer success."
Schrödinger’s mission is to improve human health and quality of life by developing advanced computational methods that transform the way scientists design therapeutics and materials. Robert Abel, Senior Vice President of Science at Schrödinger, said,
“Our industry has a pressing need for performant, accurate, and predictive models to extend the scale of discovery and optimization, complementing and going beyond the traditional experimental approach.
Amazon EC2 P3 instances with their high performance GPUs allow us to perform four times as many simulations in a day as we could with P2 instances. This performance increase, coupled with the ability to quickly scale in response to new compound ideas, gives our customers the ability to bring lifesaving drugs to market more quickly.”
Get started with P3 instances for Machine Learning
To get started within minutes, learn more about Amazon SageMaker or use the AWS Deep Learning AMI, pre-installed with popular deep learning frameworks such as Caffe2 and MXNet. Alternatively, you can use the NVIDIA AMI with GPU driver and CUDA toolkit pre-installed.
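Once an instance is running from one of these AMIs, a quick check is to confirm that the GPUs are visible to the driver. The sketch below assumes the `nvidia-smi` utility that ships with the NVIDIA driver is on the PATH and returns an empty list if it is not. The helper names `parse_gpu_lines` and `list_gpus` are hypothetical.

```python
import shutil
import subprocess

# Ask nvidia-smi for one CSV line per GPU: name and total memory.
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader"]


def parse_gpu_lines(text):
    """Turn 'name, memory' CSV lines into a list of (name, memory) tuples."""
    gpus = []
    for line in text.splitlines():
        if "," not in line:
            continue  # skip blank or unexpected lines
        name, memory = (field.strip() for field in line.split(",", 1))
        gpus.append((name, memory))
    return gpus


def list_gpus():
    """Return visible GPUs, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    return parse_gpu_lines(result.stdout)


gpus = list_gpus()
print(f"{len(gpus)} GPU(s) visible")
for name, memory in gpus:
    print(name, memory)
```

On a p3.8xlarge, for example, the list would be expected to contain four entries, one per Tesla V100.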