Amazon EC2 P3 Instances
REDUCE MACHINE LEARNING TRAINING TIME FROM DAYS TO MINUTES
For data scientists, researchers, and developers who need to speed up ML applications, Amazon EC2 P3 instances provide the most powerful GPU compute available in the cloud. Amazon EC2 P3 instances feature up to eight latest-generation NVIDIA Tesla V100 GPUs and deliver up to 1 petaflop of mixed-precision performance to significantly accelerate ML workloads. Faster model training enables data scientists and machine learning engineers to iterate faster, train more models, and increase accuracy.
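The 1 petaflop figure can be checked with simple arithmetic. The sketch below assumes NVIDIA's published peak of 125 teraflops of mixed-precision (Tensor Core) throughput per Tesla V100. That per-GPU number is not stated on this page, and sustained throughput in real training jobs is typically lower than the theoretical peak.

```python
# Assumption: 125 TFLOPS peak mixed-precision throughput per V100,
# taken from NVIDIA's published specifications, not from this page.
V100_PEAK_MIXED_TFLOPS = 125
GPUS_PER_INSTANCE = 8  # largest P3 size (p3.16xlarge)


def peak_petaflops(gpus: int, tflops_per_gpu: float) -> float:
    """Aggregate theoretical peak in petaflops (1 PFLOPS = 1000 TFLOPS)."""
    return gpus * tflops_per_gpu / 1000


print(peak_petaflops(GPUS_PER_INSTANCE, V100_PEAK_MIXED_TFLOPS))  # 1.0
```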
THE INDUSTRY'S MOST COST-EFFECTIVE SOLUTION
Amazon EC2 P3 instances offer different pricing plans to give you cost savings depending on your needs. In addition to On-Demand Instances, where you pay for the instances that you launch, you can purchase Reserved Instances, which are always available, at a significant discount for a term of one to three years. You can also use Spot Instances, which take advantage of unused EC2 capacity and can lower your Amazon EC2 costs significantly.
FLEXIBLE, POWERFUL HIGH PERFORMANCE COMPUTING
Unlike on-premises systems, running high performance computing on Amazon EC2 P3 instances offers virtually unlimited capacity to scale out your infrastructure and the flexibility to change resources easily and as often as your workload demands. You can configure your resources to meet the demands of your application, and launch an HPC Cluster in minutes, paying for only what you use.
INTEGRATION WITH AWS MACHINE LEARNING SERVICES
Amazon EC2 P3 instances work seamlessly with Amazon SageMaker to provide a powerful and intuitive complete machine learning platform. Amazon SageMaker is a fully-managed machine learning platform that enables you to quickly and easily build, train, and deploy machine learning models. Furthermore, Amazon EC2 P3 instances can be integrated with AWS Deep Learning Amazon Machine Images (AMIs) that are pre-installed with popular deep learning frameworks to make it easier to get started with training and inference.
SUPPORT FOR ALL MAJOR MACHINE LEARNING FRAMEWORKS
Amazon EC2 P3 instances support all major machine learning frameworks including TensorFlow, PyTorch, Apache MXNet, Caffe, Caffe2, Microsoft Cognitive Toolkit (CNTK), Chainer, Theano, Keras, Gluon, and Torch. Users can choose the framework that works best for their application.
Scalable Multi-Node Machine Learning Training
Customers can use multiple EC2 P3 instances to rapidly train machine learning models. A storage cluster and a compute cluster can be configured so that the storage cluster stores the training and validation datasets and is responsible for passing data to the compute cluster, while the compute cluster performs the forward passes, back propagation, and weight updates.
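The division of labor described above can be sketched in a few lines of plain Python. This is a toy illustration, not the mechanism any AWS service uses: the "storage" side shards a dataset, each simulated compute worker runs a forward pass and backpropagation on its shard, and the averaged gradient drives the weight update. The model (`y_hat = w * x`), the data, and all function names are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y)


def shard(data: List[Sample], workers: int) -> List[List[Sample]]:
    """Storage side: split the dataset into one shard per compute worker."""
    return [data[i::workers] for i in range(workers)]


def local_gradient(w: float, batch: List[Sample]) -> float:
    """Compute side: forward pass and backpropagation for y_hat = w * x
    under mean squared error, on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def train(data: List[Sample], workers: int = 4, lr: float = 0.01,
          steps: int = 200) -> float:
    shards = shard(data, workers)
    w = 0.0
    for _ in range(steps):
        # On a real cluster each worker computes its gradient in parallel.
        grads = [local_gradient(w, s) for s in shards]
        # Average the gradients, then apply one weight update. With
        # equal-sized shards this equals the full-batch gradient.
        w -= lr * sum(grads) / len(grads)
    return w


data = [(float(x), 3.0 * x) for x in range(1, 9)]  # toy data, true weight 3.0
w = train(data)
print(abs(w - 3.0) < 1e-6)  # True
```

In practice the averaging step is an all-reduce across GPUs and nodes, handled by the deep learning framework rather than written by hand.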
Airbnb is using machine learning to optimize search recommendations and improve dynamic pricing guidance for hosts, both of which translate to increased booking conversions. With Amazon EC2 P3 instances, Airbnb has the ability to run training workloads faster, go through more iterations, build better machine learning models and reduce cost.
Salesforce is using machine learning to power Einstein Vision, enabling developers to harness the power of image recognition for use cases such as visual search, brand detection, and product identification. Amazon EC2 P3 instances enable developers to train deep learning models much faster so that they can achieve their machine learning goals quickly.
Western Digital uses High Performance Computing (HPC) to run tens of thousands of simulations for materials sciences, heat flows, magnetics, and data transfer to improve disk drive and storage solution performance and quality. Based on early testing, Amazon EC2 P3 instances allow engineering teams to run simulations at least three times faster than previously deployed solutions.
Schrodinger uses high performance computing (HPC) to develop predictive models to extend the scale of discovery and optimization and give their customers the ability to bring lifesaving drugs to market more quickly. Amazon EC2 P3 instances allow Schrodinger to perform four times as many simulations in a day as they could with P2 instances.
Amazon EC2 P3 Instances and Amazon SageMaker
The Fastest Way to Train and Run Machine Learning Models
Amazon SageMaker is a fully-managed service for building, training, and deploying machine learning models. When used together with Amazon EC2 P3 instances, customers can easily scale to tens, hundreds, or thousands of GPUs to train a model quickly at any scale without worrying about setting up clusters and data pipelines. You can also easily access Amazon Virtual Private Cloud (VPC) resources for training and hosting workflows in Amazon SageMaker. With this feature, you can use Amazon Simple Storage Service (S3) buckets that are only accessible through your VPC to store training data, as well as to store and host the model artifacts derived from the training process. In addition to S3, models can access all other AWS resources contained within the VPC.
Amazon SageMaker makes it easy to build machine learning models and get them ready for training by providing everything you need to quickly connect to your training data, and to select and optimize the best algorithm and framework for your application. Amazon SageMaker includes hosted Jupyter notebooks that make it easy to explore and visualize your training data stored in Amazon S3. You can also use the notebook instance to write code to create model training jobs, deploy models to Amazon SageMaker hosting, and test or validate your models.
You can begin training your model with a single click in the console or with a simple API call. Amazon SageMaker is pre-configured with the latest versions of TensorFlow and Apache MXNet, with CUDA 9 library support for optimal performance with NVIDIA GPUs. In addition, hyperparameter optimization can automatically tune your model by intelligently adjusting different combinations of model parameters to quickly arrive at the most accurate predictions. For larger-scale needs, you can scale to tens of instances to support faster model building.
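As a rough illustration of the API route, the sketch below builds the arguments for the SageMaker CreateTrainingJob operation, targeting a P3-backed training instance (`ml.p3.2xlarge`). The job name, container image URI, IAM role ARN, and S3 locations are hypothetical placeholders, and the volume size and runtime limit are arbitrary choices. The block only constructs the request; the commented lines show how it might be submitted with boto3, which requires AWS credentials.

```python
def build_training_job_request(job_name: str, image_uri: str, role_arn: str,
                               train_s3_uri: str, output_s3_uri: str,
                               instance_count: int = 1) -> dict:
    """Build keyword arguments for the SageMaker CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.p3.2xlarge",  # one Tesla V100 GPU
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 50,             # arbitrary example value
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# All identifiers below are hypothetical placeholders.
request = build_training_job_request(
    job_name="example-p3-training-job",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-image:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
)

# To submit (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("sagemaker").create_training_job(**request)
```

Raising `instance_count` is how a job like this would be spread across several P3 instances, provided the training container supports distributed training.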
After training, you can one-click deploy your model onto auto-scaling EC2 instances across multiple availability zones. Once in production, Amazon SageMaker manages the compute infrastructure on your behalf to perform health checks, apply security patches, and conduct other routine maintenance, all with built-in Amazon CloudWatch monitoring and logging.
Amazon EC2 P3 Instances and AWS Deep Learning AMIs
Pre-configured development environments to quickly start building deep learning applications
For developers who have more customized requirements, the AWS Deep Learning AMIs are an alternative to Amazon SageMaker. They provide machine learning practitioners and researchers with the infrastructure and tools to accelerate deep learning in the cloud, at any scale. You can quickly launch Amazon EC2 P3 instances pre-installed with popular deep learning frameworks such as TensorFlow, PyTorch, Apache MXNet, Microsoft Cognitive Toolkit, Caffe, Caffe2, Theano, Torch, Chainer, Gluon, and Keras to train sophisticated, custom AI models, experiment with new algorithms, or learn new skills and techniques.
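Launching a P3 instance from a Deep Learning AMI might look roughly like the following sketch, which builds the arguments for the EC2 RunInstances operation. The AMI ID and key pair name are hypothetical placeholders; a real Deep Learning AMI ID depends on the Region and AMI version. The block does not contact AWS, and the commented lines show how the request could be sent with boto3.

```python
def build_launch_request(ami_id: str, key_name: str, count: int = 1) -> dict:
    """Keyword arguments for the EC2 RunInstances API (boto3: run_instances)."""
    return {
        "ImageId": ami_id,             # a Deep Learning AMI ID for your Region
        "InstanceType": "p3.2xlarge",  # one Tesla V100 GPU
        "MinCount": count,
        "MaxCount": count,
        "KeyName": key_name,           # existing EC2 key pair for SSH access
    }


# Hypothetical placeholder values.
launch_request = build_launch_request("ami-0123456789abcdef0", "example-key-pair")

# To launch (requires boto3, AWS credentials, and a real AMI ID):
#   import boto3
#   boto3.client("ec2").run_instances(**launch_request)
```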
Amazon EC2 P3 Instances and High Performance Computing
Solve large computational problems and gain new insights using the power of HPC on AWS
Amazon EC2 P3 instances are an ideal platform to run engineering simulations, computational finance, seismic analysis, molecular modeling, genomics, rendering, and other GPU compute workloads. High Performance Computing (HPC) allows scientists and engineers to solve these complex, compute-intensive problems. HPC applications often require high network performance, fast storage, large amounts of memory, very high compute capabilities, or all of these. AWS enables you to increase the speed of research and reduce time-to-results by running HPC in the cloud and scaling to larger numbers of parallel tasks than would be practical in most on-premises environments. AWS helps to reduce costs by providing solutions optimized for specific applications, and without the need for large capital investments.
Amazon EC2 P3 Instance Product Details
| Instance Size | GPUs - Tesla V100 | GPU Peer to Peer | GPU Memory (GB) | vCPUs | Memory (GB) | Network Bandwidth | EBS Bandwidth | On-Demand Price/hr* | 1-yr Reserved Instance Effective Hourly* | 3-yr Reserved Instance Effective Hourly* |
|---|---|---|---|---|---|---|---|---|---|---|
| p3.2xlarge | 1 | N/A | 16 | 8 | 61 | Up to 10 Gbps | 1.5 Gbps | | | |
| p3.8xlarge | 4 | NVLink | 64 | 32 | 244 | 10 Gbps | 7 Gbps | | | |
| p3.16xlarge | 8 | NVLink | 128 | 64 | 488 | 25 Gbps | 14 Gbps | | | |
*Prices shown are for Linux/Unix in US East (Northern Virginia) AWS Region. For full pricing details, see the Amazon EC2 pricing page.
P3 instances are available in AWS US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), Asia Pacific (Seoul), Asia Pacific (Tokyo), AWS GovCloud (US) and China (Beijing) Regions. Customers can purchase P3 instances as On-Demand Instances, Reserved Instances, Spot Instances, and Dedicated Hosts.
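The specifications in the product details table lend themselves to a simple lookup. The sketch below encodes the three sizes and picks the smallest one whose total GPU memory meets a requirement. The helper name and selection rule are illustrative assumptions. Note that total GPU memory is split across GPUs at 16 GB each, so a workload needing more than 16 GB must be able to spread across multiple GPUs.

```python
from typing import NamedTuple, Optional


class P3Size(NamedTuple):
    name: str
    gpus: int
    gpu_memory_gb: int  # total across all GPUs
    vcpus: int
    memory_gb: int


# Values copied from the product details table, smallest to largest.
P3_SIZES = [
    P3Size("p3.2xlarge", 1, 16, 8, 61),
    P3Size("p3.8xlarge", 4, 64, 32, 244),
    P3Size("p3.16xlarge", 8, 128, 64, 488),
]


def smallest_size_for(gpu_memory_gb_needed: int) -> Optional[P3Size]:
    """Return the smallest P3 size with enough total GPU memory, or None."""
    for size in P3_SIZES:
        if size.gpu_memory_gb >= gpu_memory_gb_needed:
            return size
    return None


print(smallest_size_for(40).name)  # p3.8xlarge
```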
BILLING BY THE SECOND
One of the many advantages of cloud computing is the elastic nature of provisioning or deprovisioning resources as you need them. By billing usage down to the second, we enable customers to increase their elasticity, save money, and optimize their allocation of resources toward achieving their machine learning goals.
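To see why per-second billing matters, consider a job that runs slightly longer than an hour. The sketch below compares per-second billing with whole-hour rounding. The hourly rate is a hypothetical figure chosen for easy arithmetic, and the 60-second minimum charge is an assumption based on EC2's general per-second billing rules rather than anything stated on this page.

```python
import math

HOURLY_RATE_USD = 3.00  # hypothetical rate, for illustration only


def cost_per_second_billing(runtime_s: int, hourly_rate: float) -> float:
    """Cost when usage is billed by the second."""
    billed_s = max(runtime_s, 60)  # assumed 60-second minimum charge
    return billed_s * hourly_rate / 3600


def cost_hourly_billing(runtime_s: int, hourly_rate: float) -> float:
    """Cost when every started hour is billed in full."""
    return math.ceil(runtime_s / 3600) * hourly_rate


runtime = 65 * 60  # a 65-minute training job, in seconds
print(round(cost_per_second_billing(runtime, HOURLY_RATE_USD), 2))  # 3.25
print(round(cost_hourly_billing(runtime, HOURLY_RATE_USD), 2))      # 6.0
```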
RESERVED INSTANCE PRICING
Reserved Instances provide you with a significant discount (up to 75%) compared to On-Demand instance pricing. In addition, when Reserved Instances are assigned to a specific Availability Zone, they provide a capacity reservation, giving you additional confidence in your ability to launch instances when you need them.
SPOT PRICING
With Spot instances, you pay the Spot price that's in effect for the time period your instances are running. Spot instance prices are set by Amazon EC2 and adjust gradually based on long-term trends in supply and demand for Spot instance capacity. Spot instances are available at a discount of up to 90% off compared to On-Demand pricing.
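The discount ceilings quoted above translate into hourly rates as follows. The On-Demand rate here is a hypothetical round number. The 75% and 90% figures are the "up to" maximums from the text, so the results are best-case lower bounds, not expected prices.

```python
ON_DEMAND_HOURLY_USD = 10.00    # hypothetical; see the EC2 pricing page for real rates
MAX_RESERVED_DISCOUNT_PCT = 75  # "up to 75%" from the text
MAX_SPOT_DISCOUNT_PCT = 90      # "up to 90%" from the text


def discounted_rate(hourly_rate: float, discount_pct: int) -> float:
    """Hourly rate after applying a percentage discount."""
    return hourly_rate * (100 - discount_pct) / 100


reserved_floor = discounted_rate(ON_DEMAND_HOURLY_USD, MAX_RESERVED_DISCOUNT_PCT)
spot_floor = discounted_rate(ON_DEMAND_HOURLY_USD, MAX_SPOT_DISCOUNT_PCT)

print(f"Reserved, best case: ${reserved_floor:.2f}/hr")  # $2.50/hr
print(f"Spot, best case:     ${spot_floor:.2f}/hr")      # $1.00/hr
```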
The Broadest Global Availability
Amazon EC2 P3 instances are available in 8 AWS Regions across 18 Availability Zones (AZs), so customers have the flexibility to train and deploy their models wherever their data is stored. Available Regions for EC2 P3 are US East (N. Virginia), US West (Oregon), US East (Ohio), EU (Ireland), Asia Pacific (Tokyo), China (Beijing), Asia Pacific (Seoul), and AWS GovCloud (US).
Get started with Amazon EC2 P3 instances for Machine Learning
To get started within minutes, learn more about Amazon SageMaker or use the AWS Deep Learning AMI, pre-installed with popular deep learning frameworks such as Caffe2 and Apache MXNet. Alternatively, you can use the NVIDIA AMI with GPU driver and CUDA toolkit pre-installed.