Posted On: Jan 18, 2019
Amazon Elastic Inference is a service that lets you attach accelerators to any Amazon SageMaker or Amazon EC2 instance type to speed up deep learning inference workloads. Elastic Inference accelerators provide the low-latency, high-throughput benefits of GPU acceleration at a much lower cost (savings of up to 75%). You can use Elastic Inference to deploy TensorFlow, Apache MXNet, and ONNX models for inference.
Amazon Elastic Inference now supports TensorFlow 1.12. This release provides EIPredictor, a new, easy-to-use Python API for deploying TensorFlow models with Amazon Elastic Inference accelerators. EIPredictor makes experimentation straightforward and lets you compare performance with and without Amazon Elastic Inference. To learn more about running TensorFlow models using Amazon Elastic Inference, see this blog post.
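As a rough illustration, the sketch below shows how a TensorFlow SavedModel might be served through EIPredictor. The import path, constructor arguments, model directory, and tensor names are assumptions drawn from AWS's Elastic Inference-enabled TensorFlow build and the linked blog post, and may differ across releases.

```python
# A minimal sketch of EIPredictor usage; the import path and argument
# names are assumptions and may vary by Elastic Inference release.
import numpy as np
from ei_for_tf.python.predictor.ei_predictor import EIPredictor

# Hypothetical path to an exported TensorFlow SavedModel.
saved_model_dir = '/tmp/my_saved_model/1'

# use_ei=True runs inference on the attached Elastic Inference
# accelerator; use_ei=False falls back to local execution, making
# a with/without performance comparison a one-line change.
eia_predictor = EIPredictor(model_dir=saved_model_dir, use_ei=True)

# Run inference; the input key and shape below are placeholders and
# should match the model's serving signature.
batch = {'inputs': np.random.rand(1, 224, 224, 3).astype(np.float32)}
predictions = eia_predictor(batch)
print(predictions)
```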
To learn more about Amazon Elastic Inference, visit the product web page and the user guide in the documentation.