Hugging Face Infinity - CPU

By: Hugging Face
Latest Version: v0.4.0

This version has been removed and is no longer available to new customers.

Product Overview

Hugging Face Infinity is a containerized solution that lets customers deploy end-to-end optimized inference pipelines for state-of-the-art Transformer models on any infrastructure. An Infinity Container is designed to serve one Model and one Task. A Task corresponds to a machine learning task as defined in the Transformers Pipelines documentation.

The Infinity Container is a hardware-optimized inference solution. Each container is built to run on a specific Target Hardware architecture and exposes an HTTP API for running inference. Because each container is optimized for its dedicated Target Hardware, not every container runs on every platform. Always run an Infinity Container on compatible Target Hardware. The image tag of each Infinity Container names its Target Hardware.
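As a rough illustration of calling a container's HTTP API from a client, the sketch below builds and sends a JSON POST request using only the Python standard library. The listing does not document the API, so the `/predict` route, the `{"inputs": ...}` payload shape, and the base URL are all assumptions made for this example. Check the documentation shipped with your container for the real route and schema.

```python
import json
import urllib.request


def build_inference_request(base_url: str, inputs: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one inference call."""
    # ASSUMPTION: "/predict" and the {"inputs": ...} body are hypothetical;
    # the product listing does not specify the route or the payload schema.
    url = base_url.rstrip("/") + "/predict"
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_inference(base_url: str, inputs: str, timeout: float = 10.0):
    """Send the request to a running container and decode the JSON reply."""
    request = build_inference_request(base_url, inputs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a container listening locally, a call might look like `run_inference("http://localhost:8080", "I love this product")`, where the host and port are again placeholders. Because a container serves one Model and one Task, the client needs only a single route and does not have to select a model per request.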



Delivery Methods

  • Container
