Overview
Hugging Face Generative AI Microservices (HUGS Inference) lets you rapidly deploy and scale open-source generative AI models with zero configuration. HUGS ships inference engines optimized for leading hardware, including NVIDIA GPUs, AMD GPUs, Intel GPUs, AWS Inferentia, Habana Gaudi, and Google TPUs. Models are served through the industry-standard OpenAI API, so they plug directly into tools like LangChain and LlamaIndex, and deployments come with enterprise-grade security, control, and compliance features, including SLAs and SOC2. Focus on building applications, not on managing complex deployments.
HUGS Inference supports a wide range of popular open-source LLMs, multimodal models, and embedding models, including Meta-Llama, Mistral, Qwen, Gemma, and more. Choose between the turbo and light container variants to balance performance against resource requirements. Pre-configured microservices are tailored to your hardware, eliminating manual setup, so you can go from concept to production in minutes rather than weeks.
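Because every model is served behind the OpenAI API, any OpenAI-compatible client can talk to a HUGS deployment. The sketch below uses LangChain's ChatOpenAI wrapper; the endpoint address, API key, and model name are illustrative assumptions that will vary per deployment.

```python
# Minimal sketch: calling a HUGS endpoint through LangChain.
# Assumes `pip install langchain-openai` and a HUGS container reachable
# at the hypothetical address below.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://localhost:8080/v1",  # assumed HUGS endpoint address
    api_key="-",                          # assumed placeholder; no hosted-API key needed
    model="hugs",                         # assumed model identifier
)

print(llm.invoke("What is zero-configuration deployment?").content)
```

The same endpoint also works with the plain `openai` client or LlamaIndex's OpenAI-compatible connectors, since only the base URL differs from OpenAI's hosted service.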
Keywords: Generative AI, Inference, Microservices, Open-Source Models, LLMs, Multimodal Models, Embedding Models, Zero-Configuration, Optimized Inference, NVIDIA GPUs, AMD GPUs, Intel GPUs, AWS Inferentia, Habana Gaudi, Google TPUs, OpenAI API, LangChain, LlamaIndex, Kubernetes, Scalability, Security, Compliance, SLA, SOC2, Enterprise-Ready, Hugging Face, Text Generation Inference, Transformers, Meta-Llama, Mistral, Qwen, Gemma.
Highlights
- Optimized to run open LLMs on NVIDIA GPUs and on AWS accelerators (Inferentia and Trainium)
- Powered by Hugging Face open-source technologies such as Text Generation Inference
- Built for enterprise and standardized on the OpenAI API, enabling companies to switch from closed models to open models with a single configuration change (see the sketch below)
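To make that single configuration change concrete, the hedged sketch below repoints an existing OpenAI client from the hosted API to a HUGS endpoint; the base URL, key, and model identifier are assumptions for illustration, not documented values.

```python
from openai import OpenAI

# Hosted OpenAI:
# client = OpenAI(api_key="sk-...")
#
# Same application code against a HUGS deployment -- only the client
# configuration changes (both values below are placeholders):
client = OpenAI(
    base_url="http://localhost:8080/v1",  # assumed HUGS endpoint address
    api_key="-",                          # assumed: no hosted-API key required
)
# Every subsequent chat.completions / embeddings call stays unchanged.
```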
Details
Pricing
Free trial
| Dimension | Description | Cost/unit/hour |
| --- | --- | --- |
| Hours | Container Hours | $1.00 |
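For example, a single container running around the clock for a 30-day month accrues 24 × 30 = 720 container-hours, or $720 in software charges at the rate above, billed on top of the underlying AWS infrastructure cost.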
Vendor refund policy
If you are not satisfied with your purchase, you may request a refund by providing a detailed explanation and contacting us at api-enterprise@huggingface.co.
Delivery details
Delivery method: open LLMs for NVIDIA GPUs
Supported services:
- Amazon EKS
- Amazon ECS
- Amazon ECS Anywhere
- Amazon EKS Anywhere
Container image
Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.
Version release notes
Initial release. Adds support for the HUGS container for NVIDIA GPUs, serving models including Meta Llama, Google Gemma, Mistral, Qwen, and more.
Additional details
Usage instructions
Instructions for deploying the container on Amazon EKS with our Helm chart: https://github.com/huggingface/hugs-helm-chart/blob/0.0.1/aws/README.md
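Once the chart is installed, a quick smoke test can confirm the service is answering. The sketch below is assumption-heavy: it presumes the service has been port-forwarded to localhost:8080 (for example with kubectl port-forward) and that the container exposes a /health route, as Text Generation Inference does.

```python
# Hypothetical post-deployment smoke test for a HUGS container.
# Assumes the service was exposed locally first, e.g.:
#   kubectl port-forward svc/<your-hugs-service> 8080:80
# The /health route is an assumption carried over from Text Generation Inference.
import requests

resp = requests.get("http://localhost:8080/health", timeout=10)
print("service healthy" if resp.ok else f"unexpected status: {resp.status_code}")
```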
Support
Vendor support
Please contact api-enterprise@huggingface.co for support.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.