HUGS (Hugging Face Generative AI Services)

Hugging Face Generative AI Services (HUGS) are optimized, zero-configuration inference microservices designed to simplify and accelerate the development of AI applications with open models. HUGS is built on open-source Hugging Face technologies such as Text Generation Inference and Transformers.
    Overview

Hugging Face Generative AI Services (HUGS) empowers you to rapidly deploy and scale open-source generative AI models with zero configuration. Leveraging optimized inference engines for leading hardware such as NVIDIA GPUs, AMD GPUs, Intel GPUs, AWS Inferentia, Habana Gaudi, and Google TPUs, HUGS delivers high performance out of the box. Seamlessly integrate models via the industry-standard OpenAI API, simplifying development with tools like LangChain and LlamaIndex. Focus on building cutting-edge applications, not complex deployments. Benefit from enterprise-grade security, control, and compliance features, including SLAs and SOC2.

    HUGS Inference supports a diverse range of popular open-source LLMs, multimodal, and embedding models, including Meta-Llama, Mistral, Qwen, Gemma, and more. Choose between optimized container versions (turbo and light) to balance performance and resource requirements. Deploy pre-configured microservices tailored to your hardware, eliminating manual setup and maximizing efficiency. With HUGS, go from concept to production in minutes, not weeks.
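Because HUGS exposes the industry-standard OpenAI API, a chat request to a deployed instance follows the familiar chat-completions shape. A minimal sketch is below; the base URL, port, and the `"tgi"` model alias are assumptions about a typical deployment, not values taken from this listing:

```python
import json

# Hypothetical HUGS endpoint -- adjust host/port to match your deployment.
HUGS_BASE_URL = "http://localhost:8080/v1"

# HUGS serves the OpenAI-compatible chat completions API, so the request
# body uses the standard chat format. "tgi" is a model alias commonly
# accepted by Text Generation Inference-based servers; your deployment may
# expect the full model id instead.
payload = {
    "model": "tgi",
    "messages": [{"role": "user", "content": "Summarize what HUGS does."}],
    "max_tokens": 128,
    "stream": False,
}

# To actually send the request you could use the official OpenAI client:
#   from openai import OpenAI
#   client = OpenAI(base_url=HUGS_BASE_URL, api_key="-")
#   resp = client.chat.completions.create(**payload)
print(json.dumps(payload, indent=2))
```

Because the request shape is standard, client libraries such as LangChain and LlamaIndex that already speak the OpenAI API can target a HUGS deployment without code changes.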

    Keywords: Generative AI, Inference, Microservices, Open-Source Models, LLMs, Multimodal Models, Embedding Models, Zero-Configuration, Optimized Inference, NVIDIA GPUs, AMD GPUs, Intel GPUs, AWS Inferentia, Habana Gaudi, Google TPUs, OpenAI API, LangChain, LlamaIndex, Kubernetes, Scalability, Security, Compliance, SLA, SOC2, Enterprise-Ready, Hugging Face, Text Generation Inference, Transformers, Meta-Llama, Mistral, Qwen, Gemma.

    Highlights

    • Optimized to run open LLMs on NVIDIA GPUs and AWS accelerators (Inferentia and Trainium)
    • Powered by Hugging Face open-source technologies, such as Text Generation Inference
    • Built for enterprise and standardized on the OpenAI API, enabling companies to switch from closed models to open models with a single configuration change
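The "single configuration change" mentioned above amounts to pointing an OpenAI-compatible client at a different base URL. A sketch of the idea, with illustrative placeholder URLs (the HUGS service host is an assumption, not a value from this listing):

```python
def client_config(base_url: str, api_key: str) -> dict:
    """Minimal configuration for an OpenAI-compatible endpoint."""
    return {"base_url": base_url, "api_key": api_key}

# Closed, hosted model behind the OpenAI API:
closed = client_config("https://api.openai.com/v1", "sk-...")

# Self-hosted open model served by HUGS (hypothetical endpoint):
open_model = client_config("http://my-hugs-service:8080/v1", "unused")

# Everything else -- request shape, response parsing, tooling -- stays
# the same; only the endpoint the client talks to changes.
print(closed["base_url"], "->", open_model["base_url"])
```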

    Details

    Delivery method

    Delivery option
    open LLMs for NVIDIA GPUs

    Latest version

    Operating system
    Linux

    Pricing

    Free trial

    Try this product at no cost for 5 days according to the free trial terms set by the vendor.

    Pricing is based on actual usage, with charges varying according to how much you consume. Subscriptions have no end date and may be canceled any time.

    Usage costs (1)

    Dimension | Description     | Cost/unit/hour
    Hours     | Container Hours | $1.00
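A quick worked example of the usage-based pricing under the $1.00 per container-hour dimension (the container count and runtime are illustrative; note that the underlying AWS compute you run the containers on is billed separately by AWS):

```python
# Illustrative cost estimate under the $1.00 per container-hour dimension.
RATE_PER_CONTAINER_HOUR = 1.00  # USD, from the usage costs table

containers = 2       # number of HUGS containers running (example value)
hours = 24 * 30      # one month of continuous operation

# Marketplace software cost only; EC2/EKS infrastructure is billed separately.
marketplace_cost = containers * hours * RATE_PER_CONTAINER_HOUR
print(f"${marketplace_cost:.2f}")  # prints "$1440.00"
```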

    Vendor refund policy

    If you are not satisfied with your purchase, you may request a refund by providing a detailed explanation and contacting us at api-enterprise@huggingface.co.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information


    Delivery details

    open LLMs for NVIDIA GPUs

    Supported services:
    • Amazon EKS
    • Amazon ECS
    • Amazon ECS Anywhere
    • Amazon EKS Anywhere
    Container image

    Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.

    Version release notes

    Initial Release. Supporting HUGS Container for NVIDIA GPUs, including Meta Llama, Google Gemma, Mistral, Qwen and more.

    Additional details

    Usage instructions

    Instructions for deploying the container on Amazon EKS with our Helm chart are available at https://github.com/huggingface/hugs-helm-chart/blob/0.0.1/aws/README.md

    Support

    Vendor support

    Please contact api-enterprise@huggingface.co for support.

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.


    Customer reviews

    No customer reviews yet. Be the first to write a review for this product.