CUSTOMER STORY

How Hugging Face is helping companies embrace open models

by AWS Editorial Team | 21 Feb 2025 | Thought Leadership

Overview

Open-source foundation models (FMs) have advanced at breakneck speed over the past year and a half, quickly catching up with their closed counterparts. Engineers now have over a million freely available models at their fingertips, many of which perform on par with the best closed models. Once the domain of individual developers, open models are now being adopted by enterprises, including Fortune 500 companies.

This public library of models is prized by the community for the ability it offers to control costs, use transparent data sets, and access specialist models. But while anyone can freely use open-source models, the challenges of getting to production have held back their potential. Even for an experienced machine learning (ML) engineer, the process takes at least a week of hard work and involves complex decisions around graphics processing units (GPUs), backends, and deployment.

On a mission to make AI available to everyone, Hugging Face, a leading open-source platform, is breaking down these barriers. As Jeff Boudier, Head of Product at Hugging Face, says, “Our goal is to enable every company in the world to build their own AI.” With the recent launch of Hugging Face Generative AI Services (also affectionately known as HUGS), the company is tackling the time-consuming and tricky task of deploying an open model to production.


Open models, straight out of the box

Hugging Face’s big vision when it first began was to “allow anyone to have a fun conversation with a machine learning model,” as Boudier says. While this may have been a bold ambition back in 2016, the business is realizing that vision today by making the deployment of cutting-edge technologies accessible to both individuals and companies.

Previously, businesses built proofs of concept (POCs) using closed models, not because closed models were their first choice, but because they offered the quickest and easiest route. Developing an AI application with open models typically involves a lot of trial and error as engineers work out everything from configuration to compilation. To meet performance and compliance requirements, they must adjust libraries, versions, and parameters.

With HUGS, organizations can bypass the headaches of developing an AI application with open models. The plug-and-play solution is changing the game for those looking to seize the generative AI advantage: no configuration is needed, so teams can simply take an open model and run with it. What used to take weeks now takes a matter of minutes, as models are automatically optimized for the target GPU or AI accelerator.
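
To give a sense of how little setup is involved, here is a minimal sketch of querying a HUGS deployment. HUGS containers expose an OpenAI-compatible API, so the standard openai Python client works as-is; the endpoint URL and model id below are hypothetical placeholders for a real deployment.

```python
# Minimal sketch: querying a running HUGS endpoint through its
# OpenAI-compatible API. The base_url and model id are hypothetical
# placeholders for an actual deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # placeholder HUGS endpoint
    api_key="unused",  # HUGS endpoints don't require a real key by default
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example open model id
    messages=[{"role": "user", "content": "What are open foundation models?"}],
    max_tokens=128,
)
print(completion.choices[0].message.content)
```

Because the interface matches the OpenAI API, applications built against closed-model endpoints can often switch to an open model by changing only the base URL and model id.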

High performance without budget cuts

Throughout Hugging Face’s journey toward democratizing AI, its collaboration with Amazon Web Services (AWS) has helped it scale from an early-stage startup to a front-runner in the space, with AI models used by millions of people every month. As these models continue to make headway and businesses increasingly pursue their benefits, HUGS offers access to a hand-picked, manually benchmarked collection of the latest and highest-performing open large language models (LLMs).

Hugging Face’s most recent collaboration with AWS means that businesses no longer need to trade off cost, performance, and speed of deployment. Now that the solution is available on AWS Inferentia2 AI chips, developers can further optimize model performance for lower latency and higher throughput while saving up to 40 percent on inference costs. And that is not the only way the two companies are making generative AI applications more accessible to companies of all sizes: by working together on the open-source Optimum Neuron library, they let businesses get the benefits of HUGS while keeping overheads minimal.
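
As an illustration of what the Optimum Neuron library handles, the sketch below exports an open model for Inferentia2 and runs generation with it. The model id and compilation settings here are illustrative choices, not a benchmarked configuration.

```python
# Hedged sketch of the Optimum Neuron workflow: compile an open model
# ahead of time for Inferentia2's Neuron cores, then generate text.
# Model id and compilation parameters are illustrative.
from optimum.neuron import NeuronModelForCausalLM
from transformers import AutoTokenizer

model_id = "HuggingFaceTB/SmolLM2-1.7B-Instruct"  # example open model

# export=True triggers ahead-of-time compilation for the Neuron cores
model = NeuronModelForCausalLM.from_pretrained(
    model_id,
    export=True,
    batch_size=1,
    sequence_length=2048,
    num_cores=2,            # an inf2.xlarge instance has 2 Neuron cores
    auto_cast_type="bf16",  # bfloat16 weights for throughput
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Open models let teams", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Compiling once up front is the design trade-off here: the export step takes time, but the resulting artifact runs with low latency on the accelerator afterward.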

Powering commercial impact

From building virtual assistants to creating captivating content in seconds, Hugging Face’s models cover a wealth of use cases. While these models perform well against academic benchmarks, Boudier says that even greater value comes from customization: “What matters for your use case is different. With fine-tuning and reinforcement learning, you can improve upon open models and make them so much better than closed models.”

With AWS Inferentia2 on Amazon SageMaker, developers can customize Hugging Face models to improve quality on specific tasks and run production workloads at scale. The solution also makes it easier for developers to get immediate gains from techniques such as prompt engineering and retrieval-augmented generation (RAG).
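
For teams working in SageMaker, deploying an open model to an Inferentia2-backed endpoint can look roughly like the sketch below, which uses the Hugging Face Text Generation Inference container for Neuron. The model id, environment settings, and instance size are illustrative placeholders; larger models need larger Inferentia2 instances.

```python
# Hedged sketch: deploying an open model to an Inferentia2-backed
# SageMaker endpoint. Assumes this runs inside SageMaker with an
# execution role configured; model id and instance size are illustrative.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # SageMaker execution role

model = HuggingFaceModel(
    role=role,
    # Hugging Face TGI container built for AWS Neuron (Inferentia2)
    image_uri=get_huggingface_llm_image_uri("huggingface-neuronx"),
    env={
        "HF_MODEL_ID": "HuggingFaceTB/SmolLM2-1.7B-Instruct",  # example model
        "HF_NUM_CORES": "2",
        "MAX_INPUT_TOKENS": "2048",
        "MAX_TOTAL_TOKENS": "4096",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.inf2.xlarge",  # smallest Inferentia2 instance
)
print(predictor.predict({"inputs": "What is retrieval-augmented generation?"}))
```

From here, the same endpoint can serve a fine-tuned variant of the model by pointing HF_MODEL_ID at the customized weights.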

Large enterprises like Thomson Reuters are already securely and effectively scaling open models on AWS. Now, with HUGS and AWS Inferentia2, they have optimized hardware to build generative AI applications quickly and confidently, and to reap the value faster. With the solution available on AWS Marketplace and integrated with AWS infrastructure, developers can easily find, subscribe to, and deploy open models on their own terms.

As Hugging Face works to make open models viable for everyone, it is expanding the LLMs available through HUGS to keep customers’ applications at the cutting edge. By offering more open-source options and simplifying their use, the company gives businesses genuine freedom to choose between open and closed models.