Overview
With Hugging Face on AWS, you can access, evaluate, customize, and deploy hundreds of publicly available foundation models (FMs) through Amazon SageMaker on NVIDIA GPUs, as well as on the purpose-built AI chips AWS Trainium and AWS Inferentia, in a matter of clicks. These easy-to-use flows, which are supported on the most popular FMs in the Hugging Face model hub, allow you to further optimize the performance of your models for your specific use cases while significantly lowering costs. Code snippets for SageMaker are available on every model page on the model hub under the Train and Deploy dropdown menus.
Behind the scenes, these experiences are built on top of the Hugging Face AWS Deep Learning Containers (DLCs), which provide a fully managed experience for building, training, and deploying state-of-the-art FMs using Amazon SageMaker. These DLCs remove the need to package dependencies and optimize your ML workload for the targeted hardware. For example, AWS and Hugging Face collaborate on the open-source Optimum Neuron library, which is packaged in the DLCs built for AWS AI chips, to deliver price-performance benefits with minimal overhead.
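As a rough illustration of what the Deploy snippet on a model page boils down to, the sketch below builds the configuration that the Hugging Face inference DLCs read (the HF_MODEL_ID and HF_TASK environment variables), with a placeholder model id and instance type; the actual sagemaker SDK calls, which require an AWS account and execution role, are shown as comments. All concrete values here are illustrative assumptions, not recommendations.

```python
# Sketch: configuration for deploying a Hub model to a SageMaker endpoint.
# The HF_MODEL_ID / HF_TASK environment variables are the convention the
# Hugging Face inference DLCs read; model id and instance type are assumptions.

def hf_endpoint_config(model_id, task, instance_type="ml.g5.2xlarge"):
    """Build the environment and instance settings for a Hugging Face endpoint."""
    return {
        "env": {"HF_MODEL_ID": model_id, "HF_TASK": task},
        "instance_type": instance_type,
        "initial_instance_count": 1,
    }

cfg = hf_endpoint_config("meta-llama/Meta-Llama-3-8B-Instruct", "text-generation")

# With the sagemaker SDK installed and an execution role ARN available:
# from sagemaker.huggingface import HuggingFaceModel
# model = HuggingFaceModel(env=cfg["env"], role=role,
#                          transformers_version="4.37",
#                          pytorch_version="2.1", py_version="py310")
# predictor = model.deploy(initial_instance_count=cfg["initial_instance_count"],
#                          instance_type=cfg["instance_type"])
```

The same env-var convention works for other tasks (for example, HF_TASK="summarization" or "feature-extraction"); the version strings passed to HuggingFaceModel select which DLC image is used.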
Use cases
Content summarization
Produce concise summaries of articles, blog posts, and documents to identify the most important information, highlight key takeaways, and more quickly distill information. Hugging Face provides a variety of models for content summarization, including Meta Llama 3.
Chat support or virtual assistants
Streamline customer self-service processes and reduce operational costs by automating responses for customer service queries through generative AI-powered chat support and virtual assistants. Hugging Face provides models that can be used for chat support or virtual assistants, including instruction-tuned Meta Llama 3 and Falcon 2 models.
Content generation
Create personalized, engaging, and high-quality content, such as short stories, essays, blogs, social media posts, images, and web page copy. Hugging Face provides models for content generation, including Mistral.
Code generation
Accelerate application development with code suggestions. Hugging Face provides models that can be used for code generation, including StarCoder.
Document vectorization
By vectorizing documents with embedding models, you unlock powerful capabilities for information retrieval, question answering, semantic search, contextual recommendations, and document clustering. These applications enhance the way users interact with information, making it easier to discover, explore, and leverage relevant knowledge from large document collections.
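The retrieval step described above reduces to comparing a query vector against document vectors, typically by cosine similarity. The sketch below uses tiny hand-made 3-dimensional vectors as stand-ins for real embedding-model output (which would have hundreds of dimensions); the document names and values are illustrative only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" standing in for real model output.
doc_vectors = {
    "returns_policy": [0.9, 0.1, 0.0],
    "shipping_times": [0.1, 0.9, 0.1],
    "press_release":  [0.0, 0.1, 0.9],
}

# Pretend this is the embedding of the query "how do I return an item?"
query_vector = [0.8, 0.2, 0.1]

# Semantic search: return the document whose vector is closest to the query.
best = max(doc_vectors, key=lambda name: cosine(query_vector, doc_vectors[name]))
# → "returns_policy"
```

In practice the vectors would come from an embedding model hosted on a SageMaker endpoint, and the brute-force max would be replaced by a vector index for large collections; the similarity computation itself is unchanged.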