AWS Machine Learning Blog

Category: Advanced (300)

Efficient and cost-effective multi-tenant LoRA serving with Amazon SageMaker

In this post, we explore a solution that addresses these challenges head-on using LoRA serving with Amazon SageMaker. By using the new performance optimizations of LoRA techniques in SageMaker large model inference (LMI) containers along with inference components, we demonstrate how organizations can efficiently manage and serve their growing portfolio of fine-tuned models, while optimizing costs and providing seamless performance for their customers. The latest SageMaker LMI container offers unmerged-LoRA inference, sped up with our LMI-Dist inference engine, and an OpenAI-style chat schema. To learn more about LMI, refer to LMI Starting Guide, LMI handlers Inference API Schema, and Chat Completions API Schema.

Build a serverless exam generator application from your own lecture content using Amazon Bedrock

Crafting new questions for exams and quizzes can be tedious and time-consuming for educators. The time required varies based on factors like subject matter, question types, experience level, and class level. Multiple-choice questions require substantial time to generate quality distractors and ensure a single unambiguous answer, and composing effective true-false questions demands careful effort to […]

Incorporate offline and online human – machine workflows into your generative AI applications on AWS

Recent advances in artificial intelligence have led to the emergence of generative AI that can produce human-like novel content such as images, text, and audio. These models are pre-trained on massive datasets and sometimes fine-tuned with smaller sets of more task-specific data. An important aspect of developing effective generative AI applications is Reinforcement […]

Transform customer engagement with no-code LLM fine-tuning using Amazon SageMaker Canvas and SageMaker JumpStart

Fine-tuning large language models (LLMs) creates tailored customer experiences that align with a brand’s unique voice. Amazon SageMaker Canvas and Amazon SageMaker JumpStart democratize this process, offering no-code solutions and pre-trained models that enable businesses to fine-tune LLMs without deep technical expertise, helping organizations move faster with fewer technical resources. SageMaker Canvas provides an intuitive […]

How Dialog Axiata used Amazon SageMaker to scale ML models in production with AI Factory and reduced customer churn within 3 months

The telecommunications industry is more competitive than ever before. With customers able to easily switch between providers, reducing customer churn is a crucial priority for telecom companies who want to stay ahead. To address this challenge, Dialog Axiata has pioneered a cutting-edge solution called the Home Broadband (HBB) Churn Prediction Model. This post explores the […]

Boost employee productivity with automated meeting summaries using Amazon Transcribe, Amazon SageMaker, and LLMs from Hugging Face

This post presents a solution to automatically generate a meeting summary from a recorded virtual meeting (for example, using Amazon Chime) with several participants. The recording is transcribed to text using Amazon Transcribe and then processed using Amazon SageMaker Hugging Face containers to generate the meeting summary. The Hugging Face containers host a large language model (LLM) from the Hugging Face Hub.

Simple guide to training Llama 2 with AWS Trainium on Amazon SageMaker

Large language models (LLMs) are making a significant impact in the realm of artificial intelligence (AI). Their impressive generative abilities have led to widespread adoption across various sectors and use cases, including content generation, sentiment analysis, chatbot development, and virtual assistant technology. Llama 2 by Meta is an example of an LLM offered by AWS. Llama […]

Deploy a Hugging Face (PyAnnote) speaker diarization model on Amazon SageMaker as an asynchronous endpoint

Speaker diarization, an essential process in audio analysis, segments an audio file based on speaker identity. This post delves into integrating Hugging Face’s PyAnnote for speaker diarization with Amazon SageMaker asynchronous endpoints. We provide a comprehensive guide on how to deploy speaker segmentation and clustering solutions using SageMaker on the AWS Cloud.

Enhance conversational AI with advanced routing techniques with Amazon Bedrock

Conversational artificial intelligence (AI) assistants are engineered to provide precise, real-time responses through intelligent routing of queries to the most suitable AI functions. With AWS generative AI services like Amazon Bedrock, developers can create systems that expertly manage and respond to user requests. Amazon Bedrock is a fully managed service that offers a choice of […]

Improve accuracy of Amazon Rekognition Face Search with user vectors

In various industries, such as financial services, telecommunications, and healthcare, customers use a digital identity process, which usually involves several steps to verify end-users during online onboarding or step-up authentication. An example of one step that can be used is face search, which can help determine whether a new end-user’s face matches those associated with […]