AWS Machine Learning Blog
Category: Thought Leadership
Build a generative AI enabled virtual IT troubleshooting assistant using Amazon Q Business
Discover how to build a generative AI-powered virtual IT troubleshooting assistant using Amazon Q Business. This solution integrates with popular ITSM tools like ServiceNow, Atlassian Jira, and Confluence to streamline information retrieval and enhance collaboration across your organization. By harnessing the power of generative AI, this assistant can significantly boost operational efficiency and provide 24/7 support tailored to individual needs. Learn how to set up, configure, and use this solution to transform your enterprise information management.
Unleash AI innovation with Amazon SageMaker HyperPod
In this post, we show how SageMaker HyperPod, together with the new features introduced at AWS re:Invent 2024, is designed to meet the demands of modern AI workloads. It offers a persistent, optimized cluster tailored for distributed training and accelerated inference at cloud scale, with attractive price-performance.
Ground truth generation and review best practices for evaluating generative AI question-answering with FMEval
In this post, we discuss best practices for applying LLMs to generate ground truth for evaluating question-answering assistants with FMEval at enterprise scale. FMEval is a comprehensive evaluation suite from Amazon SageMaker Clarify that provides standardized implementations of metrics to assess quality and responsibility. To learn more about FMEval, see Evaluate large language models for quality and responsibility of LLMs.
LLM continuous self-instruct fine-tuning framework powered by a compound AI system on Amazon SageMaker
In this post, we present the continuous self-instruct fine-tuning framework as a compound AI system implemented with the DSPy framework. The framework first generates a synthetic dataset from the domain knowledge base and documents for self-instruction, then drives model fine-tuning through supervised fine-tuning (SFT). It then introduces a human-in-the-loop workflow to collect human and AI feedback on the model responses, which is used to further improve model performance by aligning the model with human preferences through reinforcement learning (RLHF/RLAIF).
Deploy DeepSeek-R1 distilled Llama models with Amazon Bedrock Custom Model Import
In this post, we demonstrate how to deploy distilled versions of DeepSeek-R1 models using Amazon Bedrock Custom Model Import. We focus on importing the currently supported variants, DeepSeek-R1-Distill-Llama-8B and DeepSeek-R1-Distill-Llama-70B, which offer an optimal balance between performance and resource efficiency.
Optimizing costs of generative AI applications on AWS
Optimizing costs of generative AI applications on AWS is critical for realizing the full potential of this transformative technology. The post outlines key cost optimization pillars, including model selection and customization, token usage, inference pricing plans, and vector database considerations.
From RAG to fabric: Lessons learned from building real-world RAGs at GenAIIC – Part 2
This post focuses on doing RAG on heterogeneous data formats. We first introduce routers and how they can help manage diverse data sources. We then give tips on how to handle tabular data and conclude with multimodal RAG, focusing specifically on solutions that handle both text and image data.
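To make the idea of a router concrete, here is a minimal sketch that sends a query to one of several data sources. It is an illustration only: the source names, the keyword lists, and the `route` function are hypothetical, and simple keyword matching stands in for the LLM or classifier call a real router would more likely use.

```python
import re
from typing import Dict, List

# Hypothetical keyword lists per data source (assumption, not from the post).
# A production router would typically ask an LLM or a trained classifier instead.
ROUTE_KEYWORDS: Dict[str, List[str]] = {
    "tables": ["average", "total", "sum", "count", "per"],
    "images": ["diagram", "image", "photo", "picture", "chart"],
}

DEFAULT_SOURCE = "text"  # fall back to plain text retrieval


def route(query: str) -> str:
    """Return the name of the data source whose keywords best match the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    best_source, best_score = DEFAULT_SOURCE, 0
    for source, keywords in ROUTE_KEYWORDS.items():
        score = len(words & set(keywords))
        if score > best_score:
            best_source, best_score = source, score
    return best_source


if __name__ == "__main__":
    for q in ["Total revenue per region?", "Show me the wiring diagram", "How do I reset my password?"]:
        print(q, "->", route(q))
```

Each route name would then map to its own retriever, for example a SQL query generator for tabular data or a multimodal index for images.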
Multilingual content processing using Amazon Bedrock and Amazon A2I
This post outlines a custom multilingual document extraction and content assessment framework using a combination of Anthropic’s Claude 3 on Amazon Bedrock and Amazon A2I to incorporate human-in-the-loop capabilities.
From RAG to fabric: Lessons learned from building real-world RAGs at GenAIIC – Part 1
In this post, we cover the core concepts behind RAG architectures and discuss strategies for evaluating RAG performance, both quantitatively through metrics and qualitatively by analyzing individual outputs. We outline several practical tips for improving text retrieval, including using hybrid search techniques, enhancing context through data preprocessing, and rewriting queries for better relevance.
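As one illustration of hybrid search, the sketch below merges a keyword ranking and a vector-similarity ranking with reciprocal rank fusion (RRF). RRF is a common fusion technique chosen here as an assumption; the post may combine results differently. The document IDs and the constant `k = 60` are illustrative.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document ID for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Illustrative results from a keyword search and a vector search.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why `doc_b` and `doc_a` come ahead of documents found by only one retriever.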
Best practices for building robust generative AI applications with Amazon Bedrock Agents – Part 1
In this post, we show you how to create accurate and reliable agents. Agents help you accelerate generative AI application development by orchestrating multistep tasks. They use the reasoning capability of foundation models (FMs) to break down user-requested tasks into multiple steps.