AWS Machine Learning Blog
Category: Thought Leadership
LLM continuous self-instruct fine-tuning framework powered by a compound AI system on Amazon SageMaker
In this post, we present the continuous self-instruct fine-tuning framework as a compound AI system implemented with the DSPy framework. The framework first generates a synthetic dataset from the domain knowledge base and documents for self-instruction, then drives model fine-tuning through supervised fine-tuning (SFT). It also introduces a human-in-the-loop workflow that collects human and AI feedback on the model responses, which is used to further improve model performance by aligning the model with human preferences through reinforcement learning (RLHF/RLAIF).
Deploy DeepSeek-R1 distilled Llama models with Amazon Bedrock Custom Model Import
In this post, we demonstrate how to deploy distilled versions of DeepSeek-R1 models using Amazon Bedrock Custom Model Import. We focus on importing the currently supported variants, DeepSeek-R1-Distill-Llama-8B and DeepSeek-R1-Distill-Llama-70B, which balance performance and resource efficiency.
Optimizing costs of generative AI applications on AWS
Optimizing costs of generative AI applications on AWS is critical for realizing the full potential of this transformative technology. The post outlines key cost optimization pillars, including model selection and customization, token usage, inference pricing plans, and vector database considerations.
From RAG to fabric: Lessons learned from building real-world RAGs at GenAIIC – Part 2
This post focuses on doing RAG on heterogeneous data formats. We first introduce routers and how they can help manage diverse data sources. We then give tips on how to handle tabular data and conclude with multimodal RAG, focusing specifically on solutions that handle both text and image data.
Multilingual content processing using Amazon Bedrock and Amazon A2I
This post outlines a custom multilingual document extraction and content assessment framework using a combination of Anthropic’s Claude 3 on Amazon Bedrock and Amazon A2I to incorporate human-in-the-loop capabilities.
From RAG to fabric: Lessons learned from building real-world RAGs at GenAIIC – Part 1
In this post, we cover the core concepts behind RAG architectures and discuss strategies for evaluating RAG performance, both quantitatively through metrics and qualitatively by analyzing individual outputs. We outline several practical tips for improving text retrieval, including using hybrid search techniques, enhancing context through data preprocessing, and rewriting queries for better relevance.
Best practices for building robust generative AI applications with Amazon Bedrock Agents – Part 1
In this post, we show you how to create accurate and reliable agents. Agents help you accelerate generative AI application development by orchestrating multistep tasks. They use the reasoning capability of foundation models (FMs) to break down user-requested tasks into multiple steps.
AWS recognized as a first-time Leader in the 2024 Gartner Magic Quadrant for Data Science and Machine Learning Platforms
AWS has been recognized as a Leader in the 2024 Gartner Magic Quadrant for Data Science and Machine Learning Platforms. The post highlights how AWS’s continued innovations in services like Amazon Bedrock and Amazon SageMaker have enabled organizations to unlock the transformative potential of generative AI.
How healthcare payers and plans can empower members with generative AI
In this post, we discuss how generative artificial intelligence (AI) can help health insurance plan members get the information they need. The solution presented in this post not only enhances the member experience by providing a more intuitive and user-friendly interface, but also has the potential to reduce call volumes and operational costs for healthcare payers and plans.
Enabling complex generative AI applications with Amazon Bedrock Agents
In this post, we take a closer look at Amazon Bedrock Agents. They empower you to build intelligent and context-aware generative AI applications, streamlining complex workflows and delivering natural, conversational user experiences.