AWS Machine Learning Blog
Category: Generative AI
Accelerating insurance policy reviews with generative AI: Verisk’s Mozart companion
This post is co-authored with Sundeep Sardana, Malolan Raman, Joseph Lam, Maitri Shah and Vaibhav Singh from Verisk. Verisk (Nasdaq: VRSK) is a leading strategic data analytics and technology partner to the global insurance industry, empowering clients to strengthen operating efficiency, improve underwriting and claims outcomes, combat fraud, and make informed decisions about global risks. […]
Build a Multi-Agent System with LangGraph and Mistral on AWS
In this post, we explore how to use LangGraph and Mistral models on Amazon Bedrock to create a powerful multi-agent system that can handle sophisticated workflows through collaborative problem-solving. This integration enables the creation of AI agents that can work together to solve complex problems, mimicking humanlike reasoning and collaboration.
Ground truth generation and review best practices for evaluating generative AI question-answering with FMEval
In this post, we discuss best practices for applying LLMs to generate ground truth for evaluating question-answering assistants with FMEval on an enterprise scale. FMEval is a comprehensive evaluation suite from Amazon SageMaker Clarify, and provides standardized implementations of metrics to assess quality and responsibility. To learn more about FMEval, see Evaluate large language models for quality and responsibility of LLMs.
Accelerate AWS Well-Architected reviews with Generative AI
In this post, we explore a generative AI solution that uses Amazon Bedrock to streamline the AWS Well-Architected Framework Review (WAFR) process. We demonstrate how to use LLMs to build an intelligent, scalable system that analyzes architecture documents and generates recommendations based on AWS Well-Architected best practices. This solution automates portions of the WAFR report creation, helping solutions architects improve the efficiency and thoroughness of architectural assessments while supporting their decision-making process.
Customize DeepSeek-R1 distilled models using Amazon SageMaker HyperPod recipes – Part 1
In this two-part series, we discuss how you can reduce the complexity of customizing DeepSeek models by using the pre-built fine-tuning workflows (also called “recipes”) for both the DeepSeek-R1 model and its distilled variations, released as part of Amazon SageMaker HyperPod recipes. In this first post, we build a solution architecture for fine-tuning DeepSeek-R1 distilled models and demonstrate the approach with a step-by-step example of customizing the DeepSeek-R1 Distill Qwen 7b model using recipes, achieving an average of 25% on all the ROUGE scores, with a maximum of 49% on the ROUGE-2 score, with both SageMaker HyperPod and SageMaker training jobs. The second part of the series will focus on fine-tuning the DeepSeek-R1 671b model itself.
Reduce conversational AI response time through inference at the edge with AWS Local Zones
This guide demonstrates how to deploy an open source foundation model from Hugging Face on Amazon EC2 instances across three locations: a commercial AWS Region and two AWS Local Zones. Through comparative benchmarking tests, we illustrate how deploying foundation models in Local Zones closer to end users can significantly reduce latency—a critical factor for real-time applications such as conversational AI assistants.
Pixtral-12B-2409 is now available on Amazon Bedrock Marketplace
In this post, we walk through how to discover, deploy, and use the Mistral AI Pixtral 12B model for a variety of real-world vision use cases.
Streamline work insights with the Amazon Q Business connector for Smartsheet
This post explains how to integrate Smartsheet with Amazon Q Business to use natural language and generative AI capabilities for enhanced insights. Smartsheet, the AI-enhanced enterprise-grade work management platform, helps users manage projects, programs, and processes at scale.
Level up your problem-solving and strategic thinking skills with Amazon Bedrock
In this post, we show how Anthropic’s Claude 3.5 Sonnet in Amazon Bedrock can be used for a variety of business-related cognitive tasks, such as problem-solving, critical thinking, and ideation, to help augment human thinking, improve decision-making among knowledge workers, and accelerate innovation.
Evaluate healthcare generative AI applications using LLM-as-a-judge on AWS
In this post, we demonstrate how to implement an LLM-as-a-judge evaluation framework using Amazon Bedrock, compare the performance of different generator models, including Anthropic’s Claude and Amazon Nova on Amazon Bedrock, and showcase how to use the new RAG evaluation feature to optimize knowledge base parameters and assess retrieval quality.