AWS Machine Learning Blog

Tag: Generative AI

Diagram depicting Amazon Bedrock Custom Model creation process using Terraform.

Streamline custom model creation and deployment for Amazon Bedrock with Provisioned Throughput using Terraform

As customers seek to incorporate their corpus of knowledge into their generative artificial intelligence (AI) applications, or to build domain-specific models, their data science teams often want to conduct A/B testing and have repeatable experiments. In this post, we discuss a solution that uses infrastructure as code (IaC) to define the process of retrieving and […]

Build a serverless exam generator application from your own lecture content using Amazon Bedrock

Crafting new questions for exams and quizzes can be tedious and time-consuming for educators. The time required varies based on factors like subject matter, question types, experience level, and class level. Multiple-choice questions require substantial time to generate quality distractors and ensure a single unambiguous answer, and composing effective true-false questions demands careful effort to […]

RAG architecture with Voyage AI embedding models on Amazon SageMaker JumpStart and Anthropic Claude 3 models

In this post, we provide an overview of the state-of-the-art embedding models by Voyage AI and show a RAG implementation with Voyage AI’s text embedding model on Amazon SageMaker Jumpstart, Anthropic’s Claude 3 model on Amazon Bedrock, and Amazon OpenSearch Service. Voyage AI’s embedding models are the preferred embedding models for Anthropic. In addition to general-purpose embedding models, Voyage AI offers domain-specific embedding models that are tuned to a particular domain.

Unleashing the power of generative AI: Verisk’s journey to an Instant Insight Engine for enhanced customer support

This post is co-written with Tom Famularo, Abhay Shah and Nicolette Kontor from Verisk. Verisk (Nasdaq: VRSK) is a leading data analytics and technology partner for the global insurance industry. Through advanced analytics, software, research, and industry expertise across over 20 countries, Verisk helps build resilience for individuals, communities, and businesses. The company is committed […]

Simple guide to training Llama 2 with AWS Trainium on Amazon SageMaker

Large language models (LLMs) are making a significant impact in the realm of artificial intelligence (AI). Their impressive generative abilities have led to widespread adoption across various sectors and use cases, including content generation, sentiment analysis, chatbot development, and virtual assistant technology. Llama2 by Meta is an example of an LLM offered by AWS. Llama […]

Scale AI training and inference for drug discovery through Amazon EKS and Karpenter

This is a guest post co-written with the leadership team of Iambic Therapeutics. Iambic Therapeutics is a drug discovery startup with a mission to create innovative AI-driven technologies to bring better medicines to cancer patients, faster. Our advanced generative and predictive artificial intelligence (AI) tools enable us to search the vast space of possible drug […]

Cost-effective document classification using the Amazon Titan Multimodal Embeddings Model

Organizations across industries want to categorize and extract insights from high volumes of documents of different formats. Manually processing these documents to classify and extract information remains expensive, error prone, and difficult to scale. Advances in generative artificial intelligence (AI) have given rise to intelligent document processing (IDP) solutions that can automate the document classification, […]

Build an internal SaaS service with cost and usage tracking for foundation models on Amazon Bedrock

In this post, we show you how to build an internal SaaS layer to access foundation models with Amazon Bedrock in a multi-tenant (team) architecture. We specifically focus on usage and cost tracking per tenant and also controls such as usage throttling per tenant. We describe how the solution and Amazon Bedrock consumption plans map to the general SaaS journey framework. The code for the solution and an AWS Cloud Development Kit (AWS CDK) template is available in the GitHub repository.

Architect defense-in-depth security for generative AI applications using the OWASP Top 10 for LLMs

This post provides three guided steps to architect risk management strategies while developing generative AI applications using LLMs. We first delve into the vulnerabilities, threats, and risks that arise from the implementation, deployment, and use of LLM solutions, and provide guidance on how to start innovating with security in mind. We then discuss how building on a secure foundation is essential for generative AI. Lastly, we connect these together with an example LLM workload to describe an approach towards architecting with defense-in-depth security across trust boundaries.

Amazon SageMaker model parallel library now accelerates PyTorch FSDP workloads by up to 20%

Large language model (LLM) training has surged in popularity over the last year with the release of several popular models such as Llama 2, Falcon, and Mistral. Customers are now pre-training and fine-tuning LLMs ranging from 1 billion to over 175 billion parameters to optimize model performance for applications across industries, from healthcare to finance […]