AWS Machine Learning Blog
Category: Announcements
Amazon Q Apps supports customization and governance of generative AI-powered apps
In this post, we examine how these features enhance the capabilities of Amazon Q Apps. We explore the new customization options, detailing how these advancements make Amazon Q Apps more accessible and applicable to a wider range of enterprise customers. We focus on key features such as custom labels, verified apps, private sharing, and data collection apps (preview).
Pixtral 12B is now available on Amazon SageMaker JumpStart
Today, we are excited to announce that Pixtral 12B (pixtral-12b-2409), a state-of-the-art vision language model (VLM) from Mistral AI that excels in both text-only and multimodal tasks, is available for customers through Amazon SageMaker JumpStart. You can try this model with SageMaker JumpStart, a machine learning (ML) hub that provides access to algorithms and models that can be deployed with one click for running inference. In this post, we walk through how to discover, deploy, and use the Pixtral 12B model for a variety of real-world vision use cases.
Mistral-NeMo-Instruct-2407 and Mistral-NeMo-Base-2407 are now available on SageMaker JumpStart
Today, we are excited to announce that Mistral-NeMo-Base-2407 and Mistral-NeMo-Instruct-2407, large language models from Mistral AI that excel at text generation, are available for customers through Amazon SageMaker JumpStart. In this post, we walk through how to discover, deploy, and use the Mistral-NeMo-Instruct-2407 and Mistral-NeMo-Base-2407 models for a variety of real-world use cases.
Advancing AI trust with new responsible AI tools, capabilities, and resources
With trust as a cornerstone of AI adoption, we are excited to announce at AWS re:Invent 2024 new responsible AI tools, capabilities, and resources that enhance the safety, security, and transparency of our AI services and models and help support customers’ own responsible AI journeys.
Speed up your cluster procurement time with Amazon SageMaker HyperPod training plans
In this post, we explore how Amazon SageMaker HyperPod training plans accelerate compute resource procurement for machine learning workloads. We guide you step by step through using the AWS CLI or the AWS Management Console to find, review, and create optimal training plans for your specific compute and timeline needs. We then show how to use the training plan to submit SageMaker training jobs or create SageMaker HyperPod clusters.
Amazon Bedrock Marketplace now includes NVIDIA models: Introducing NVIDIA Nemotron-4 NIM microservices
At AWS re:Invent 2024, we are excited to introduce Amazon Bedrock Marketplace. This is a revolutionary new capability within Amazon Bedrock that serves as a centralized hub for discovering, testing, and implementing foundation models (FMs). In this post, we discuss the advantages and capabilities of Amazon Bedrock Marketplace and Nemotron models, and how to get started.
Use Amazon Bedrock tooling with Amazon SageMaker JumpStart models
In this post, we explore how to deploy AI models from SageMaker JumpStart and use them with Amazon Bedrock’s powerful features. Users can combine SageMaker JumpStart’s model hosting with Bedrock’s security and monitoring tools. We demonstrate this using the Gemma 2 9B Instruct model as an example, showing how to deploy it and use Bedrock’s advanced capabilities.
Amazon SageMaker launches the updated inference optimization toolkit for generative AI
Today, Amazon SageMaker is excited to announce updates to the inference optimization toolkit, providing new functionality and enhancements to help you optimize generative AI models even faster. In this post, we discuss these new features of the toolkit in more detail.
Speed up your AI inference workloads with new NVIDIA-powered capabilities in Amazon SageMaker
At re:Invent 2024, we are excited to announce new capabilities to speed up your AI inference workloads with NVIDIA accelerated computing and software offerings on Amazon SageMaker. In this post, we explore how you can use these new capabilities to enhance your AI inference on Amazon SageMaker. We walk through the process of deploying NVIDIA NIM microservices from AWS Marketplace for SageMaker Inference. We then dive into NVIDIA’s model offerings on SageMaker JumpStart, showing how to access and deploy the Nemotron-4 model directly in the JumpStart interface. This includes step-by-step instructions on how to find the Nemotron-4 model in the JumpStart catalog, select it for your use case, and deploy it with a few clicks.
Supercharge your auto scaling for generative AI inference – Introducing Container Caching in SageMaker Inference
Today at AWS re:Invent 2024, we are excited to announce the new Container Caching capability in Amazon SageMaker, which significantly reduces the time required to scale generative AI models for inference. This innovation allows you to scale your models faster, with up to a 56% reduction in latency when scaling a new model copy and up to a 30% reduction when adding a model copy on a new instance. In this post, we explore the new Container Caching feature for SageMaker inference, addressing the challenges of deploying and scaling large language models (LLMs).