Artificial Intelligence
Category: Amazon SageMaker
Llama 3.1 models are now available in Amazon SageMaker JumpStart
Today, we are excited to announce that the state-of-the-art Llama 3.1 collection of multilingual large language models (LLMs), which includes pre-trained and instruction tuned generative AI models in 8B, 70B, and 405B sizes, is available through Amazon SageMaker JumpStart to deploy for inference. Llama is a publicly accessible LLM designed for developers, researchers, and businesses to build, experiment, and responsibly scale their generative artificial intelligence (AI) ideas. In this post, we walk through how to discover and deploy Llama 3.1 models using SageMaker JumpStart.
Amazon SageMaker unveils the Cohere Command R fine-tuning model
AWS announced the availability of the Cohere Command R fine-tuning model on Amazon SageMaker. This latest addition to the SageMaker suite of machine learning (ML) capabilities empowers enterprises to harness the power of large language models (LLMs) and unlock their full potential for a wide range of applications. Cohere Command R is a scalable, frontier […]
Governing the ML lifecycle at scale, Part 2: Multi-account foundations
Your multi-account strategy is the core of your foundational environment on AWS. Design decisions around your multi-account environment are critical for operating securely at scale. Grouping your workloads strategically into multiple AWS accounts enables you to apply different controls across workloads, track cost and usage, reduce the impact of account limits, and mitigate the complexity […]
Improve RAG accuracy with fine-tuned embedding models on Amazon SageMaker
This post demonstrates how to use Amazon SageMaker to fine tune a Sentence Transformer embedding model and deploy it with an Amazon SageMaker Endpoint. The code from this post and more examples are available in the GitHub repo.
How BRIA AI used distributed training in Amazon SageMaker to train latent diffusion foundation models for commercial use
This post is co-written with Bar Fingerman from BRIA AI. This post explains how BRIA AI trained BRIA AI 2.0, a high-resolution (1024×1024) text-to-image diffusion model, on a dataset comprising petabytes of licensed images quickly and economically. Amazon SageMaker training jobs and Amazon SageMaker distributed training libraries took on the undifferentiated heavy lifting associated with infrastructure […]
Create custom images for geospatial analysis with Amazon SageMaker Distribution in Amazon SageMaker Studio
This post shows you how to extend Amazon SageMaker Distribution with additional dependencies to create a custom container image tailored for geospatial analysis. Although the example in this post focuses on geospatial data science, the methodology presented can be applied to any kind of custom image based on SageMaker Distribution.
Empowering everyone with GenAI to rapidly build, customize, and deploy apps securely: Highlights from the AWS New York Summit
See how AWS is democratizing generative AI with innovations like Amazon Q Apps to make AI apps from prompts, Amazon Bedrock upgrades to leverage more data sources, new techniques to curtail hallucinations, and AI skills training.
Achieve up to ~2x higher throughput while reducing costs by up to ~50% for generative AI inference on Amazon SageMaker with the new inference optimization toolkit – Part 2
As generative artificial intelligence (AI) inference becomes increasingly critical for businesses, customers are seeking ways to scale their generative AI operations or integrate generative AI models into existing workflows. Model optimization has emerged as a crucial step, allowing organizations to balance cost-effectiveness and responsiveness, improving productivity. However, price-performance requirements vary widely across use cases. For […]
Achieve up to ~2x higher throughput while reducing costs by ~50% for generative AI inference on Amazon SageMaker with the new inference optimization toolkit – Part 1
Today, Amazon SageMaker announced a new inference optimization toolkit that helps you reduce the time it takes to optimize generative artificial intelligence (AI) models from months to hours, to achieve best-in-class performance for your use case. With this new capability, you can choose from a menu of optimization techniques, apply them to your generative AI […]
The Weather Company enhances MLOps with Amazon SageMaker, AWS CloudFormation, and Amazon CloudWatch
In this post, we share the story of how The Weather Company (TWCo) enhanced its MLOps platform using services such as Amazon SageMaker, AWS CloudFormation, and Amazon CloudWatch. TWCo data scientists and ML engineers took advantage of automation, detailed experiment tracking, integrated training, and deployment pipelines to help scale MLOps effectively. TWCo reduced infrastructure management time by 90% while also reducing model deployment time by 20%.









