Pranav Murthy | Artificial Intelligence

Accelerating LLM inference with post-training weight and activation using AWQ and GPTQ on Amazon SageMaker AI

Quantized models can be seamlessly deployed on Amazon SageMaker AI using a few lines of code. In this post, we explore why quantization matters—how it enables lower-cost inference, supports deployment on resource-constrained hardware, and reduces both the financial and environmental impact of modern LLMs, while preserving most of their original performance. We also take a deep dive into the principles behind PTQ and demonstrate how to quantize the model of your choice and deploy it on Amazon SageMaker.

Architecting TrueLook’s AI-powered construction safety system on Amazon SageMaker AI

This post provides a detailed architectural overview of how TrueLook built its AI-powered safety monitoring system using SageMaker AI, highlighting key technical decisions, pipeline design patterns, and MLOps best practices. You will gain valuable insights into designing scalable computer vision solutions on AWS, particularly around model training workflows, automated pipeline creation, and production deployment strategies for real-time inference.

Introducing SOCI indexing for Amazon SageMaker Studio: Faster container startup times for AI/ML workloads

Today, we are excited to introduce a new feature for SageMaker Studio: SOCI (Seekable Open Container Initiative) indexing. SOCI supports lazy loading of container images, where only the necessary parts of an image are downloaded initially rather than the entire container.

Fine-tune OpenAI GPT-OSS models on Amazon SageMaker AI using Hugging Face libraries

Released on August 5, 2025, OpenAI’s GPT-OSS models, gpt-oss-20b and gpt-oss-120b, are now available on AWS through Amazon SageMaker AI and Amazon Bedrock. In this post, we walk through the process of fine-tuning a GPT-OSS model in a fully managed training environment using SageMaker AI training jobs.

Optimize hosting DeepSeek-R1 distilled models with Hugging Face TGI on Amazon SageMaker AI

In this post, we demonstrate how to optimize hosting DeepSeek-R1 distilled models with Hugging Face Text Generation Inference (TGI) on Amazon SageMaker AI.

Explore data with ease: Use SQL and Text-to-SQL in Amazon SageMaker Studio JupyterLab notebooks

Amazon SageMaker Studio provides a fully managed solution for data scientists to interactively build, train, and deploy machine learning (ML) models. In the process of working on their ML tasks, data scientists typically start their workflow by discovering relevant data sources and connecting to them. They then use SQL to explore, analyze, visualize, and integrate […]

Boost productivity on Amazon SageMaker Studio: Introducing JupyterLab Spaces and generative AI tools

Amazon SageMaker Studio offers a broad set of fully managed integrated development environments (IDEs) for machine learning (ML) development, including JupyterLab, Code Editor based on Code-OSS (Visual Studio Code Open Source), and RStudio. It provides access to the most comprehensive set of tools for each step of ML development, from preparing data to building, training, […]

Unlocking efficiency: Harnessing the power of Selective Execution in Amazon SageMaker Pipelines

MLOps is a key discipline that often oversees the path to productionizing machine learning (ML) models. It’s natural to focus on a single model that you want to train and deploy. However, in reality, you’ll likely work with dozens or even hundreds of models, and the process may involve multiple complex steps. Therefore, it’s important […]

Artificial Intelligence

Author: Pranav Murthy

Accelerating LLM inference with post-training weight and activation using AWQ and GPTQ on Amazon SageMaker AI

Architecting TrueLook’s AI-powered construction safety system on Amazon SageMaker AI

Introducing SOCI indexing for Amazon SageMaker Studio: Faster container startup times for AI/ML workloads

Fine-tune OpenAI GPT-OSS models on Amazon SageMaker AI using Hugging Face libraries

Optimize hosting DeepSeek-R1 distilled models with Hugging Face TGI on Amazon SageMaker AI

Explore data with ease: Use SQL and Text-to-SQL in Amazon SageMaker Studio JupyterLab notebooks

Boost productivity on Amazon SageMaker Studio: Introducing JupyterLab Spaces and generative AI tools

Unlocking efficiency: Harnessing the power of Selective Execution in Amazon SageMaker Pipelines

Learn

Resources

Developers

Help