Amazon Web Services

In this video, AWS machine learning specialist Emily Webber explores options for deploying foundation models on AWS, with a focus on Amazon SageMaker. She covers online, offline, queued, embedded, and serverless application types and explains their tradeoffs. The video shows how to host a model distributed across multiple accelerators and how to improve performance with techniques such as model compression. Emily walks through deploying a 175-billion-parameter BLOOM model using SageMaker's large model inference container, discusses key concepts such as tensor parallelism, and offers practical tips for efficient model deployment and serving. The video concludes with a demo of invoking the deployed model for inference.
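
To give a rough sense of why a model of this size has to be spread across several accelerators, here is a minimal back-of-the-envelope sketch. It is not taken from the video: the function names, the 2 bytes per parameter (fp16), the 80 GB accelerator size, and the 80% usable-memory fraction are all illustrative assumptions, and the estimate counts weights only, ignoring activations and caches.

```python
import math


def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_tensor_parallel_degree(
    num_params: float,
    accelerator_memory_gb: float,
    bytes_per_param: int = 2,
    usable_fraction: float = 0.8,  # assumption: headroom left for activations etc.
) -> int:
    """Smallest accelerator count whose combined usable memory holds the weights."""
    needed_gb = weight_memory_gb(num_params, bytes_per_param)
    usable_gb = accelerator_memory_gb * usable_fraction
    return math.ceil(needed_gb / usable_gb)


# 175 billion parameters, as quoted in the description above.
bloom_params = 175e9

# Hypothetical 80 GB accelerators; fp16 weights (2 bytes per parameter).
print(weight_memory_gb(bloom_params))
print(min_tensor_parallel_degree(bloom_params, accelerator_memory_gb=80))

# Compressing weights to 1 byte per parameter lowers the requirement.
print(min_tensor_parallel_degree(bloom_params, 80, bytes_per_param=1))
```

Under these assumptions the weights alone exceed what a single accelerator can hold, which is the motivation for sharding them with tensor parallelism, and compressing the weights reduces the number of devices needed. In practice the degree is usually rounded up to a device count that the instance type and model architecture support.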

product-information
skills-and-how-to
generative-ai
ai-ml
compute
