Amazon Web Services

In this video, AWS Machine Learning specialist Emily Webber explores options for deploying foundation models on AWS, focusing on Amazon SageMaker. She covers online, offline, queued, embedded, and serverless application types and explains their tradeoffs. The video demonstrates how to host distributed models across multiple accelerators and how to optimize performance with techniques such as model compression. Emily walks through deploying a 175-billion-parameter BLOOM model using SageMaker's large model inference container, discusses key concepts like tensor parallelism, and offers practical tips for efficient model deployment and serving. The video concludes with a demo of invoking the deployed model for inference.
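The demo ends by invoking the deployed endpoint for inference. A minimal sketch of what such an invocation might look like with boto3 is below; the endpoint name and the request/response schema are assumptions for illustration (the exact payload shape depends on the serving handler configured in the large model inference container), not details taken from the video.

```python
import json


def build_payload(prompt, max_new_tokens=64):
    """Build a JSON request body in the shape many text-generation
    containers accept; the exact schema depends on the serving handler."""
    return json.dumps(
        {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    )


def invoke(endpoint_name, prompt):
    """Call a deployed SageMaker endpoint. Requires AWS credentials and a
    live endpoint; the endpoint name here is hypothetical."""
    import boto3  # imported lazily so payload helpers work without boto3

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(prompt),
    )
    return response["Body"].read().decode("utf-8")


if __name__ == "__main__":
    # Inspect the request body without touching AWS.
    print(build_payload("Hello, BLOOM"))
```

Separating payload construction from the network call keeps the request format testable locally before paying for time on a multi-accelerator endpoint.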

product-information
skills-and-how-to
generative-ai
ai-ml
compute

Up Next

- Build a Web Application with AWS Amplify (Level 200) - 8:42, Jun 26, 2025
- How to Create an Amazon Machine Image (AMI) (Level 200) - 4:38, Jun 26, 2025
- Database Migration with AWS DMS and AWS SCT (Level 200) - 8:03, Jun 26, 2025
- Getting Started with Serverless Technology Using AWS Lambda (Level 200) - 8:24, Jun 26, 2025
- How to Set Up and Use Amazon WorkSpaces (Level 200) - 7:52, Jun 26, 2025