Amazon Web Services

In this video, AWS Machine Learning specialist Emily Webber explores options for deploying foundation models on AWS, focusing on Amazon SageMaker. She covers online, offline, queued, embedded, and serverless application types and explains their tradeoffs. The video demonstrates how to host distributed models across multiple accelerators and how to improve performance through techniques such as model compression. Emily provides a hands-on walkthrough of deploying a 175-billion-parameter BLOOM model using SageMaker's large model inference (LMI) container, discusses key concepts such as tensor parallelism, and offers practical tips for efficient model deployment and serving. The video concludes with a demo of invoking the deployed model for inference.
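The deployment pattern described above, hosting a sharded model with SageMaker's large model inference (LMI) container, is typically driven by a serving.properties file packaged with the model. A minimal sketch follows; the engine, model ID, and parallel degree shown are illustrative assumptions, not the exact values used in the video:

```
# Illustrative serving.properties for the SageMaker LMI container.
# All values are assumptions for sketch purposes.
engine=DeepSpeed
option.model_id=bigscience/bloom
# Shard each layer's weights across 8 accelerators (tensor parallelism)
option.tensor_parallel_degree=8
```

The tensor_parallel_degree setting splits each layer's weight matrices across the stated number of accelerators, which is what makes it possible to serve a model too large to fit on a single GPU.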

product-information
skills-and-how-to
generative-ai
ai-ml
compute

Up Next

Builders Online Series | Connecting Amazon VPC to an On-Premises Network (30:02) - Jun 27, 2025

Builders Online Series | Is Your Architecture Well-Architected? (26:52) - Jun 27, 2025

Building Applications Easily with Amazon ECS, a Fully Managed Container Service - AWS TechCamp (28:50) - Jun 26, 2025

Building a Web Application with Core AWS Services, Starting from the Basics - AWS TechCamp (18:39) - Jun 26, 2025

Summarizing Product Reviews and Creating Short-Form Videos with Amazon Bedrock - AWS TechCamp (18:56) - Jun 26, 2025