Data Preparation for Generative AI Training

Amazon Web Services

In this video, AWS generative AI expert Emily Webber demonstrates how to prepare data and train models at scale on Amazon Web Services. She covers multiple options for data preparation, including S3 buckets, ECR images, FSx for Lustre, and SageMaker. Webber explains how to set up distributed file systems, use SageMaker warm pools for efficient development, and scale up training runs. The video includes a hands-on walkthrough of creating SageMaker warm pools and using them with FSx for Lustre, as well as troubleshooting tips for large-scale distributed training. Viewers will learn how to optimize their workflow for training foundation models and generative AI systems on AWS infrastructure.

product-information

skills-and-how-to

generative-ai

ai-ml

gen-ai
