Amazon SageMaker HyperPod now supports NVIDIA Multi-Instance GPU (MIG) for generative AI tasks
Amazon SageMaker HyperPod now supports NVIDIA Multi-Instance GPU (MIG) technology, enabling administrators to partition a single GPU into multiple isolated GPUs. This capability allows administrators to maximize resource utilization by running diverse, small generative AI (GenAI) tasks simultaneously on GPU partitions while maintaining performance and task isolation.
Administrators can choose either the easy-to-use configuration setup on the SageMaker HyperPod console or a custom setup approach to enable fine-grained, hardware-isolated resources for specific task requirements that don't require full GPU capacity. They can also allocate compute quota to ensure fair and efficient distribution of GPU partitions across teams. With real-time performance metrics and resource utilization monitoring dashboard across GPU partitions, administrators gain visibility to optimize resource allocation. Data scientists can now accelerate time-to-market by scheduling lightweight inference tasks and running interactive notebooks in parallel on GPU partitions, eliminating wait times for full GPU availability.
This capability is currently available for Amazon SageMaker HyperPod clusters using the EKS orchestrator across the following AWS Regions: US West (Oregon), US East (N.Virginia), US East (Ohio), US West (N. California), Canada (Central), South America (Sao Paulo), Europe (Stockholm), Europe (Spain), Europe (Ireland), Europe (Frankfurt), Europe (London), Asia Pacific (Mumbai), Asia Pacific (Jakarta), Asia Pacific (Melbourne), Asia Pacific (Tokyo), Asia Pacific (Sydney), Asia Pacific (Seoul), Asia Pacific (Singapore).
To learn more, visit SageMaker HyperPod webpage, and SageMaker HyperPod documentation.