AWS Gen AI Loft | Model Hosting on Amazon SageMaker Masterclass
AI
AWS GenAI Loft | Bengaluru
Generative AI
Machine Learning
SageMaker
IN PERSON
Language: English
Level: 300 – Advanced, 400 – Expert
Speakers
Sahil Verma | Sr. GTM GenAI Specialist Solutions Architect, AWS
Are you a machine learning (ML) engineer or platform architect looking to master production-grade model deployment and serving infrastructure? Join us for a comprehensive hands-on workshop exploring modern model serving frameworks and Amazon SageMaker's advanced hosting capabilities to build scalable, cost-effective ML inference systems.
Model serving is the critical bridge between trained models and real-world applications, yet many organizations struggle with choosing the right frameworks, optimizing performance, and managing costs at scale. This workshop combines theoretical foundations with practical implementation, covering popular serving frameworks alongside Amazon SageMaker's managed hosting solutions.
In this intensive session, you'll gain deep expertise in designing robust model serving architectures that can handle production workloads while maintaining high availability and cost efficiency. We'll explore multi-model endpoints, auto-scaling strategies, and advanced optimization techniques used by leading ML teams.
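As a taste of what the session covers: with a SageMaker multi-model endpoint, one endpoint hosts many model artifacts, and each request selects a model via the `TargetModel` field of `invoke_endpoint`. A minimal sketch of building such a request (the endpoint and artifact names below are hypothetical placeholders):

```python
import json

def build_invoke_request(endpoint_name: str, target_model: str, payload: dict) -> dict:
    """Build kwargs for sagemaker-runtime invoke_endpoint against a
    multi-model endpoint; TargetModel names the artifact to serve."""
    return {
        "EndpointName": endpoint_name,
        "TargetModel": target_model,  # e.g. "churn-model-v2.tar.gz" (hypothetical)
        "ContentType": "application/json",
        "Body": json.dumps(payload),
    }

# With boto3 (not executed here; requires AWS credentials):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(
#     **build_invoke_request("my-mme-endpoint", "churn-model-v2.tar.gz", {"x": [1, 2, 3]})
# )
```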
Who is this for? This workshop is ideal for:
- ML Engineers and MLOps professionals building production inference systems
- Platform architects designing scalable ML infrastructure
- Data scientists transitioning models from research to production
- DevOps engineers managing ML serving infrastructure
- Technical leads evaluating model deployment strategies
- Software engineers integrating ML models into applications
- Solutions architects designing end-to-end ML platforms
During this hands-on workshop, you'll master the art and science of model serving, from framework selection to production deployment strategies. You'll work with real models and scenarios to understand the nuances of different serving approaches.
Key highlights:
- Deep dive into popular model serving frameworks: TorchServe, TensorFlow Serving, Triton Inference Server, and Ray Serve
- Hands-on implementation of multi-model endpoints and batching strategies
- Master Amazon SageMaker's hosting options: Real-time endpoints, Serverless inference, and Batch transform
- Advanced optimization techniques including model compilation, quantization, and caching
- Cost optimization strategies: Auto-scaling, spot instances, and resource allocation
- Performance monitoring and troubleshooting production serving issues
- A/B testing and traffic routing for model deployments
- Security best practices for model serving in production environments
- Real-world case studies from high-scale ML deployments
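The batching strategies listed above can be illustrated with a framework-agnostic sketch: a server accumulates queued requests and runs one batched forward pass per group instead of one pass per request. The function and threshold below are illustrative, not tied to any particular serving framework:

```python
def form_batches(requests, max_batch_size=8):
    """Group queued requests into batches of at most max_batch_size.
    Real servers also enforce a max-wait timeout so small batches
    aren't starved; that timer is omitted here for brevity."""
    batches, current = [], []
    for req in requests:
        current.append(req)
        if len(current) == max_batch_size:
            batches.append(current)
            current = []
    if current:  # flush the final partial batch
        batches.append(current)
    return batches

# 10 queued requests with a cap of 4 -> batch sizes [4, 4, 2]
print([len(b) for b in form_batches(list(range(10)), max_batch_size=4)])
```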
This workshop is specifically designed for teams building production ML systems who need to make informed decisions about serving architecture. You'll leave with practical experience and a toolkit of strategies for deploying models at scale.
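One of the optimization techniques covered, quantization, trades a small amount of precision for smaller models and faster inference. A toy symmetric int8 sketch (illustrative only; production work uses framework tooling such as PyTorch's quantization APIs):

```python
def quantize_int8(values):
    """Map floats onto integers in [-127, 127] using one shared scale factor."""
    scale = max(abs(v) for v in values) / 127.0
    return [round(v / scale) for v in values], scale

def dequantize(quantized, scale):
    """Recover approximate floats; per-value error is at most ~scale/2."""
    return [q * scale for q in quantized]

q, s = quantize_int8([0.5, -1.0, 0.25])
approx = dequantize(q, s)  # close to the originals, within ~scale/2
```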
Prerequisites:
- Laptop with adequate specifications for hands-on exercises
- Strong understanding of machine learning concepts and model training
- Proficiency in Python programming and familiarity with ML frameworks (PyTorch, TensorFlow)
- AWS account access with appropriate permissions
- Experience with containerization (Docker) and basic DevOps concepts
- Understanding of REST APIs and web service architectures
- Basic knowledge of cloud computing and infrastructure concepts
By registering, you agree to the AWS Event Terms & Conditions and AWS Code of Conduct.