Artificial Intelligence

Mahadevan Balasubramaniam

Author: Mahadevan Balasubramaniam

Mahadevan Balasubramaniam has 24 years experience in the area of physics infused deep learning and building digital twins at scale for physical assets such as aircraft engines, industrial gas turbines and industrial process platforms. At AWS, he is a WWSO HPC Principal SA developing solutions for large-scale HPC+AI and ML Frameworks. Prior to joining AWS, Mahadevan was first at GE where he focused on probabilistic modeling, hardware design, anomaly detection, and remaining useful life predictions for a variety of applications across aviation, energy, and oil & gas. Mahadevan then joined as a Senior Principal Data Scientist for a startup where focused on deep learning based solar energy forecasting for managing the battery discharge in PV-battery installations. Dr. Balasubramaniam obtained his Ph.D. from MIT in 2001 where he studied computational geometry for automated toolpath generation for 5-axis NC machining.

Accelerate PyTorch with DeepSpeed to train large language models with Intel Habana Gaudi-based DL1 EC2 instances

Training large language models (LLMs) with billions of parameters can be challenging. In addition to designing the model architecture, researchers need to set up state-of-the-art training techniques for distributed training like mixed precision support, gradient accumulation, and checkpointing. With large models, the training setup is even more challenging because the available memory in a single […]