AWS Machine Learning Blog
AWS Deep Learning Challenge sees innovative and impactful use of Amazon EC2 DL1 instances
In the AWS Deep Learning Challenge, held from January 5, 2022, to March 1, 2022, participants from academia, startups, and enterprise organizations tested their skills by training a deep learning model of their choice using Amazon Elastic Compute Cloud (Amazon EC2) DL1 instances and Habana’s SynapseAI SDK. EC2 DL1 instances, powered by Gaudi accelerators from Habana Labs, an Intel company, are designed specifically for training deep learning models. Participants were able to realize the significant price/performance benefits that DL1 offers over GPU-based instances.
We are excited to announce the winners and showcase some of the machine learning (ML) models that were trained in this hackathon. You will learn about some of the deep learning use cases that are supported by EC2 DL1 instances, including computer vision, natural language processing, and acoustic modeling.
Winning models
Our first-place winner is a project submitted by Gustavo Zomer: an implementation of multilingual CLIP (Contrastive Language-Image Pre-Training). OpenAI introduced CLIP in 2021 as a way to train a more generalizable image classifier across larger datasets through self-supervised learning. CLIP is trained on a large set of images paired with the wide variety of natural language supervision that’s abundantly available on the internet, but it is limited to the English language. This project replaces the text encoder in CLIP with a multilingual text encoder called XLM-RoBERTa to broaden the model’s applicability to multiple languages. The modified implementation can pair images with captions across multiple languages. The model was trained on 16 accelerators across two DL1 instances, showing how ML training can scale across multiple Gaudi accelerators and multiple nodes to increase training throughput and reduce the time to train. The judges were impressed by the impactful use of deep learning to break down language barriers and by the technical implementation, which used distributed training.
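To make the idea of pairing images with captions concrete, here is a minimal sketch of a CLIP-style symmetric contrastive loss in plain Python. It is not the winning project's code, which this post does not show. The embeddings are hypothetical stand-ins for the outputs of an image encoder and a multilingual text encoder such as XLM-RoBERTa, and the temperature value is only illustrative.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    peak = max(row)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_z for x in row]


def clip_style_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over an image-by-caption similarity matrix.

    Pair i (image i, caption i) is the correct match; every other
    combination in the batch serves as a negative example.
    """
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    n = len(images)
    # logits[i][j]: temperature-scaled cosine similarity of image i and caption j
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    # Image-to-text direction: each row should put its mass on the diagonal.
    image_to_text = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: the same check on the columns.
    columns = [list(col) for col in zip(*logits)]
    text_to_image = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return (image_to_text + text_to_image) / 2


# Hypothetical 2-D embeddings: two images and their two captions.
image_embs = [[1.0, 0.0], [0.0, 1.0]]
matched_captions = [[0.9, 0.1], [0.1, 0.9]]   # caption i is close to image i
swapped_captions = [[0.1, 0.9], [0.9, 0.1]]   # captions assigned to the wrong image

loss_matched = clip_style_loss(image_embs, matched_captions)
loss_swapped = clip_style_loss(image_embs, swapped_captions)
```

Because the loss only sees embeddings, a caption in any language that the text encoder maps close to its image contributes the same way. Under this reading, swapping in a multilingual text encoder extends the pairing to multiple languages without changing the objective.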
In second place, we have a project submitted by Remco van Akker. It uses a GAN (Generative Adversarial Network) to generate synthetic retinal image data for medical applications. Synthetic data is used in model training in medical applications to overcome the scarcity of annotated medical data, which is labor-intensive and costly to produce. Synthetic data can be used as part of data augmentation to remove biases and make vision models in medical applications more generalizable. This project stood out because it implemented a generative model on DL1 to solve a real-world problem impacting the application of AI and ML in healthcare.
Rounding out our top three is a project submitted by Zohar Jackson that implements a vision transformer model for semantic segmentation. The project uses the Ray Tune library to tune hyperparameters and Horovod to parallelize training on 16 Gaudi accelerators across two DL1 instances.
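The data-parallel pattern behind this kind of multi-accelerator training can be sketched without any framework. The following is a pure Python simulation, not Horovod's API and not the project's code. It assumes eight accelerators per instance (inferred from the 16 accelerators across two instances mentioned above) and uses a toy one-parameter model; `allreduce_mean` is a hypothetical stand-in for the gradient averaging that a library such as Horovod performs across workers.

```python
NUM_INSTANCES = 2
ACCELERATORS_PER_INSTANCE = 8  # assumption: 16 accelerators / 2 instances
WORLD_SIZE = NUM_INSTANCES * ACCELERATORS_PER_INSTANCE


def gradient(w, samples):
    """Gradient of mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in samples) / len(samples)


def shard_for(rank, data):
    """Give each worker every WORLD_SIZE-th sample, starting at its rank."""
    return data[rank::WORLD_SIZE]


def allreduce_mean(values):
    """Hypothetical stand-in for an allreduce that averages one value per worker."""
    return sum(values) / len(values)


# Toy dataset following y = 3x; 64 samples give each of the 16 workers 4 samples.
data = [(float(x), 3.0 * x) for x in range(1, 65)]
w = 0.0

# Each worker computes a gradient on its own shard only.
local_grads = [gradient(w, shard_for(rank, data)) for rank in range(WORLD_SIZE)]

# Averaging the local gradients reproduces the full-batch gradient,
# so every worker can apply the same update and stay in sync.
averaged_grad = allreduce_mean(local_grads)
full_batch_grad = gradient(w, data)
```

With equal-sized shards, the averaged gradient matches the gradient computed over the whole batch, which is why splitting a batch across more accelerators can raise throughput without changing the update each step applies.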
In addition to the top three winners, participants won several other prizes, including best technical implementation, highest potential impact, and most creative project. We offer our congratulations to all the winners of this hackathon for building such a diverse set of impactful projects on Gaudi accelerator-based EC2 DL1 instances. We can’t wait to see what our participants will continue to build on DL1 instances going forward.
Get started with DL1 instances
As demonstrated by the various projects in this hackathon, you can use EC2 DL1 instances to train deep learning models for use cases such as natural language processing, object detection, and image recognition. With DL1 instances, you also get up to 40% better price/performance for training deep learning models compared to current-generation GPU-based EC2 instances. Visit Amazon EC2 DL1 Instances to learn more about how DL1 instances can accelerate your training workloads.
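As a rough illustration of what the 40% figure could mean for a training budget, the sketch below assumes that "price/performance" means training work completed per dollar, which this post does not define. The baseline cost is a hypothetical number, and because the claim is "up to" 40%, the result is a best case rather than an expected value.

```python
def relative_training_cost(price_performance_gain):
    """Cost of the same training work relative to the baseline.

    Assumes price/performance means work completed per dollar, so a gain
    of 0.40 means each dollar buys 1.4x as much training work.
    """
    return 1.0 / (1.0 + price_performance_gain)


baseline_cost = 1000.0  # hypothetical cost of a training job on the baseline
dl1_cost = baseline_cost * relative_training_cost(0.40)
savings_fraction = 1.0 - dl1_cost / baseline_cost

print(f"Best-case DL1 cost: ${dl1_cost:.2f}")
print(f"Best-case savings: {savings_fraction:.1%}")
```

Under this assumption, a 40% price/performance improvement corresponds to a cost reduction of roughly 29% for the same work, not 40%, because the gain applies to work per dollar rather than to the bill directly.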
About the authors
Dvij Bajpai is a Senior Product Manager at AWS. He works on developing EC2 instances for workloads in machine learning and high-performance computing.
Amr Ragab is a Principal Solutions Architect at AWS. He provides technical guidance to help customers run complex computational workloads at scale.
Shruti Koparkar is a Senior Product Marketing Manager at AWS. She helps customers explore, evaluate, and adopt EC2 accelerated computing infrastructure for their machine learning needs.