AWS Training and Certification Blog
How a trio of women triumphed in tech hackathon
Today the world celebrates International Women’s Day, which is dedicated to honoring the achievements of women worldwide and advocating for gender equality. This is the perfect opportunity to share a story that truly embodies the spirit of this significant occasion. This blog shares how, in the traditionally male-dominated tech industry, a trio of women embarked on a remarkable journey at the recent Tech to the Rescue hackathon. We hope it inspires women in tech to get involved in initiatives, like those offered through Tech to the Rescue, that let them lend their time and talents to causes that make a positive impact in their community and the world.
About the hackathon
Tech to the Rescue helps tech companies make an impact by matching them with ambitious nonprofits to boost their causes with technology. This past December, Tech to the Rescue collaborated with AWS to run the Air Quality Hackathon. The hackathon challenged participants to create cutting-edge technology to combat air pollution, the world’s “silent killer.” The event attracted 240 engineers and innovators, supported by more than 50 tech mentors, from five continents and 27 countries. Together, they built 33 solutions for seven air pollution challenges, with $30,000 dedicated to support and donations.
Our motivation
I’m part of the Psychometrics and Data Science group within AWS Certification, and my job is to create valid and reliable AWS Certification exams for thousands of test-takers.
In mid-November 2023, I learned about the Air Quality Hackathon and I felt a strong desire to participate. Soon after, I registered for the event as the captain of the ‘Psychometrician’ team with two colleagues, Vinita Talreja and Ye (Cheryl) Ma. None of us had ever participated in a hackathon before, but we were excited to give it a try!
December is the busiest time of year for our team and I vividly remember asking my manager, Anjali Weber, for permission to devote two-and-a-half days to this event. She expressed concerns about our end-of-year deliverables and workload. Moreover, psychometrics is a distinct field with minimal overlap with the typical work of developers, data engineers, and data scientists at tech companies, who are more accustomed to the types of challenges this event presented. This meant that our chances of delivering a successful solution were probably slim. But inspired by the AWS leadership principle of ‘learn and be curious’, I assured my manager, “We are not going to win, but this is a great opportunity for the team to think outside the box, engage with a larger community, and learn things that could potentially impact a broader audience.” She responded, “You are all superwomen. You have my full support!”
The challenge
We selected the seventh challenge, presented by the Centre for Research on Energy and Clean Air (CREA) in Poland. Our task was to create a supervised machine learning (ML) model that could accurately predict nitrogen oxides (NOx) emissions from 11 power plants in Taiwan, using values provided by CREA as the ground truth. The possible predictor values were embedded in 14,877 satellite images collected between March 1, 2019, and September 30, 2023. Each image spanned 39,042 longitude and latitude coordinates, 11 of which corresponded to the power plant locations CREA is monitoring. In total, the data amounted to 65,395,350 data points.
Modeling such a vast dataset presented considerable challenges, further compounded by the fact that for any given plant on any day within this roughly four-and-a-half-year window, predictor values might be missing from the satellite data due to weather conditions. This introduced an additional challenge to data mining, the crucial and most time-consuming step in developing an ML model.
The technical journey
Facing this daunting data challenge, we first laid out our analysis strategy. We calculated the distance between each power plant and each grid coordinate using the Haversine formula. We then used that distance to keep only the data within a 50-mile radius of the plants. This approach not only dramatically reduced the size of the data used in the modeling from 65,395,350 to 106,840 data points, but also creatively and reasonably imputed the missing predictor values for any day at any of those 11 power plants.
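To make the distance-based filtering step more concrete, here is a minimal Python sketch of the Haversine calculation and a 50-mile filter. It is an illustration only, not our hackathon code: the function names, the coordinate tuples, and the Earth-radius constant are assumptions. The neighborhood_mean helper shows one plausible way to fill in a missing value from nearby grid cells, which is also an assumption rather than a description of what we actually did.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def cells_within_radius(plant, cells, radius_miles=50.0):
    """Keep only the (lat, lon) grid cells within radius_miles of the plant."""
    return [
        cell for cell in cells
        if haversine_miles(plant[0], plant[1], cell[0], cell[1]) <= radius_miles
    ]


def neighborhood_mean(values):
    """Hypothetical imputation: average the non-missing values (None = missing).

    Returns None when every value in the neighborhood is missing.
    """
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Illustrative coordinates only, not actual plant or grid locations.
plant = (24.0, 120.5)
grid = [(24.0, 120.5), (24.3, 120.6), (25.5, 121.5)]
nearby = cells_within_radius(plant, grid)
```

In this toy example, the third grid cell lies roughly 100 miles north of the plant and is dropped, while the first two are kept.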
Data scaling and transformation is another critical step in building a successful ML model. We thoroughly examined the distribution of each possible predictor, evaluated its relationship with the NOx emission values we needed to predict, and selected, for each predictor, the transformation technique that yielded a distribution closest to normal.
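One simple way to compare candidate transformations for a predictor is to check which one brings the skewness closest to zero. The sketch below assumes non-negative predictor values, a small hypothetical set of candidate transformations, and absolute skewness as the selection criterion. None of these choices are taken from our actual pipeline, and skewness alone is only a rough proxy for normality.

```python
import math


def skewness(xs):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


# Hypothetical candidates; all of them require non-negative inputs.
CANDIDATES = {
    "identity": lambda x: x,
    "sqrt": math.sqrt,
    "log1p": math.log1p,
}


def best_transform(xs):
    """Name of the candidate whose output has the smallest absolute skewness."""
    scores = {
        name: abs(skewness([f(x) for x in xs]))
        for name, f in CANDIDATES.items()
    }
    return min(scores, key=scores.get)
```

A strongly right-skewed predictor would typically favor the log transformation here, while an already symmetric one would be left unchanged.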
Finally, instead of training a single ML model, we opted for an AutoML approach (using a Python package developed by Amazonians) with Amazon SageMaker to determine the relationships between NOx emission values and predictor variables. With all these steps, our AutoML model achieved an R-squared value of 0.9079, a highly promising result. Trained rigorously as researchers, we managed, even under an extremely tight timeline, to document every step of the project and record a 20-minute demo to help hackathon judges and mentors understand the rationale behind the process and evaluate the outcome. After more than 36 hours of racing against the clock, we submitted our results and waited with a quiet sense of hope.
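For readers less familiar with the R-squared metric mentioned above, it measures the share of variance in the observed values that the model's predictions explain: 1 means a perfect fit, and 0 means the model does no better than always predicting the mean. Below is a minimal sketch of the standard formula in Python. The function name and the sample numbers are made up for illustration and are not our hackathon data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1 - ss_res / ss_tot


# Made-up observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(observed, predicted)
```

Note that the formula is undefined when all observed values are identical, because the total sum of squares is then zero.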
Beyond the competition
Around 7 a.m. EST on the last day of the hackathon, we received the notification that our solution had excelled in the competition: we won first place in one of the seven challenges!
After a 30-minute Q&A session with the judges and mentors that morning, this victory advanced us to the final round, where we were selected as one of the top three finalists and awarded $1,000.
Although we did not secure the top prize in the end, we were told that it was a tough call and the judges were impressed with our prototype’s ability to accurately estimate NOx pollution from power plants, highlighting its promise and potential impact.
A creative win
While still excited about the outcome, we received another happy surprise: our team also triumphed in the team photo competition. During the 36-hour intensive coding and documenting session, we squeezed in time to use an AI tool to create a photo depicting us as three happy lady coders with our lovely pets. This win earned us another $1,000 prize, which we donated entirely to our task sponsor, CREA.
As victors, we were also allowed to adopt an animal through WWF UK and we all picked penguins. Winning this competition showcases a unique angle that women in tech can bring to the community—blending creativity with compassion and deep empathy for the world around us.
Inspiring women in tech
Our journey through the Air Quality Hackathon serves as a powerful testament to the impact that women in tech can have, not only within the industry but also in addressing societal and environmental challenges. We went into the event excited but unsure how far our skills might take us. Still, we were willing to be good students and stay open and curious throughout the process. Our diversity, collaboration, and deep analytical thinking led us to success and a strong sense of pride in the outcome of our hard work.
We hope our experience is a beacon to inspire women in technology to contribute their skills to causes that matter. Your talent and ideas make a difference! As we continue to break barriers and support one another, we hope you’ll join us in using technology for the greater good. Together, we have the power to change the world!