Bridging research and HPC to tackle grand challenges

Today we announced the AWS Impact Computing Project at the Harvard Data Science Initiative (HDSI) to identify potential solutions that can improve the lives of humans, other species, and natural ecosystems.

Technology and innovation are transforming the world at an unimaginable pace – changing society and economies, curing disease, and fundamentally re-shaping the way we live.

And yet, much remains to be done to combat global inequities, address climate change, ensure food security, and better anticipate global health crises. Thanks to advances in science and technology, we now deeply understand the velocity with which infectious diseases and pandemics can spread – supercharged because of climate change and globalization. For example, we know that a pathogen can travel from a remote village to major cities on all continents in 36 hours.

These are grand challenges. They are complex, highly interdependent, and dynamic. The solutions to these challenges must integrate science, engineering, and technology with policy, culture, and geopolitics.

What is this collaboration about?

I have spent my entire career thinking about the role HPC, and now AI, can play in solving the most vexing problems facing us. For the first time, because of the scale and capability of AWS, I can see a path towards meaningful progress. Scale matters – but understanding the depth and complexity of the use cases matters too. Tapping into the brightest minds matters, but asking the right questions perhaps matters the most.

That’s why I’m thrilled about our collaboration with HDSI. The basic premise of the initiative is that fundamental gaps in understanding the problem space, coupled with a lack of accessible computational power and algorithms, have stifled progress.

It’s easy to think the solution is merely more compute. Unfortunately, it’s not. Important challenges in science, analytic methodologies, data, and accessibility must be addressed along the way.

Together, AWS and HDSI will engage in deep, cross-disciplinary data-science research to strengthen and expand our understanding of the problem space. We’ll leverage those insights to optimize and enhance our HPC and AI service portfolio to better support these unmet needs.

This is a classic example of what Amazonians do daily – working backwards from the needs of our customers. In this case, the requirements are coming from organizations working on large social and global challenges. Each of these areas, from climate change, sustainability, and food security, to drug discovery for orphan diseases and ensuring equitable healthcare, requires access to large shared data sets and easy access to significant computational power. Getting these resources into the hands of decision makers is vital. Our goal is to make data and analysis accessible to anyone, from field workers to policymakers, enabling them to deepen their understanding of the issues and make more informed decisions.

This collaboration will open a whole new area of impact-specific solutions and build the capacity for sustainable change.

How does the collaboration work?

One of the early projects we are exploring is an effort to predict, with reasonable accuracy, the maize yield in Africa in the context of climate change, driven by extreme heat waves. Fundamental gaps exist around this problem, ranging from a lack of high-resolution geospatial data for estimating land use to a lack of algorithms and methodologies for combining land use with historical weather and climate data to predict crop yield.

To help with this, we plan to bring together the research methodologies from HDSI, the massive historical climate data from the UK Met Office, and geo-spatial data on land use from the Amazon Sustainability Data Initiative (ASDI) to refine and optimize the crop yield predictions.

These models could potentially be used by organizations like the World Food Programme or the African Development Bank to plan an effective response. A key focus will be to ensure the tools and methods we develop are easily accessible and usable by the people who need this local data to make decisions daily. From a technology infrastructure perspective, this project will involve developing data platforms to integrate large and heterogeneous datasets for climate, pollution, and environmental observations with high performance computing infrastructure so we can model and simulate future-state scenarios based on the research algorithms.
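To make the data-integration idea a little more concrete, here is a toy sketch in Python. It is not the project's method: every name, threshold, and number below is a hypothetical placeholder, and the data is synthetic. It only illustrates the general shape of the task, which is joining land-use and weather records on a shared grid cell, keeping the cells where maize is actually grown, and fitting a simple line relating heat exposure to yield.

```python
from dataclasses import dataclass

# All values below are synthetic and purely illustrative (assumption).
MIN_MAIZE_FRACTION = 0.2  # hypothetical cutoff for "maize-growing" cells


@dataclass
class CellRecord:
    """One grid cell for one season; field names are hypothetical."""
    cell_id: str
    maize_fraction: float  # share of the cell under maize (land-use data)
    heat_days: float       # days above a heat threshold (weather data)
    yield_t_per_ha: float  # observed yield in tonnes per hectare


def join_by_cell(land_use, weather, yields):
    """Inner-join three dicts keyed by cell id into a list of records."""
    shared = sorted(set(land_use) & set(weather) & set(yields))
    return [CellRecord(c, land_use[c], weather[c], yields[c]) for c in shared]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic inputs: cell E has almost no maize, cell F has no weather record.
land_use = {"A": 0.6, "B": 0.4, "C": 0.5, "D": 0.3, "E": 0.05, "F": 0.7}
weather = {"A": 5, "B": 10, "C": 20, "D": 30, "E": 12}
yields = {"A": 4.5, "B": 4.0, "C": 3.0, "D": 2.0, "E": 3.8, "F": 4.1}

# Keep only cells present in every dataset and with enough maize cover.
records = [
    r for r in join_by_cell(land_use, weather, yields)
    if r.maize_fraction >= MIN_MAIZE_FRACTION
]

intercept, slope = fit_line(
    [r.heat_days for r in records],
    [r.yield_t_per_ha for r in records],
)


def predict(heat_days):
    """Predicted yield for a given number of heat days under the toy model."""
    return intercept + slope * heat_days


print(f"yield ≈ {intercept:.2f} + ({slope:.3f}) * heat_days")
```

A real model would need far richer inputs and methods than a single straight line, but even this toy shows why the join matters: cells missing from any one dataset drop out of the analysis entirely, which is one reason gaps in high-resolution data limit what can be predicted.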

While this example provides a glimpse into possible outcomes and associated impact, there’s also an equally important technology innovation aspect to this. Technology is a key part of most research projects, especially the ones based on analyzing and making sense of huge data sets, or others that require complex multivariate analyses to infer the impact of hundreds of variables on a single event. The dependence on technology is so prevalent in these research projects that researchers may limit their analysis, simulation, modeling, and investigations based on the availability of technology, in terms of either capacity or capability. In either case, the net effect turns technology infrastructure into a blocker of scientific progress rather than a catalyst for it.

Necessity is the mother of invention

Every time a researcher is blocked by lack of capability, we see an opportunity to innovate.

Imagine the complexity of creating a digital twin of the earth, or a true simulation of the global economy considering the tens of thousands of variables that can affect these models. We fully expect these large, complex problems to stretch aspects of our existing products and services. Some of these will push us to develop new features or, in some cases, design and develop new services. Thinking big drives us.

Every time a researcher limits their investigation because they do not have access to data, or infrastructure capacity, or the economic resources to procure expensive computing equipment, there’s an opportunity to use the power of the cloud. High-performance computing in the cloud democratizes access to data and infrastructure capacity.

Putting powerful tools in the hands of the right people

Getting the right data, at the right level of detail, to the people who need it is critical to bending the curve on large problems like food security. Ample access to the right computing infrastructure means that you can offer weather, soil moisture, or other geospatial data to maize farmers at a per-farm level, instead of at an aggregate scale for the district or region.

We know that success and scale bring broad responsibility. AWS and HDSI both know that what we do impacts the world. This collaboration is about leaving things better than we found them.

Debra Goldfarb

Debra is Director of HPC Products and Strategy for AWS, leading the company’s efforts to broaden the scope and portfolio of HPC services and capabilities. Prior to AWS, she was an Intel Fellow, and Chief Analyst for Intel’s Data Platform Group’s Strategy Organization. Previously, she spearheaded Intel’s efforts in expanding the use of HPC into new markets, and was instrumental in defining strategy and pathfinding for Intel’s Technical Computing Group. Prior to Intel, Debra led strategy and evangelism for Microsoft’s Technical Computing organization, driving several high-profile collaborations including the Gates Foundation’s Malaria initiative. Debra started her career as an industry analyst at IDC, launching its High-Performance Computing (HPC) practice, founding the HPC User Forum and its Life Sciences practice, and launching the Bio-IT business portfolio. Her work as a leader in HPC helped shape US technology policy and the global HPC ecosystem. Outside of AWS, she is actively involved in diversity and STEM initiatives.