AWS Public Sector Blog

Purdue University democratizes geospatial data through AWS Open Data Sponsorship Program

Research depends on access to vast amounts of data and the right tools to analyze it, but the sheer scale of geospatial information produced by researchers can create significant barriers to cross-disciplinary collaboration. Addressing this challenge is the mission of Purdue University’s Data to Science Initiative (D2S). With this program, researchers across disciplines can share and access a unified collection of geospatial datasets from around the world.

D2S is pleased to announce acceptance into the Amazon Web Services (AWS) Open Data Sponsorship Program. The D2S collection of geospatial data is now available on the Registry of Open Data on AWS. This collaboration represents a significant step forward in democratizing access to geospatial data for researchers worldwide, facilitating breakthrough discoveries in agriculture, forestry, transportation, and beyond.

The AWS Open Data Sponsorship Program covers the cost of storage for publicly available high-value cloud-optimized datasets as well as the data transfer costs for end users accessing the data.

Strengthening research communities through education

This AWS collaboration with Purdue extends beyond data hosting. AWS recently participated in and helped sponsor Purdue Geographic Information Systems (GIS) Day 2025: Unlocking GeoAI Data and Tools, where we presented to faculty, students, and researchers about the value of cloud technology in the geospatial space. The presentation showcased how AWS services and programs, including AWS Open Data, support the democratization of data, and it highlighted the various ways AWS users can ingest, process, and analyze geospatial data efficiently.

Storage and access scalability through AWS

“AWS is the industry leader when it comes to public cloud services, providing the most comprehensive and reliable cloud services worldwide,” said Jinha Jung, associate professor at the Lyles School of Civil and Construction Engineering at Purdue.

“If we want to develop an ecosystem that is scalable and accessible worldwide, AWS will be the best choice in my opinion. We are deeply grateful to be accepted into the AWS Open Data Sponsorship Program, the result of a competitive selection process whose notable awardees include the EPA, NASA, NOAA, USGS, and the Indiana Geographic Information Office.”

At the time of writing, D2S data is hosted on servers managed by Purdue, which has sufficient computational resources to provide D2S services. As the D2S user community grows, Jung envisions that the on-premises services might reach a tipping point where more scalable computing infrastructure is required.

“This is where AWS Open Data program will be very valuable,” he said. “AWS works with data providers to democratize access to data by making it available for analysis on AWS; develop new cloud-native techniques, formats, and tools that lower the cost of working with data; and encourage the development of communities that benefit from access to shared datasets. Through the program, AWS has democratized access to petabytes of data, including satellite imagery, climate and weather data, genomic data, and data used for natural language processing. The full list of publicly available datasets is available on the Registry of Open Data on AWS.”

What makes D2S unique

D2S initially focused on unmanned aerial vehicle (UAV) data for crops and forestry. UAV data and other Earth observation data are becoming increasingly important across disciplines such as forestry, agriculture, transportation, and environmental science. With these data, researchers can assess crop health, monitor forest canopy changes, analyze urban development patterns, and track environmental impacts with unprecedented spatial and temporal resolution. Unlike general data-sharing services, D2S is specifically designed to manage and share UAV data.

At the time of writing, D2S hosts critical datasets including the United States Department of Agriculture (USDA) Wheat Coordinated Agricultural Project (WheatCAP), which contains UAV data from 41 wheat breeders and researchers across 22 institutions in 20 states. D2S also serves the Tippecanoe County Sheriff’s Office for crash scene analysis, Purdue’s Agriculture and Natural Resources Extension, and urban 3D mapping projects at the Institute for Digital Forestry at Purdue University.

Aligning with national priorities and AI innovation

D2S aligns with White House Office of Science and Technology Policy mandates on openness in the scientific enterprise, which dictate that federally funded research and supporting data be disclosed to the public at no cost. As Jung notes, “Advances in artificial intelligence are based on enormous amounts of training data. These large-scale, high-quality D2S datasets hold enormous potential to help unlock new AI-powered frontiers in many disciplines.”

This collaboration exemplifies how cloud technology can transform research infrastructure, moving from individual efforts to a collective, community-driven approach. This is perfectly aligned with the Purdue Computes initiative, which focuses on the intersection of AI, computing, and the physical world.

Join the AWS Open Data Sponsorship Program community

The AWS Open Data Sponsorship Program has democratized access to over 300 petabytes of data, including satellite imagery, climate and weather data, genomic data, and natural language processing datasets. We invite you to explore the full Registry of Open Data on AWS and discover how open data can accelerate your research.

Ready to democratize your data? Learn more about the AWS Open Data Sponsorship Program and how we can help you share your datasets with the global research community.

David Conklin


David is a Solutions Architect who joined Amazon Web Services (AWS) in 2022. He is passionate about Geographic Information Systems (GIS), Cloud Geospatial, and Data Engineering. Outside of work, David enjoys traveling, playing soccer, and learning about the latest technology trends.

Chris Stoner


Chris is the open environmental and geospatial data lead for the AWS Open Data team. Chris was previously the lead product manager for AWS Ground Station, developing “antennas as a service” for space customers. Chris also worked as a NASA contractor at the Alaska Satellite Facility (ASF) Distributed Active Archive Center (DAAC), developing architectures for Sentinel-1 and NISAR missions in the cloud. Chris has an MBA from the University of Massachusetts – Amherst and a bachelor’s degree in IT from the University of Massachusetts – Lowell. Chris is a published author of technical journal articles and holds several patents.

Brian DeKemper


Brian is an enterprise account executive at AWS, supporting research university customers in the Great Lakes region. He’s spent over two decades working in technology and services companies that focus on the higher education market. Outside of work, he enjoys skiing and traveling with his family.