AWS Public Sector Blog

The Registry of Open Data on AWS surpasses 1,000 datasets

Unlocking Innovation: A growing community of data providers is making the world’s most impactful datasets freely available for research, discovery, and innovation

We’re excited to announce that the Registry of Open Data on Amazon Web Services (AWS) has passed a major milestone: more than 1,000 datasets. Today, 1,122 datasets are freely available to anyone. In the last two years, the registry has more than doubled in size, growing from 556 datasets to 1,122, a 102% increase.

With the Registry of Open Data, our goal was straightforward: remove the barriers that prevent researchers, developers, and innovators from accessing high-value datasets. Today, that vision is thriving, powered by a global community of data providers spanning government agencies, research institutions, nonprofits, and private organizations.

What is the Registry of Open Data on AWS?

The Registry of Open Data on AWS simplifies finding and accessing datasets on AWS that are available for anyone to use. These datasets are hosted on AWS infrastructure, meaning users can analyze them in the cloud without needing to download massive files or manage their own storage. Whether you’re a climate scientist, a genomics researcher, or a machine learning engineer, the registry provides a single place to discover datasets that can accelerate your work.
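To make "analyze them in the cloud without downloading" a little more concrete, here is a minimal sketch of listing a few objects in a publicly readable Amazon S3 bucket without credentials, using only the Python standard library. It assumes the dataset's bucket allows anonymous listing, which is not guaranteed for every dataset. The bucket name and Region in the usage note are hypothetical placeholders, and the function names are ours, not part of any AWS SDK. Use the bucket and Region shown on the dataset's registry page.

```python
"""Sketch: list objects in a public S3 bucket with unsigned HTTPS requests."""
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# XML namespace used by Amazon S3 listing responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def build_list_url(bucket, region, prefix="", max_keys=10):
    """Build a ListObjectsV2 URL for a bucket's regional endpoint."""
    query = urlencode(
        {"list-type": "2", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"https://{bucket}.s3.{region}.amazonaws.com/?{query}"


def parse_listing(xml_text):
    """Return (key, size_in_bytes) pairs from a ListObjectsV2 XML response."""
    root = ET.fromstring(xml_text)
    return [
        (
            item.findtext("s3:Key", namespaces=S3_NS),
            int(item.findtext("s3:Size", namespaces=S3_NS)),
        )
        for item in root.findall("s3:Contents", S3_NS)
    ]


def list_public_objects(bucket, region, prefix="", max_keys=10):
    """Fetch one page of object keys and sizes; no AWS credentials are sent."""
    url = build_list_url(bucket, region, prefix, max_keys)
    with urlopen(url, timeout=30) as response:
        return parse_listing(response.read())
```

For example, list_public_objects("example-open-data-bucket", "us-east-1", prefix="2024/") would return up to ten (key, size) pairs if a bucket with that placeholder name existed and allowed public listing. For real analysis work, the AWS SDKs and the AWS CLI handle pagination, retries, and larger transfers.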

A milestone worth celebrating

Reaching 1,122 datasets is more than just a number—it reflects the growing momentum of the open data movement. Here are a few highlights:

  • Diverse domains – The registry spans genomics, satellite imagery, climate and weather, natural language processing, autonomous vehicles, and much more.
  • Over 400 petabytes of data – Collectively, these datasets hold more than 400 petabytes of freely accessible information, hosted on AWS and ready for analysis.
  • Global contributors – Data providers include organizations like the National Oceanic and Atmospheric Administration (NOAA), the National Aeronautics and Space Administration (NASA), the Allen Institute, the National Institutes of Health (NIH), Biohub, and hundreds of others committed to making data open and accessible.
  • Enabling reproducibility – By providing stable, cloud-hosted datasets, the registry helps ensure that scientific research is reproducible and that results can be independently verified.

How the AWS Open Data Program drives innovation

AWS Open Data fuels breakthroughs across industries. Here are just a few examples of what it makes possible:

  • Climate research – Scientists use open weather and satellite datasets to model climate change, predict extreme weather events, and inform policy decisions.
  • Genomics and healthcare – Researchers use datasets like the 1000 Genomes Project and CellxGene Census to advance our understanding of human biology and disease.
  • Machine learning – Developers train and benchmark AI models on open datasets, accelerating progress in computer vision, natural language processing, and beyond.
  • Public policy – Government agencies and civic organizations use open data to improve transparency, drive evidence-based decision-making, and better serve communities.

What’s next?

We’re just getting started. As the open data community continues to grow, we’re committed to improving how these datasets are discovered, accessed, and used. We’re investing in better search and discovery tools, expanding our partnerships with data providers, and working to ensure that the Registry of Open Data on AWS remains a trusted, go-to resource for open data.

If you have a dataset you want to share with the world, we’d love to hear from you. Learn more about Open Data on AWS.

Get started

Ready to explore? Browse the full catalog of datasets within the Registry of Open Data on AWS and start building today.