AWS Public Sector Blog

82 new or updated datasets available on the Registry of Open Data on AWS

The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on Amazon Web Services (AWS). AWS works with data providers to democratize access to data by making it available to the public for analysis on AWS; develop new cloud-based techniques, formats, and tools that lower the cost of working with data; and encourage the development of communities that benefit from access to shared datasets. Through the AWS Open Data Sponsorship Program, customers are making over 300 PB of high-value, cloud-optimized data available for public use.

All publicly available datasets can be found in the Registry of Open Data on AWS and are now also discoverable on AWS Data Exchange. This quarter, AWS released 82 new or updated datasets.

What are people currently doing with AWS Open Data?

  • Amazon employees are revolutionizing earth observation with geospatial foundation models on AWS utilizing open data. In this post, we explore how Clay Foundation’s Clay foundation model, available on Hugging Face, can be deployed for large-scale inference and fine-tuning on Amazon SageMaker.
  • A tutorial on how to use life sciences data from the AWS Open Data program in Amazon Bedrock. A look at how to use datasets in the Registry of Open Data on AWS with Amazon Bedrock Knowledge Bases. With Amazon Bedrock Knowledge Bases, you can give foundation models (FMs) and agents contextual information from private and public data sources to deliver more relevant, accurate, and customized responses.
  • The AWS Open Data team hosted an Open Data Life Sciences Hackathon from October 1-3, 2025, at Amazon HQ2 in Arlington, Virginia. This was an in-person-only, 3-day hackathon for researchers interested in building knowledge graphs using large publicly available life sciences datasets from AWS Open Data.
  • The POWER Project from NASA provides direct access to its complete datastore in Amazon Simple Storage Service (Amazon S3) buckets in cloud-optimized formats. This datastore and associated access is provided by AWS’s Registry of Open Data and is accessible free of charge to everyone.
  • E11 Bio released a new brain tissue dataset (E11bio PRISM) within the Registry of Open Data on AWS. This new dataset is a key first demonstration of a novel technology that will increase our ability to trace neurons and their connections through complicated brain tissue. This will allow neuroscientists to better understand the wiring of mammalian brains and ultimately revolutionize neuroscience and the treatment of neurological diseases.
  • Interactive access and visualization of geospatial data from the AWS Open Data Program. Access to high-quality geospatial data is no longer limited to technical experts with large computing resources. Thanks to collaborations between open data initiatives such as AWS Open Data, Amazon Sustainability Data Initiative (ASDI), and the Maxar Open Data program, coupled with intuitive tools such as Leafmap and Solara, anyone can explore and visualize critical Earth data in minutes.
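
Several of the items above rely on the same basic step: reading open data directly from a public Amazon S3 bucket. The following is a minimal sketch of that step using only the Python standard library. It assumes the bucket allows anonymous listing (not every open data bucket does), and the helper names `build_list_url`, `parse_listing`, and `list_public_bucket` are hypothetical names chosen for this illustration. The bucket name and AWS Region for a given dataset should be taken from its page in the Registry of Open Data on AWS.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# XML namespace used by Amazon S3 list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def build_list_url(bucket, region, prefix="", max_keys=10):
    """Build an unsigned ListObjectsV2 URL (virtual-hosted style)."""
    query = urlencode({
        "list-type": "2",       # selects the ListObjectsV2 API
        "prefix": prefix,       # only keys that start with this prefix
        "delimiter": "/",       # group deeper keys into "folders"
        "max-keys": str(max_keys),
    })
    return f"https://{bucket}.s3.{region}.amazonaws.com/?{query}"


def parse_listing(xml_data):
    """Return (object keys, common prefixes) from a ListObjectsV2 response."""
    root = ET.fromstring(xml_data)
    keys = [el.text for el in root.findall("s3:Contents/s3:Key", S3_NS)]
    prefixes = [
        el.text for el in root.findall("s3:CommonPrefixes/s3:Prefix", S3_NS)
    ]
    return keys, prefixes


def list_public_bucket(bucket, region, prefix="", max_keys=10):
    """List one page of a public bucket without AWS credentials.

    Assumption: the bucket policy permits anonymous s3:ListBucket.
    """
    url = build_list_url(bucket, region, prefix, max_keys)
    with urlopen(url, timeout=30) as response:
        return parse_listing(response.read())
```

For example, `list_public_bucket("example-open-data-bucket", "us-east-1")` would return the top-level object keys and folder-like prefixes, where `example-open-data-bucket` is a placeholder and not a real dataset. For larger workloads, an AWS SDK configured for unsigned requests is usually a better fit than hand-built URLs.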

What will you build with these datasets?

E11 Bio PRISM

We are excited to announce the release of E11 Bio’s brain tissue dataset on AWS as part of the Registry of Open Data on AWS (E11bio PRISM). This novel dataset from E11 Bio, a nonprofit Convergent Research Focused Research Organization (FRO) in collaboration with the Francis Crick Institute, Massachusetts Institute of Technology (MIT), and the Max Planck Institute, contains light microscopy images and the traced paths of individual neurons. The publication of this dataset is a key first demonstration of a novel technology that will increase our ability to trace neurons and their connections through complicated brain tissue. This will allow neuroscientists to better understand the wiring of mammalian brains and ultimately revolutionize neuroscience and the treatment of neurological diseases.

E11 Bio joins 81 other new or updated datasets on the Registry of Open Data in the following categories.

Climate and weather

Geospatial

Life sciences

Machine learning

How can you make your data available?

Looking to make your data available? The AWS Open Data Sponsorship Program covers the cost of storage for publicly available high-value, cloud-optimized datasets. We work with data providers who seek to:

  • Democratize access to data by making it available for analysis on AWS
  • Develop new cloud-native techniques, formats, and tools that lower the cost of working with data
  • Encourage the development of communities that benefit from access to shared datasets

Learn how to propose your dataset to the AWS Open Data Sponsorship Program.

Learn more about open data on AWS.