Posted On: Apr 11, 2019
18 new or updated AWS Public Datasets are now available in the following categories:
Astronomy:
- Epoch of Reionization Radio Astronomy Dataset from the University of Washington
- LOFAR ELAIS-N1 Cycle 2 Observations Radio Astronomy Dataset from the Institute for Astronomy, University of Edinburgh
Biology:
- ZINC15 3D Molecular Docking Models from John Irwin
- Genome Ark from the Vertebrate Genomes Project
- Encyclopedia of DNA Elements (ENCODE) Dataset from the ENCODE Data Coordinating Center
- Human PanGenomics Project from the University of California, Santa Cruz
Disaster Response:
- Sentinel-1 Single Look Complex (S1 SLC) dataset for South Asia, Southeast Asia, Taiwan, and Japan from Nanyang Technological University in Singapore
- Open Earthquake Early-Warnings (OpenEEW) from Grillo
Encyclopedic:
- Software Heritage Graph Dataset from Software Heritage
Environmental:
- Wind Integration National Dataset (WIND) from the U.S. National Renewable Energy Laboratory (NREL)
- National Solar Radiation Data Base from the U.S. National Renewable Energy Laboratory (NREL)
- eBird Status and Trends Model Results from the Cornell Lab of Ornithology
- Africa Soil Information Service (AfSIS) Soil Chemistry from Quantitative Engineering Design
Machine Learning:
- The Massively Multilingual Image Dataset from the University of Pennsylvania has been expanded to include data from 98 languages.
- ParaCrawl from the Broader Web-Scale Provision of Parallel Corpora for European Languages project
Meteorological:
- Global Forecast System (GFS v2.0 & v3.0) from NOAA
- Météo-France Models from OpenMeteoData
Regulatory:
- IRS 990 Filings in Spreadsheets from Applied Nonprofit Research
The AWS Public Dataset Program covers the cost of storage for publicly available, high-value, cloud-optimized datasets. We work with data providers who seek to:
- Democratize access to data by making it available for analysis on AWS.
- Develop new cloud-native techniques, formats, and tools that lower the cost of working with data.
- Encourage the development of communities that benefit from access to shared datasets.
Modified 12/9/2021 – To ensure a great experience, expired links in this post have been updated or removed.