Posted On: Oct 8, 2020
Thirty-two new or updated datasets from the Massachusetts Institute of Technology, the First Street Foundation, Ookla, and others are available on the Registry of Open Data on AWS in the following categories.
COVID-19:
- Folding@home COVID19 Datasets from the Folding@home Consortium
- COVID Hiring Data: US Hiring Rates from Greenwich.HR
Life sciences:
- Genome Aggregation Database (gnomAD) and the UK Biobank Panancestry GWAS Summary Statistics from the Broad Institute
- Ohio State Cardiac MRI Raw Data from the Ohio State University
- Medical Decathlon Segmentation Datasets from the Medical Decathlon Team
- Distributed Archives for Neurophysiology Data Integration (DANDI) from the Massachusetts Institute of Technology
- Oxford Nanopore Technologies Benchmark Datasets from Oxford Nanopore Technologies
- ChEMBL 25 and 27 and Open Targets 2020-06 managed by Amazon Web Services (AWS)
- Updated: Human PanGenomics Project from the Human Pangenome Reference Consortium
Geospatial:
- Low Altitude Disaster Imagery (LADI) from MIT Lincoln Lab
- National Agriculture Imagery Program (NAIP) 2019 data managed by Esri
- Analysis Ready Sentinel-1 Backscatter Imagery managed by Indigo Ag
- Sentinel-2 Cloud-Optimized GeoTIFFs managed by Element 84
- S-111 Surface Water Currents Data from NOAA
- ISS SERVIR Environmental Research and Visualization System (ISERV) managed by Radiant Earth Foundation
- PoroTomo Distributed Acoustic Sensing (DAS) from National Renewable Energy Laboratory
Climate and weather:
- Ozone Monitoring Instrument (OMI) / Aura NO2 Tropospheric Column Density from NASA
- World Ocean Database from NOAA
- Global Ensemble Forecast System Re-forecasts from NOAA
- Space Weather Forecast and Observation Data from NOAA
- Coupled Model Intercomparison Project 6 managed by Pangeo
- Flood Risk Summary Statistics from First Street Foundation
- Department of Energy's Open Energy Data Initiative (OEDI) managed by National Renewable Energy Laboratory
- Weather Radar Data from the Finnish Meteorological Institute
Machine learning:
- Radiant MLHub from the Radiant Earth Foundation
- Japanese Tokenizer Dictionaries from Cotonoha
- Japanese dictionaries and word embeddings for natural language processing from Works Applications
- Automatic Speech Recognition (ASR) Error Robustness from Amazon
- Enriched Topical-Chat Dataset for Knowledge-Grounded Dialogue Systems from Amazon
Networking:
The AWS Open Data Sponsorship Program covers the cost of storage for publicly available cloud-optimized datasets. We work with data providers who seek to:
- Democratize access to data by making it available for analysis on AWS
- Develop new cloud-native techniques, formats, and tools that lower the cost of working with data
- Encourage the development of communities that benefit from access to shared datasets
Learn how to propose your dataset to the AWS Open Data Sponsorship Program
Learn more about Open Data on AWS