Posted On: Jul 15, 2020
Twenty-three new or updated Amazon Web Services (AWS) public datasets from the National Center for Biotechnology Information, Johns Hopkins University, University of Texas at Southwestern, National Oceanic and Atmospheric Administration (NOAA), the National Cancer Institute, National Herbarium of New South Wales, and others are now available in the following categories:
COVID-19 response:
- COVID-19 Molecular Structure and Therapeutics Hub from the Molecular Sciences Software Institute
- COVID-19 Genome Sequence Dataset from the National Center for Biotechnology Information
Life sciences:
- Cloud Genomic Indexes from Johns Hopkins University and the University of Texas at Southwestern
- Refgenie Genomic Assets from University of Virginia
- Gabriella Miller Kids First Pediatric Research Program from the National Cancer Institute
- The Cancer Genome Atlas from the National Cancer Institute
- Basic Local Alignment Search Tool (BLAST) Databases from the National Library of Medicine
- National Herbarium of New South Wales from the Royal Botanic Gardens and Domain Trust
Meteorological:
- National Blend of Models from NOAA
- National Digital Forecast Database from NOAA
- NEXRAD Level 3 from NOAA, managed by Unidata
- Storm EVent ImageRy (SEVIR) from the Massachusetts Institute of Technology
- RAPID NRT flood maps from the Eversource Energy Center at the University of Connecticut
- Tracking the Sun from the National Renewable Energy Laboratory
- US Wave dataset from the National Renewable Energy Laboratory
Geospatial:
- New Jersey Statewide Digital Aerial Imagery Catalog and LiDAR data from the New Jersey Office of Information Technology
- Crowd Sourced Bathymetry (CSB) from NOAA
- Prefeitura Municipal de São Paulo (PMSP) LiDAR Point Cloud from GeoSampa
- Sentinel-3 from Meteorological Environmental Earth Observation
Machine learning:
- RarePlanes from CosmiQ Works
- Multilingual Amazon Reviews Corpus from Amazon
- Answer Reformulation by Alexa Shopping
- Humor Detection from Product Question Answering Systems by Alexa Shopping
The AWS Public Dataset Program covers the cost of storage for publicly available, high-value, cloud-optimized datasets. We work with data providers who seek to:
- Democratize access to data by making it available for analysis on AWS
- Develop new cloud-native techniques, formats, and tools that lower the cost of working with data
- Encourage the development of communities that benefit from access to shared datasets
Learn how to propose your dataset to the AWS Public Dataset Program