Posted On: Apr 16, 2020
30 new or updated AWS Public Datasets from Ford, the Allen Institute, Howard Hughes Medical Institute Janelia, the National Cancer Institute, and others are now available in the following categories:
Life Sciences
- 14 new genomic datasets are provided by the National Institutes of Health under the STRIDES Initiative
- COVID-19 Open Research Dataset (CORD-19) from the Allen Institute for Artificial Intelligence (AI2)
- University of British Columbia Sunflower Genome Dataset from the University of British Columbia
- iHART Whole Genome Sequencing Data Set from Stanford University
- stdpopsim species resources from the University of Oregon
- Variant Effect Predictor with Loss of Function Transcript Effect Estimator Plugin from Privo
- Fly Brain Anatomy: FlyLight Gen1 and Split-GAL4 Imagery from Howard Hughes Medical Institute Janelia Research Campus
- Cell Organelle Segmentation in Electron Microscopy from Howard Hughes Medical Institute Janelia Research Campus
- Allen Institute Mouse Brain Atlas from the Allen Institute for Brain Science
- FastMRI from NYU Langone Center
Geospatial
- Geosnap Neighborhood Analysis Datasets from the University of California, Riverside (UCR)
- National Agriculture Imagery Program (NAIP) has been updated with the latest available imagery
Machine Learning
- Ford Multi-AV Seasonal Dataset from the Ford Motor Company
Sustainability
- Multi-scale Ultra-high Resolution (MUR) Sea Surface Temperature (SST) Analysis from the Farallon Institute
- Water-Column Sonar Data Archive from NOAA
- Himawari-8 managed by NOAA
The AWS Public Dataset Program covers the cost of storage for publicly available, high-value, cloud-optimized datasets. We work with data providers who seek to:
- Democratize access to data by making it available for analysis on AWS.
- Develop new cloud-native techniques, formats, and tools that lower the cost of working with data.
- Encourage the development of communities that benefit from access to shared datasets.
Learn how to propose your dataset to the AWS Public Dataset Program.