Posted On: Oct 16, 2018
19 new AWS Public Datasets are now available for researchers and developers interested in life sciences, environmental science, machine learning, multimedia, civic tech, and cyber security.
Life sciences:
- Tabula Muris from the Chan Zuckerberg Biohub
- Cell Painting Image Collection, GATK Test Data, and Broad Genome References from the Broad Institute
Machine learning:
- Image classification, image localization, natural language processing, and COCO datasets from fast.ai
- KITTI Vision Benchmark Suite from the Karlsruhe Institute of Technology
Environmental:
- DWD ICON Global, DWD ICON-EU, and DWD COSMO-D2 weather models from Deutscher Wetterdienst (German National Meteorological Service)
- NOAA Global Ensemble Forecast System provided through the NOAA Big Data Project
- NOAA Operational Forecast System provided through the NOAA Big Data Project
- Downscaled Climate Data for Alaska from the International Arctic Research Center, University of Alaska
Civic:
- IChangeMyCity Complaints Data from the Janaagraha Centre for Citizenship and Democracy
Cyber security:
- Forward DNS ANY Dataset from Rapid7
- A Realistic Cyber Defense Dataset from Canada's Communications Security Establishment and the Canadian Institute for Cybersecurity
Multimedia:
- Xiph.Org Test Media from Xiph.Org
The AWS Public Dataset Program covers the cost of storage for publicly available, high-value, cloud-optimized datasets. We work with data providers who seek to:
- Democratize access to data by making it available for analysis on AWS.
- Develop new cloud-native techniques, formats, and tools that lower the cost of working with data.
- Encourage the development of communities that benefit from access to shared datasets.
Learn how to propose your dataset to the AWS Public Dataset Program.