AWS Public Sector Blog

Tag: dataset

36 new or updated datasets on the Registry of Open Data: AI analysis-ready datasets and more

This quarter, AWS released 36 new or updated datasets. As July 16 is Artificial Intelligence (AI) Appreciation Day, the AWS Open Data team is highlighting three unique datasets that are analysis-ready for AI. What will you build with these datasets?

Alzheimer’s disease research portal enables data sharing and scientific discovery at scale

The National Institute on Aging Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS DSS), powered by AWS, is a genomic database that provides access to publicly available datasets for Alzheimer’s disease and related neuropathologies. Created to make Alzheimer’s genetics knowledge more accessible to researchers, NIAGADS has genomics data on 172,701 samples from 98 datasets and is now 1.3 petabytes (PB) in total size. NIAGADS is creating a system that promotes scientific discovery through data sharing with a large cadre of institutions.

Creating access control mechanisms for highly distributed datasets

Security is priority number one at AWS. Data stored in Amazon Simple Storage Service (Amazon S3) is private by default. However, some datasets are made to be shared. In this blog post, we cover the no-cost mechanisms data providers can utilize to create access control policies for their highly distributed open datasets.

Visualizing donor data with Amazon QuickSight

Data is an invaluable asset in the world of nonprofits. In this blog post, we offer a technical walkthrough to learn how nonprofits of all sizes can use Amazon QuickSight to quickly create interactive dashboards with the help of machine learning, providing a self-service way to effectively consume and analyze data without writing any code or having to worry about infrastructure.

The ERA5 Reanalysis Dataset Provides a Sharper View on Past Weather

Reanalysis is the term for using modern-day technology to analyze weather data from the past. By doing so, meteorologists and climatologists can produce a more accurate analysis of previous weather conditions, which is important for climate change research. The European Centre for Medium-Range Weather Forecasts (ECMWF) is producing its latest reanalysis dataset, called ERA5. Recently, Chris Kalima and his team at Intertrust, in conjunction with the AWS Public Datasets Program, have been working to bring the ERA5 data to AWS.

Hubble Space Imagery on AWS: 28 Years of Data Now Available in the Cloud

Since going live in 1990, the Hubble Space Telescope has delivered groundbreaking images to broaden our understanding of the universe. Each image captured by the telescope is archived and made publicly available, free of cost, by NASA through the Space Telescope Science Institute (STScI). The Hubble images archive is used by a global community of astronomers, researchers, and engineers and has led to the discovery of distant galaxies and nebulae. “The legacy is a treasure trove of data that can be mined in the future,” Arfon Smith, head of data science at STScI, said.

How the Nonprofit Open Data Collective Came Together to Work on IRS 990 Data in the Cloud

Form 990 is used by the United States Internal Revenue Service (IRS) to gather financial information about nonprofit organizations. In July 2016, the IRS started making electronic IRS 990 filing data available via the AWS Public Datasets Program. By making electronic 990 filing data available in this way, the IRS made it possible for anyone […]

Announcing USAspending.gov on an Amazon RDS Snapshot

The Digital Accountability and Transparency Act of 2014 (DATA Act) aims to make government agency spending more transparent to citizens by making financial data easily accessible and by establishing common standards for the data all government agencies collect and share on the government website, USAspending.gov. We are pleased to announce that the USAspending.gov database is now […]

LiDAR Data for Washington DC is Available as an AWS Public Dataset

LiDAR point cloud data for Washington, DC, is available for anyone to use on Amazon Simple Storage Service (Amazon S3). This dataset, managed by the District of Columbia’s Office of the Chief Technology Officer (OCTO), with the direction of OCTO’s Geographic Information System (GIS) program, contains tiled point cloud data for the entire District along […]