AWS Public Sector Blog
Tag: datasets
Alzheimer’s disease research portal enables data sharing and scientific discovery at scale
The National Institute on Aging Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS DSS), powered by AWS, is a genomic database that provides access to publicly available datasets for Alzheimer’s disease and related neuropathologies. Created to make knowledge of Alzheimer’s genetics more accessible to researchers, NIAGADS holds genomic data on 172,701 samples from 98 datasets and now totals 1.3 petabytes (PB). NIAGADS is creating a system that promotes scientific discovery through data sharing with a large cadre of institutions.
Largest metastatic cancer dataset now available at no cost to researchers worldwide
The NYUMets team, led by Dr. Eric Oermann at NYU Langone Medical Center, is collaborating with AWS Open Data, NVIDIA, and Medical Open Network for Artificial Intelligence (MONAI) to develop an open science approach that supports researchers in helping as many patients with metastatic cancer as possible. With support from the AWS Open Data Sponsorship Program, the NYUMets: Brain dataset is now openly available at no cost to researchers around the world.
How JDRF uses AWS to power Type 1 diabetes research
Advances in technology are transforming the way health research can be conducted. It is now possible to integrate data from siloed sources into a data lake, a central repository where health data are aggregated and analyzed at scale. Now, more than ever, there are opportunities for collaborative research to accelerate life-saving medical innovation – and that’s exactly what JDRF International, the leading global Type 1 Diabetes research and advocacy organization, is doing with AWS.
Creating access control mechanisms for highly distributed datasets
Security is priority number one at AWS. Data stored in Amazon Simple Storage Service (Amazon S3) is private by default. However, some datasets are made to be shared. In this blog post, we cover the no-cost mechanisms data providers can utilize to create access control policies for their highly distributed open datasets.
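As a rough illustration of what such an access control policy can look like, the sketch below builds a standard Amazon S3 bucket policy document that grants anonymous list and read access to a bucket. It is a minimal example under stated assumptions, not necessarily the mechanism the post describes: the function name `public_read_policy` and the bucket name `example-open-dataset` are hypothetical, and the bucket's Block Public Access settings would also have to permit a public policy before it could take effect.

```python
import json


def public_read_policy(bucket_name: str) -> dict:
    """Build an S3 bucket policy document allowing anonymous list and read.

    The bucket name is supplied by the caller; nothing here contacts AWS.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing keys is a bucket-level action, so it targets the bucket ARN.
                "Sid": "AllowPublicList",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,
            },
            {
                # Reading objects is an object-level action, so it targets bucket/*.
                "Sid": "AllowPublicRead",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-open-dataset" is a hypothetical bucket name used for illustration.
    print(json.dumps(public_read_policy("example-open-dataset"), indent=2))
```

The resulting JSON could then be attached to a bucket through the S3 console or an SDK. Splitting the list and read permissions into two statements reflects that each action applies to a different resource type.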
AWS hosts new open dataset to help businesses identify climate finance risks and investments
Companies and asset managers looking to protect their financial investments from climate change-related risks, and invest in more sustainable solutions, can now access a new dataset on the Amazon Web Services (AWS) Cloud to help inform their decision making. Amazon announced that the Legal Entity Identifier (LEI) dataset is now available and free for anyone to access in the cloud. The dataset includes key reference information that supports clear and unique identification of legal entities participating in financial transactions, and each LEI contains information about an entity’s ownership structure, including ‘who is who’ and ‘who owns whom’.
Celebrate Open Science Week with the Allen Institute and available open datasets
The Allen Institute seeks to understand how our brains, cells, and immune systems work when we are healthy and, ultimately, how they go wrong in disease. Allen researchers have generated and shared atlases that map the brain, gene-edited stem cell lines, and many more resources that have been used by millions of scientists around the world to accelerate their research. In collaboration with AWS and the Registry of Open Data on AWS, they make many of their datasets publicly available. In celebration of Open Science Week, check out some of these open datasets from the Allen Institute, and their impact on the research community.
Satellite imagery over Africa, a large-scale climate ensemble, and product listings with 3D renderings: The latest open data on AWS
The AWS Open Data Sponsorship Program makes high-value, cloud-optimized datasets publicly available on AWS. This quarter, we released 44 new or updated datasets including satellite imagery over Africa, a large-scale climate ensemble, and product listings with 3D renderings. Learn how you can put these open datasets to work.
How the cloud is helping remove barriers to addressing climate change
What if we were to democratize access to data and compute so that anyone, anywhere in the world could contribute to climate science? The Amazon Sustainability Data Initiative (ASDI) seeks to accelerate sustainability research and innovation by minimizing the cost and time required to acquire and analyze large sustainability datasets. ASDI supports innovators and researchers with the data, tools, and technical expertise they need to advance sustainability initiatives. ASDI is committed to making climate-relevant data easier to access and analyze. ASDI’s growing data catalog comprises petabytes of open data.
Improving our knowledge about the oceans by providing cloud-based access to large datasets
As a physical oceanographer focused on remote sensing, Dr. Chelle Gentemann, senior scientist at Farallon Institute, has worked for over 20 years on retrievals of ocean temperature from space. She uses measurements of sea surface temperature from satellites to understand how the ocean impacts our lives. Chelle’s work involves analyzing large volumes of data, which requires access to large data storage and computational resources. Although most large research institutions can secure those IT resources, that is not the case for smaller organizations or underserved communities around the world. As part of the Amazon Sustainability Data Initiative, we invited Dr. Gentemann to share her perspective on the value of hosting high-resolution climate data on AWS.
The COVID-19 Healthcare Coalition: Collaborating to save lives
The COVID-19 Healthcare Coalition is a private-sector-led response to COVID-19 that brings together healthcare organizations, technology firms, nonprofits, academia, and startups to preserve the healthcare delivery system and help protect U.S. populations. AWS is a member of the COVID-19 Healthcare Coalition, working side by side with leading healthcare providers and researchers such as the Mayo Clinic, Johns Hopkins, and ZocDoc. Together, we’re coordinating our collective expertise, capabilities, and data to provide insights to improve clinical outcomes.