Free | Publicly available
This dataset contains 8,000+ brain MRIs of 2,000+ patients with brain metastases.
This program exists to help people discover and share data sets that are available by using AWS resources. Unless specifically stated in the applicable data set documentation, data sets available through the Registry of Open Data on AWS are not provided or maintained by AWS. Data sets are provided and maintained by a variety of third parties under a variety of licenses. Please check data set licenses and related documentation to determine if a data set may be used for your application. If you have a project using a listed data set, please tell us about it at opendata@amazon.com.
Free | Publicly available
MIMIC-III (‘Medical Information Mart for Intensive Care’) is a large, single-center database comprising information relating to patients admitted to critical care units at a large tertiary care hospital. Data includes vital signs, medications, laboratory measurements, observations and notes charted by care providers, fluid balance, procedure codes, diagnostic codes, imaging reports, hospital length of stay, survival data, and more. The database supports applications including academic and industrial research, quality improvement initiatives, and higher education coursework. The MIMIC-III dataset is freely available, but researchers seeking to use the database must formally request access. For details, see the getting started page. Once you have a PhysioNet account, you must enable access to the MIMIC-III dataset from your AWS account. To do this, please input your AWS account number
Free | Publicly available
Collection of 7 billion small molecules in SMILES notation with 28 billion fingerprints, including MACCS, ECFP4, FCFP4, and PubChem, with pre-constructed USearch indexes over them.
Free | Publicly available
Space weather forecast and observation data is collected and disseminated by NOAA’s Space Weather Prediction Center (SWPC) in Boulder, CO. SWPC produces forecasts for multiple types of space weather phenomena and their resulting impacts on Earth and human activities. A variety of products provide these forecasts, and their respective measurements, in formats that range from detailed technical forecast discussions to NOAA Scale values to simple bulletins that give information in layman's terms. Forecasting is the prediction of future events, based on analysis and modeling of the past and present conditions of the environment you are interested in. In space weather, persistence and recurrence of active regions on the sun over the 27-day solar rotational period play an important role in accurately forecasting the space environment.
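The roles of persistence and recurrence mentioned above can be illustrated with two simple baseline forecasts. This is a minimal sketch, not SWPC's forecasting method: the function names and the sample values are hypothetical, and the only figure taken from the description is the 27-day solar rotational period.

```python
from typing import Optional, Sequence

# Solar rotational period cited in the description above.
SOLAR_ROTATION_DAYS = 27


def persistence_forecast(daily_values: Sequence[float]) -> float:
    """Predict tomorrow's value as equal to the most recent observation."""
    if not daily_values:
        raise ValueError("need at least one observation")
    return daily_values[-1]


def recurrence_forecast(
    daily_values: Sequence[float], period: int = SOLAR_ROTATION_DAYS
) -> Optional[float]:
    """Predict tomorrow's value as the value observed one period before tomorrow.

    Returns None when the history is shorter than one full period.
    """
    if len(daily_values) < period:
        return None
    # Tomorrow has index len(daily_values); one period earlier is index -period.
    return daily_values[-period]


if __name__ == "__main__":
    # Hypothetical daily index values, one per day (not real SWPC data).
    history = [float(day) for day in range(30)]
    print("persistence:", persistence_forecast(history))
    print("recurrence:", recurrence_forecast(history))
```

Both functions are naive baselines: persistence assumes conditions do not change from one day to the next, while recurrence assumes an active region returns to the same position after one solar rotation.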
Free | Publicly available
Comprehensive, large-scale single-cell profiling of healthy human blood at different ages is one of the critical pending tasks required to establish a framework for systematic understanding of human aging. Here, using single-cell RNA/TCR/BCR-seq with protein feature barcoding (20 antibodies), we profiled 317 samples from 166 healthy individuals aged 25 to 85 years old, drawn over a 3-year period. The dataset, spanning ~2 million cells, describes 50 subpopulations of blood immune cells, with 14 subpopulations changing with age, including a novel NKG2C+ CD8 Tcm population that decreases with age. We describe age-associated accumulation of Th2 and HLA-DR+ memory CD4 T cells, CCR4+ CD8 Tcm cells and GZMK+ CD8 Tem cells. We validate key findings using a 30-plex spectral cytometry panel. We characterize patterns of antigen receptor clonality across subpopulations of T and B cells and describe their age-dependence. Our work provides novel insights into healthy human aging and unique annotated resou[...]
Free | Publicly available
A centralized repository of pre-formatted BLAST databases created by the National Center for Biotechnology Information (NCBI).
Free | Publicly available
This dataset captures Sunflower's genetic diversity originating from thousands of wild, cultivated, and landrace sunflower individuals distributed across North America. The data consists of raw sequences and associated botanical metadata, aligned sequences (to three different reference genomes), and sets of SNPs computed across several cohorts.
Free | Publicly available
EMBED is a racially diverse mammography dataset containing 3.4M screening and diagnostic images from 110,000 patients collected from 2013 to 2020, with an equal representation of black and white women. The dataset comprises 2D, synthetic 2D (C-view), and 3D (digital breast tomosynthesis, i.e. DBT) images. It contains 60,000 annotated lesions linked to structured imaging descriptors and ground truth pathologic outcomes grouped into six severity classes. This release represents 20% of the total 2D and C-view dataset and is available for research use. DBT, US, and MRI exams will be added at a later date. Acknowledgements - We would like to thank Glendor, Inc and MD.ai for assistance with image de-identification.
Free | Publicly available
NASA’s Space Biology Open Science Data Repository (OSDR) introduces a one-stop site where users can explore and contribute a variety of NASA open science biological data. This site consolidates data from the Ames Life Sciences Data Archive (ALSDA) and GeneLab and includes information about the broader NASA Open Science and Open Data initiatives, all at one centralized location. Our mission is to maximize the utilization of the valuable biological research resources and enable new discoveries. Through the AWS Open Data Registry page, OSDR provides access to data generated from spaceflight and space-relevant experiments that explore the biological response of terrestrial biology. The ALSDA is the official repository of non-human science data spanning a broad range of biological levels involving data from tissues, organs, whole organisms, physiology, and behavior. GeneLab is an open science repository hosting multiple types of ‘omics including transcriptomics, metagenomics[...]
Free | Publicly available
The IIIC dataset includes 50,697 labeled EEG samples from 2,711 patients and 6,095 EEGs that were annotated by physician experts from 18 institutions. These samples were used to train SPaRCNet (Seizures, Periodic and Rhythmic Continuum patterns Deep Neural Network), a computer program that classifies IIIC events with an accuracy matching clinical experts.