REDASA COVID-19 Open Data

Provided by: REDASA Consortium, Imperial College London, UK, part of the AWS Open Data Sponsorship Program

This product is part of the AWS Open Data Sponsorship Program and contains data sets that are publicly available for anyone to access and use. No subscription is required. Unless specifically stated in the applicable data set documentation, data sets available through the AWS Open Data Sponsorship Program are not provided and maintained by AWS.

Description

The REaltime DAta Synthesis and Analysis (REDASA) COVID-19 snapshot contains the output of the curation protocol produced by our curator community. A detailed description can be found in our paper. The first S3 bucket listed in Resources contains a large collection of medical documents in text format extracted from the CORD-19 dataset, plus other sources deemed relevant by the REDASA consortium. The second S3 bucket contains a series of documents surfaced by Amazon Kendra that were considered relevant for each medical question asked. The final S3 bucket contains the GroundTruth annotations created by our curator community.

License

CC-BY-4.0

How to cite

REDASA COVID-19 Open Data was accessed on DATE from https://registry.opendata.aws/redasa-covid-data.

Update frequency

Yearly updates

Support information

Managed by: REDASA Consortium, Imperial College London, UK

Contact: redasa-open-data@imperial.ac.uk

General AWS Data Exchange support

Resources on AWS

Description

This is the raw data repository containing a common crawl of CORD-19 papers and other sources identified by the REDASA Project.

Resource type: S3 Bucket
Amazon Resource Name (ARN): arn:aws:s3:::pansurg-curation-raw-open-data
AWS Region: eu-west-2

AWS CLI Access (No AWS account required)

aws s3 ls --no-sign-request s3://pansurg-curation-raw-open-data/
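Because this bucket is described as a large collection, it may help to check the object count and total size before downloading anything. The sketch below is one way to do that, assuming the AWS CLI is installed; the helper names `arn_to_s3_uri` and `summarize_bucket` are hypothetical and not part of the listing, while the ARN and Region are copied from it.

```shell
#!/bin/sh
# Region and ARN are copied from the listing above.
REGION="eu-west-2"
RAW_ARN="arn:aws:s3:::pansurg-curation-raw-open-data"

# Hypothetical helper: turn an S3 bucket ARN into an s3:// URI by
# stripping the fixed "arn:aws:s3:::" prefix.
arn_to_s3_uri() {
  printf 's3://%s/\n' "${1#arn:aws:s3:::}"
}

# Hypothetical helper: list every object in the bucket recursively and
# print an object count and total size at the end of the listing.
# This makes an unsigned (anonymous) network request.
summarize_bucket() {
  aws s3 ls --no-sign-request --region "$REGION" \
    --recursive --human-readable --summarize "$(arn_to_s3_uri "$1")"
}
```

Running `summarize_bucket "$RAW_ARN"` prints the full listing followed by the summary lines; piping the output through `tail -n 2` shows only the totals.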

Description

We created an Amazon Kendra index covering all the questions curated during the REDASA project. The documents in this S3 bucket were surfaced by that index as relevant to the corresponding medical research question.

Resource type: S3 Bucket
Amazon Resource Name (ARN): arn:aws:s3:::pansurg-curation-workflo-kendraqueryresults50d0eb-open-data
AWS Region: eu-west-2

AWS CLI Access (No AWS account required)

aws s3 ls --no-sign-request s3://pansurg-curation-workflo-kendraqueryresults50d0eb-open-data/
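To work with the Kendra results offline, the bucket can be mirrored to a local directory with `aws s3 sync`. The following is a minimal sketch, assuming the AWS CLI is installed; the helper name `mirror_bucket` and the local directory are hypothetical. The `--dryrun` flag prints the planned transfers without copying anything, which is a cheap way to preview the download.

```shell
#!/bin/sh
# Bucket name taken from the ARN in the listing above.
KENDRA_BUCKET="pansurg-curation-workflo-kendraqueryresults50d0eb-open-data"

# Hypothetical helper: mirror the bucket into a local directory.
#   $1  destination directory
#   $2  optional; pass "preview" to only print what would be copied
mirror_bucket() {
  dest="$1"
  if [ "${2:-}" = "preview" ]; then
    # --dryrun lists the planned operations without transferring data.
    aws s3 sync --no-sign-request --region eu-west-2 --dryrun \
      "s3://$KENDRA_BUCKET/" "$dest"
  else
    aws s3 sync --no-sign-request --region eu-west-2 \
      "s3://$KENDRA_BUCKET/" "$dest"
  fi
}
```

For example, `mirror_bucket ./kendra-results preview` shows the planned copies, and `mirror_bucket ./kendra-results` performs them. Re-running the second command later should transfer only new or changed objects, which suits a data set that is updated yearly.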

Description

This S3 bucket contains the final curation data in GroundTruth format.

Resource type: S3 Bucket
Amazon Resource Name (ARN): arn:aws:s3:::pansurg-curation-final-curations-open-data
AWS Region: eu-west-2

AWS CLI Access (No AWS account required)

aws s3 ls --no-sign-request s3://pansurg-curation-final-curations-open-data/
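Where the AWS CLI is not available, a single object can usually be fetched over plain HTTPS, since the bucket accepts unsigned requests. The sketch below builds a virtual-hosted-style S3 URL from the bucket name and Region given in the listing; the helper name `s3_https_url` is hypothetical, and the object key must come from an actual listing of the bucket, because this page does not document the key layout.

```shell
#!/bin/sh
# Bucket name and Region are copied from the listing above.
FINAL_BUCKET="pansurg-curation-final-curations-open-data"
REGION="eu-west-2"

# Hypothetical helper: build the virtual-hosted-style HTTPS URL for one
# object. The key is inserted as-is, so it must already be URL-safe
# (percent-encode spaces and other special characters beforehand).
#   $1  bucket name
#   $2  AWS Region
#   $3  object key
s3_https_url() {
  printf 'https://%s.s3.%s.amazonaws.com/%s\n' "$1" "$2" "$3"
}
```

Given a key copied from the bucket listing, the object could then be downloaded with something like `curl -fsSL -O "$(s3_https_url "$FINAL_BUCKET" "$REGION" "$KEY")"`, where `KEY` is a placeholder for that key.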