Phrase Clustering Dataset (PCD)

Provided by: Amazon, part of the Open Data

This product is part of the Open Data and contains data sets that are publicly available for anyone to access and use. No subscription is required. Unless specifically stated in the applicable data set documentation, data sets available through the Open Data are not provided and maintained by AWS.

Description

This dataset accompanies the paper "McPhraSy: Multi-Context Phrase Similarity and Clustering" by DN Cohen et al. (2022). PCD is intended for evaluating the quality of semantic clustering of noun phrases. The phrases were collected from the Amazon Review Dataset.

License

This data is available for anyone to use under the terms of the CDLA-Permissive license.

How to cite

Phrase Clustering Dataset (PCD) was accessed on DATE from https://registry.opendata.aws/pcd.

Update frequency
Not updated

Support information

Managed by: Amazon

Contact: Post any questions to re:Post and use the AWS Open Data tag.

General AWS Data Exchange support

Resources on AWS

Description

Phrase Clustering Dataset (PCD)

Resource type
S3 Bucket
Amazon Resource Name (ARN)
arn:aws:s3:::amazon-phrase-clustering
AWS Region
us-west-2

AWS CLI Access (No AWS account required)

aws s3 ls --no-sign-request s3://amazon-phrase-clustering/
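
The listing command above only shows the bucket contents. As a minimal sketch of how the files might then be copied locally, the Python helper below wraps `aws s3 sync` with the same `--no-sign-request` flag. It assumes the AWS CLI is installed and on the PATH; the function names and the `./pcd` destination directory are hypothetical, chosen only for illustration, and the layout of files inside the bucket is not described on this page.

```python
import shutil
import subprocess
from typing import List

# Bucket URI taken from the ARN listed above (arn:aws:s3:::amazon-phrase-clustering).
BUCKET_URI = "s3://amazon-phrase-clustering/"


def build_sync_command(dest_dir: str, bucket_uri: str = BUCKET_URI) -> List[str]:
    """Return the argument list for an anonymous `aws s3 sync` into dest_dir."""
    # --no-sign-request sends unsigned requests, so no AWS account is needed.
    return ["aws", "s3", "sync", "--no-sign-request", bucket_uri, dest_dir]


def download_dataset(dest_dir: str) -> None:
    """Copy the public bucket into dest_dir using the AWS CLI (hypothetical helper)."""
    if shutil.which("aws") is None:
        raise RuntimeError("AWS CLI not found on PATH")
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    subprocess.run(build_sync_command(dest_dir), check=True)
```

Calling `download_dataset("./pcd")` would run the sync. Building the command separately makes it possible to print or inspect it before any network access takes place.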