    Cervical Cancer Screening

    Deployed on AWS
    This dataset is composed of responses from 858 patients and 36 variables focusing on the prediction of indicators or diagnosis of cervical cancer. The dataset provides demographic information, habits, and historic medical records of the 858 patients from Hospital Universitario de Caracas in Caracas, Venezuela. A number of the patients did not answer some of the questions due to privacy concerns.

    Overview

    Although cervical cancer can be prevented with regular cytological screening, it remains a significant cause of mortality in low-income countries, causing more than a quarter of a million deaths per year. Resources there are very limited, and patients adhere poorly to routine screening because of a lack of awareness. In addition, predicting an individual patient's risk and choosing the best screening strategy during diagnosis has become a challenge, given the several diagnostic methods available and physicians' subjective preferences, which are usually based on expertise and comfort. Hence, predicting cervical cancer with automated methods or a computer-aided diagnosis (CAD) system would require data from each source, that is, each modality and each expert.

    This study was conducted to build a predictive model that uses transfer learning (TL) from one source to another, such as from a modality to an expert, in order to predict the risk of cervical cancer accurately and thereby support the diagnosis of cervical cancer among patients.


    License Information

    The use of John Snow Labs datasets is free for personal and research purposes. For commercial use, please subscribe to the Data Library on AWS. The subscription allows you to use all John Snow Labs datasets and data packages for commercial purposes.


    Schema

    Name | Description | Type | Constraints
    Age_of_Respondents | A featured risk factor for cervical cancer, this represents the age of patients from Hospital Universitario de Caracas who responded to the questions on demographic information, habits, and historic medical records; some did not answer some questions due to privacy concerns | Integer | Level: Ratio
    Number_of_Sexual_Partners | Number of sexual partners as a featured risk factor for cervical cancer | Integer | Level: Ratio
    First_Sexual_Intercourse | Age of first sexual intercourse as a featured risk factor for cervical cancer | Integer | Level: Ratio
    Number_of_Pregnancies | Number of pregnancies as a featured risk factor for cervical cancer | Integer | Level: Ratio
    Is_Smoking | Smoking as a featured risk factor for cervical cancer; answers whether patient is smoking or not | Boolean |
    Smoking_in_Years | Length of smoking in years as a featured risk factor for cervical cancer | Number | Level: Ratio
    Smoking_in_Packs_per_Year | Number of cigarette packs consumed per year of smoking as a featured risk factor for cervical cancer | Number | Level: Ratio
    Is_On_Hormonal_Contraceptives | Use of hormonal contraceptives as a featured risk factor for cervical cancer; answers whether patient is on contraceptives or not | Boolean |
    Hormonal_Contraceptives_in_Years | Number of years on hormonal contraceptives as a featured risk factor for cervical cancer | Number | Level: Ratio
    Is_On_IUD | Use of intrauterine device (IUD) as a featured risk factor for cervical cancer; answers whether patient is on IUD or not | Boolean |
    IUD_in_Years | Number of years on IUD as a featured risk factor for cervical cancer | Number | Level: Ratio
    Is_Diagnosed_with_STDs | Patient diagnosis of STDs as a featured risk factor for cervical cancer; answers whether patient has been diagnosed with an STD | Boolean |
    Number_of_Years_with_STDs | Number of years with STDs acquired as a featured risk factor for cervical cancer | Integer | Level: Ratio
    Is_STD_Condylomatosis | If STD is categorized as condylomatosis | Boolean |
    Is_STD_Cervical_Condylomatosis | If STD is categorized as cervical condylomatosis | Boolean |
    Is_STD_Vaginal_Condylomatosis | If STD is categorized as vaginal condylomatosis | Boolean |
    Is_STD_Vulvoperineal_Condylomatosis | If STD is categorized as vulvoperineal condylomatosis | Boolean |
    Is_STD_Syphilis | If STD is categorized as syphilis | Boolean |
    Is_STD_Pelvic_Inflammatory_Disease | If STD is categorized as pelvic inflammatory disease | Boolean |
    Is_STD_Genital_Herpes | If STD is categorized as genital herpes | Boolean |
    Is_STD_Molluscum_Contagiosum | If STD is categorized as molluscum contagiosum | Boolean |
    Is_STD_AIDS | If STD is categorized as Acquired Immune Deficiency Syndrome (AIDS) | Boolean |
    Is_STD_HIV | If STD is categorized as Human Immunodeficiency Virus (HIV) | Boolean |
    Is_STD_Hepatitis_B | If STD is categorized as hepatitis B | Boolean |
    Is_STD_HPV | If STD is categorized as Human Papillomavirus (HPV) | Boolean |
    Number_of_STD_Diagnosis | Number of STDs diagnosed as a featured risk factor for cervical cancer | Integer | Level: Ratio
    Time_Since_First_STD_Diagnosis | Time since first STD diagnosis | Integer | Level: Ratio
    Time_Since_Last_STD_Diagnosis | Time since last STD diagnosis | Integer | Level: Ratio
    Is_Diagnosis_Cancer | If patient is diagnosed with cancer or not | Boolean |
    Is_Diagnosis_CIN | If patient is diagnosed with cervical intraepithelial neoplasia (CIN) or not | Boolean |
    Is_Diagnosis_HPV | If patient is diagnosed with human papillomavirus (HPV) or not | Boolean |
    Is_Diagnosed | If patient is diagnosed with | Boolean |
    Is_Screening_Hinselmann | If screening strategy used to predict the patient's risk of cervical cancer is colposcopy using acetic acid | Boolean |
    Is_Screening_Schiller | If screening strategy used to predict the patient's risk of cervical cancer is colposcopy using Lugol iodine | Boolean |
    Is_Screening_Cytology | If screening strategy used to predict the patient's risk of cervical cancer is cytology | Boolean |
    Is_Screening_Biopsy | If screening strategy used to predict the patient's risk of cervical cancer is biopsy | Boolean |
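
As a rough illustration of how the schema above might be applied when reading the CSV delivery, the sketch below casts a few columns to their declared types using only the Python standard library. The sample rows are made up, and both the empty string as the marker for unanswered questions and the `true`/`false` spelling of Boolean values are assumptions; check the delivered files for the actual encoding.

```python
import csv
import io

# Declared types for a handful of columns, copied from the schema table.
COLUMN_TYPES = {
    "Age_of_Respondents": int,
    "Number_of_Sexual_Partners": int,
    "Is_Smoking": bool,
    "Smoking_in_Years": float,
    "Is_Screening_Biopsy": bool,
}

# Assumption: unanswered questions arrive as empty strings.
MISSING = ""


def cast_value(raw, target_type):
    """Cast one CSV cell to its declared type; missing answers become None."""
    raw = raw.strip()
    if raw == MISSING:
        return None
    if target_type is bool:
        # Assumption: Booleans are spelled "true"/"false" (case-insensitive).
        lowered = raw.lower()
        if lowered not in ("true", "false"):
            raise ValueError(f"not a Boolean: {raw!r}")
        return lowered == "true"
    return target_type(raw)


def read_rows(text):
    """Parse CSV text into a list of dicts with typed values."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {name: cast_value(row[name], COLUMN_TYPES[name]) for name in COLUMN_TYPES}
        for row in reader
    ]


# Made-up sample rows, not taken from the dataset.
SAMPLE = """Age_of_Respondents,Number_of_Sexual_Partners,Is_Smoking,Smoking_in_Years,Is_Screening_Biopsy
34,2,false,0.0,false
52,,true,12.5,true
"""

rows = read_rows(SAMPLE)
```

Mapping unanswered questions to `None` rather than to 0 or `False` keeps "did not answer" distinct from a genuine negative answer, which matters here because a number of patients skipped questions for privacy reasons.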

    Data Engineering Overview

    We deliver high-quality data

    • Each dataset goes through 3 levels of quality review
      • 2 Manual reviews are done by domain experts
      • Then, an automated set of 60+ validations ensures that every datum matches the metadata & defined constraints
    • Data is normalized into one unified type system
      • All dates, units, codes, and currencies look the same
      • All null values are normalized to the same value
      • All dataset and field names are SQL and Hive compliant
    • Data and Metadata
      • Data is available in both CSV and Apache Parquet format, optimized for high read performance on distributed Hadoop, Spark & MPP clusters
      • Metadata is provided in the open Frictionless Data standard, and every field in it is normalized & validated
    • Data Updates
      • Data updates support replace-on-update: outdated foreign keys are deprecated, not deleted
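
The automated validation step described above can be pictured as checking each value against a metadata descriptor. The following is a minimal sketch, not John Snow Labs' actual pipeline: the descriptor loosely follows the Frictionless Table Schema layout (`fields`, `type`, `constraints`, `missingValues`), while the `minimum` constraints, the `true`/`false` Boolean spelling, and the `validate_row` helper are assumptions introduced here for illustration.

```python
# Hypothetical descriptor in the style of a Frictionless Table Schema.
DESCRIPTOR = {
    "missingValues": [""],
    "fields": [
        {"name": "Age_of_Respondents", "type": "integer",
         "constraints": {"minimum": 0}},
        {"name": "Smoking_in_Years", "type": "number",
         "constraints": {"minimum": 0}},
        {"name": "Is_Smoking", "type": "boolean"},
    ],
}

# Map each declared type to a function that parses a raw string.
CASTERS = {
    "integer": int,
    "number": float,
    "boolean": lambda s: {"true": True, "false": False}[s.lower()],
}


def validate_row(row, descriptor=DESCRIPTOR):
    """Return a list of error messages for one row of raw string values."""
    errors = []
    for field in descriptor["fields"]:
        name = field["name"]
        if name not in row:
            errors.append(f"{name}: column missing")
            continue
        raw = row[name]
        if raw in descriptor["missingValues"]:
            continue  # a normalized null is allowed
        try:
            value = CASTERS[field["type"]](raw)
        except (ValueError, KeyError):
            errors.append(f"{name}: {raw!r} is not a valid {field['type']}")
            continue
        minimum = field.get("constraints", {}).get("minimum")
        if minimum is not None and value < minimum:
            errors.append(f"{name}: {value} is below minimum {minimum}")
    return errors
```

Returning a list of messages instead of raising on the first problem lets a caller report every violation in a row at once, which is usually more useful when auditing a whole file.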

    Our data is curated and enriched by domain experts

    Each dataset is manually curated by our team of doctors, pharmacists, public health & medical billing experts:

    • Field names, descriptions, and normalized values are chosen by people who actually understand their meaning
    • Healthcare & life science experts add categories, search keywords, descriptions and more to each dataset
    • Both manual and automated data enrichment supported for clinical codes, providers, drugs, and geo-locations
    • The data is always kept up to date – even when the source requires manual effort to get updates
    • Support for data subscribers is provided directly by the domain experts who curated the data sets
    • Every data source’s license is manually verified to allow for royalty-free commercial use and redistribution.


    About Us

    John Snow Labs, an AI and NLP for healthcare company, provides state-of-the-art software, models, and data to help healthcare and life science organizations build, deploy, and operate AI projects.

    Details

    Delivery method

    Deployed on AWS

    Pricing

    Cervical Cancer Screening

    This product is available free of charge. Free subscriptions have no end date and may be canceled any time.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    Vendor refund policy

    No refunds are offered. For any questions, email us at info@johnsnowlabs.com.


    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    AWS Data Exchange (ADX)

    AWS Data Exchange is a service that helps AWS customers easily share and manage data entitlements from other organizations at scale.

    Additional details

    Data sets (1)

    You will receive access to the following data sets.

    Data set name
    Type
    Historical revisions
    Future revisions
    Sensitive information
    Data dictionaries
    Data samples
    Cervical Cancer Screening
    All historical revisions
    All future revisions
    Not included
