
Overview
Although cervical cancer can be prevented with regular cytological screening, it remains a significant cause of mortality in low-income countries, killing more than a quarter of a million people per year. This is because resources are very limited and patients adhere poorly to routine screening, often due to lack of awareness. In addition, predicting an individual patient's risk and the best screening strategy during diagnosis has become a challenge, given the several available diagnostic methods and physicians' subjective preferences, which are usually based on expertise and comfort. Hence, predicting cervical cancer with automated methods or a computer-aided diagnosis (CAD) system requires data from each source: modality and expertise.
This study was conducted to create a transfer learning (TL) predictive model that transfers knowledge from one source to another, such as from a modality to an expert, in order to accurately predict the risk of cervical cancer and consequently diagnose it among patients.
License Information
The use of John Snow Labs datasets is free for personal and research purposes. For commercial use please subscribe to the Data Library on AWS. The subscription will allow you to use all John Snow Labs datasets and data packages for commercial purposes.
Schema
| Name | Description | Type | Constraints |
|---|---|---|---|
| Age_of_Respondents | A featured risk factor for cervical cancer, this represents the age of patients from Hospital Universitario de Caracas who responded to the questions on demographic information, habits, and historic medical records; some did not answer some questions due to privacy | Integer | Level: Ratio |
| Number_of_Sexual_Partners | Number of sexual partners as a featured risk factor for cervical cancer | Integer | Level: Ratio |
| First_Sexual_Intercourse | Age of first sexual intercourse as a featured risk factor for cervical cancer | Integer | Level: Ratio |
| Number_of_Pregnancies | Number of pregnancies as a featured risk factor for cervical cancer | Integer | Level: Ratio |
| Is_Smoking | Smoking as a featured risk factor for cervical cancer; answers whether patient is smoking or not | Boolean | |
| Smoking_in_Years | Length of smoking in years as a featured risk factor for cervical cancer | Number | Level: Ratio |
| Smoking_in_Packs_per_Year | Number of cigarette packs consumed per year of smoking as a featured risk factor for cervical cancer | Number | Level: Ratio |
| Is_On_Hormonal_Contraceptives | Use of hormonal contraceptive as a featured risk factor for cervical cancer; answers whether patient is on contraceptives or not | Boolean | |
| Hormonal_Contraceptives_in_Years | Number of years on hormonal contraceptives as a featured risk factor for cervical cancer | Number | Level: Ratio |
| Is_On_IUD | Use of intrauterine device (IUD) as a featured risk factor for cervical cancer; answers whether patient is on IUD | Boolean | |
| IUD_in_Years | Number of years on IUD as a featured risk factor for cervical cancer | Number | Level: Ratio |
| Is_Diagnosed_with_STDs | Patient diagnosis of STDs as a featured risk factor for cervical cancer; answers whether patient has been diagnosed with STD | Boolean | |
| Number_of_Years_with_STDs | Number of years with STDs acquired as a featured risk factor for cervical cancer | Integer | Level: Ratio |
| Is_STD_Condylomatosis | If STD is categorized as condylomatosis | Boolean | |
| Is_STD_Cervical_Condylomatosis | If STD is categorized as cervical condylomatosis | Boolean | |
| Is_STD_Vaginal_Condylomatosis | If STD is categorized as vaginal condylomatosis | Boolean | |
| Is_STD_Vulvoperineal_Condylomatosis | If STD is categorized as vulvoperineal condylomatosis | Boolean | |
| Is_STD_Syphilis | If STD is categorized as syphilis | Boolean | |
| Is_STD_Pelvic_Inflammatory_Disease | If STD is categorized as pelvic inflammatory disease | Boolean | |
| Is_STD_Genital_Herpes | If STD is categorized as genital herpes | Boolean | |
| Is_STD_Molluscum_Contagiosum | If STD is categorized as molluscum contagiosum | Boolean | |
| Is_STD_AIDS | If STD is categorized as Acquired Immune Deficiency Syndrome (AIDS) | Boolean | |
| Is_STD_HIV | If STD is categorized as Human Immunodeficiency Virus (HIV) | Boolean | |
| Is_STD_Hepatitis_B | If STD is categorized as hepatitis B | Boolean | |
| Is_STD_HPV | If STD is categorized as Human Papillomavirus (HPV) | Boolean | |
| Number_of_STD_Diagnosis | Number of STDs diagnosed as a featured risk factor for cervical cancer | Integer | Level: Ratio |
| Time_Since_First_STD_Diagnosis | Time since first STD diagnosis | Integer | Level: Ratio |
| Time_Since_Last_STD_Diagnosis | Time since last STD diagnosis | Integer | Level: Ratio |
| Is_Diagnosis_Cancer | If patient is diagnosed with cancer or not | Boolean | |
| Is_Diagnosis_CIN | If patient is diagnosed with cervical intraepithelial neoplasia (CIN) or not | Boolean | |
| Is_Diagnosis_HPV | If patient is diagnosed with human papillomavirus (HPV) or not | Boolean | |
| Is_Diagnosed | If patient is diagnosed with | Boolean | |
| Is_Screening_Hinselmann | If screening strategy used to predict the patient's risk of cervical cancer is colposcopy using acetic acid | Boolean | |
| Is_Screening_Schiller | If screening strategy used to predict the patient's risk of cervical cancer is colposcopy using Lugol iodine | Boolean | |
| Is_Screening_Cytology | If screening used to predict the patient's risk of cervical cancer is Cytology | Boolean | |
| Is_Screening_Biopsy | If screening used to predict the patient's risk of cervical cancer is Biopsy | Boolean | |
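As a rough illustration of how the schema above could be applied when reading the CSV distribution, the sketch below converts a handful of columns to their declared types using only the Python standard library. It rests on several assumptions that this page does not confirm: that booleans are serialized as `true`/`false`, that missing answers appear as empty cells, and that the file has a header row with the field names shown in the table. The helper names `parse_value` and `read_rows` are hypothetical, and the sample rows are synthetic.

```python
import csv
import io

# Column -> Python type, copied from a subset of the schema table above.
SCHEMA = {
    "Age_of_Respondents": int,
    "Number_of_Sexual_Partners": int,
    "Is_Smoking": bool,
    "Smoking_in_Years": float,
    "Is_Screening_Biopsy": bool,
}


def parse_value(raw, kind):
    """Convert one CSV cell to its schema type; empty cells become None.

    Assumption: booleans are written as true/false (or 1/0) and missing
    values as empty strings. The real files may use another convention.
    """
    raw = raw.strip()
    if raw == "":
        return None
    if kind is bool:
        lowered = raw.lower()
        if lowered in ("true", "1"):
            return True
        if lowered in ("false", "0"):
            return False
        raise ValueError(f"not a boolean: {raw!r}")
    return kind(raw)


def read_rows(handle):
    """Yield one dict per CSV row, typed according to SCHEMA."""
    for record in csv.DictReader(handle):
        yield {name: parse_value(record[name], kind) for name, kind in SCHEMA.items()}


# Synthetic rows for illustration only; they are not taken from the dataset.
SAMPLE = io.StringIO(
    "Age_of_Respondents,Number_of_Sexual_Partners,Is_Smoking,"
    "Smoking_in_Years,Is_Screening_Biopsy\n"
    "34,2,true,5.5,false\n"
    "27,,false,0,false\n"
)

rows = list(read_rows(SAMPLE))
```

Keeping unanswered questions as `None` rather than zero matters here, because the schema notes that some respondents skipped questions for privacy reasons, and a skipped answer is not the same as a zero count.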
Data Engineering Overview
We deliver high-quality data
- Each dataset goes through 3 levels of quality review
  - 2 manual reviews are done by domain experts
  - Then, an automated set of 60+ validations enforces that every datum matches metadata & defined constraints
- Data is normalized into one unified type system
  - All dates, units, codes, currencies look the same
  - All null values are normalized to the same value
  - All dataset and field names are SQL and Hive compliant
- Data and Metadata
  - Data is available in both CSV and Apache Parquet format, optimized for high read performance on distributed Hadoop, Spark & MPP clusters
  - Metadata is provided in the open Frictionless Data standard, and its every field is normalized & validated
- Data Updates
  - Data updates support replace-on-update: outdated foreign keys are deprecated, not deleted
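Since the metadata is described as following the Frictionless Data standard, a consumer could compare a CSV header against the field names in the descriptor before loading. The sketch below assumes a Table Schema style `fields` list with `name` and `type` keys. The descriptor fragment is hypothetical, written by hand from the schema table, and the function `header_problems` is an illustrative helper, not part of any vendor tooling.

```python
import json

# Hypothetical Table Schema fragment; the descriptor actually shipped
# with the dataset may contain more fields and properties.
DESCRIPTOR = json.loads("""
{
  "fields": [
    {"name": "Age_of_Respondents", "type": "integer"},
    {"name": "Is_Smoking", "type": "boolean"},
    {"name": "Smoking_in_Years", "type": "number"}
  ]
}
""")


def header_problems(header, descriptor):
    """Return readable mismatches between a CSV header and the descriptor."""
    expected = [field["name"] for field in descriptor["fields"]]
    problems = []
    for name in expected:
        if name not in header:
            problems.append(f"missing column: {name}")
    for name in header:
        if name not in expected:
            problems.append(f"unexpected column: {name}")
    return problems
```

An empty result means the header and the descriptor agree on column names. Type and constraint checks would need a further pass over the rows.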
Our data is curated and enriched by domain experts
Each dataset is manually curated by our team of doctors, pharmacists, public health & medical billing experts:
- Field names, descriptions, and normalized values are chosen by people who actually understand their meaning
- Healthcare & life science experts add categories, search keywords, descriptions and more to each dataset
- Both manual and automated data enrichment supported for clinical codes, providers, drugs, and geo-locations
- The data is always kept up to date – even when the source requires manual effort to get updates
- Support for data subscribers is provided directly by the domain experts who curated the data sets
- Every data source’s license is manually verified to allow for royalty-free commercial use and redistribution.
Need Help?
- If you have questions about our products, contact us at info@johnsnowlabs.com.
About Us
John Snow Labs, an AI and NLP for healthcare company, provides state-of-the-art software, models, and data to help healthcare and life science organizations build, deploy, and operate AI projects.
Vendor refund policy
No refunds offered. For any questions, email us at info@johnsnowlabs.com.
Delivery details
AWS Data Exchange (ADX)
AWS Data Exchange is a service that helps AWS customers easily share and manage data entitlements from other organizations at scale.