AWS Machine Learning Blog
Tag: Data Preparation
Detect multicollinearity, target leakage, and feature correlation with Amazon SageMaker Data Wrangler
In machine learning (ML), data quality has a direct impact on model quality. This is why data scientists and data engineers spend a significant amount of time perfecting training datasets. Nevertheless, no dataset is perfect: preprocessing techniques such as oversampling, normalization, and imputation involve trade-offs. Also, mistakes and errors could creep in at various stages […]