Posted On: Nov 18, 2021

AWS Glue DataBrew users can now create data quality rules: customizable validation checks that encode the business requirements for a specific dataset. You can create rules that check for duplicate values in certain columns, validate that one column does not match another, or define many other custom checks and conditions for your data quality use cases. You can group the rules for a given dataset into a ruleset and apply them as part of a standard data profile job. Results appear in a data quality dashboard and validation report, so you can quickly view rule outcomes and determine whether your data is fit for use.
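To make the idea of rules and rulesets concrete, here is a minimal conceptual sketch in plain Python. It is not the DataBrew API or its rule syntax: the helper names (`no_duplicates`, `columns_differ`, `evaluate`), the column names, and the sample rows are all hypothetical and exist only to illustrate the two checks mentioned above and how a ruleset reports one outcome per rule.

```python
from collections import Counter
from typing import Callable, Dict, List

# Illustrative types only; DataBrew itself defines rules declaratively.
Row = Dict[str, object]
Rule = Callable[[List[Row]], bool]


def no_duplicates(column: str) -> Rule:
    """Build a rule that passes when every value in `column` is unique."""
    def check(rows: List[Row]) -> bool:
        counts = Counter(row[column] for row in rows)
        return all(n == 1 for n in counts.values())
    return check


def columns_differ(left: str, right: str) -> Rule:
    """Build a rule that passes when `left` never equals `right` in any row."""
    def check(rows: List[Row]) -> bool:
        return all(row[left] != row[right] for row in rows)
    return check


def evaluate(ruleset: Dict[str, Rule], rows: List[Row]) -> Dict[str, bool]:
    """Run every rule in the ruleset and report pass/fail per rule name."""
    return {name: rule(rows) for name, rule in ruleset.items()}


# Hypothetical sample data: order_id 2 appears twice.
orders = [
    {"order_id": 1, "email": "a@example.com", "backup_email": "a2@example.com"},
    {"order_id": 2, "email": "b@example.com", "backup_email": "b2@example.com"},
    {"order_id": 2, "email": "c@example.com", "backup_email": "c2@example.com"},
]

# A ruleset groups the rules that apply to one dataset.
ruleset = {
    "order_id_unique": no_duplicates("order_id"),
    "emails_differ": columns_differ("email", "backup_email"),
}

results = evaluate(ruleset, orders)
print(results)
```

In DataBrew, the equivalent checks are defined in the console or through the service API and run as part of a profile job; the sketch only mirrors the pass/fail shape of the outcomes that the validation report summarizes.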

AWS Glue DataBrew is a visual data preparation tool that makes it easy to clean and normalize data using over 250 pre-built transformations, all without the need to write any code. You can automate filtering anomalies, converting data to standard formats, correcting invalid values, and other tasks.

To get started with DataBrew, visit the AWS Management Console or install the DataBrew plugin in your notebook environment. To learn more, view the getting started video and refer to the DataBrew documentation.