Announcing Data Quality Definition Language (DQDL) enhancements for AWS Glue Data Quality

Posted on: Jun 28, 2024

Customers use AWS Glue Data Quality, a feature of AWS Glue, to measure and monitor the quality of their data. They author data quality rules using DQDL to ensure their data is accurate. Customers need the ability to author rules for complex business scenarios that include filter conditions, exclusion conditions, validations for empty values, and composite rules. Previously, customers wrote SQL in the CustomSQL rule type to perform these data quality validations. Today, AWS Glue announces a set of enhancements to DQDL that allow data engineers to easily author complex data quality rules using native rule types. DQDL now supports:

  • NOT operator allowing customers to exclude certain values from their rules.
  • New keywords such as NULL, EMPTY, and WHITESPACES_ONLY to author rules that capture missing values without complex regular expressions.
  • Composite rules for customers to author sophisticated business rules. They can now specify options to manage the evaluation order of these rules.
  • WHERE clause in DQDL to filter data before applying rules.
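
As an illustration of the four enhancements listed above, here is a minimal DQDL ruleset sketch. The column names (country, email, order_id, amount, status) are hypothetical and not taken from this announcement, and the exact syntax should be checked against the DQDL guide before use. In order, the rules exclude values with NOT, check for missing values with the new keywords, combine two rules into a composite rule, and filter rows with a WHERE clause before evaluating a rule.

```
Rules = [
    ColumnValues "country" not in ["US", "CA"],
    ColumnValues "email" != NULL,
    ColumnValues "email" != EMPTY,
    ColumnValues "email" != WHITESPACES_ONLY,
    (IsComplete "order_id") and (IsUnique "order_id"),
    ColumnValues "amount" > 0 where "status = 'SHIPPED'"
]
```

The options that control how composite rules are evaluated are not shown here, because they are set outside the ruleset itself.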

Refer to the DQDL guide for more information.

AWS Glue Data Quality is available in all commercial regions where AWS Glue is available. To learn more, visit the AWS Glue product page and our documentation.