Posted On: Oct 2, 2023
AWS announces the general availability of AWS Glue Data Quality in the AWS GovCloud (US-East and US-West) Regions. Glue Data Quality automatically measures and monitors the quality of data in data repositories and in AWS Glue ETL pipelines. AWS Glue is a serverless, scalable data integration and ETL (extract, transform, and load) service that makes it easier to discover, prepare, move, and integrate data from multiple sources.
AWS Glue Data Quality helps reduce the need for manual data quality work by automatically analyzing your data to gather statistics and then recommending data quality rules to get you started. You can update the recommended rules or add new ones. It uses open-source Deequ to evaluate rules and to measure and monitor the data quality of petabyte-scale data lakes. You can configure actions that alert users when data quality deteriorates, and then drill down into the issue’s root cause. Data quality rules and actions can also be configured on AWS Glue data pipelines, helping prevent “bad” data from entering data lakes and data warehouses.
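As an illustration of what adding your own rules can look like, here is a minimal sketch in Python. It assembles a ruleset in the Data Quality Definition Language (DQDL) that Glue Data Quality uses, and shows one way the ruleset might be registered through boto3's `create_data_quality_ruleset` call. The column names (`order_id`, `amount`), the ruleset, database, and table names, and the helper functions `build_ruleset` and `register_ruleset` are all hypothetical and chosen only for this example. The default Region, `us-gov-west-1`, is likewise an assumption.

```python
def build_ruleset(rules):
    """Join individual DQDL rule strings into a single ruleset document."""
    return "Rules = [\n    " + ",\n    ".join(rules) + "\n]"


# Hypothetical rules for a hypothetical "orders" table.
RULES = [
    'IsComplete "order_id"',       # no missing order IDs
    'IsUnique "order_id"',         # no duplicate order IDs
    'ColumnValues "amount" >= 0',  # no negative amounts
    "RowCount > 0",                # the table is not empty
]


def register_ruleset(name, database, table, region="us-gov-west-1"):
    """Register RULES against a Data Catalog table (needs AWS credentials)."""
    import boto3  # imported lazily so the sketch loads without AWS access

    glue = boto3.client("glue", region_name=region)
    return glue.create_data_quality_ruleset(
        Name=name,
        Ruleset=build_ruleset(RULES),
        TargetTable={"DatabaseName": database, "TableName": table},
    )


if __name__ == "__main__":
    # Print the DQDL text only; nothing is sent to AWS here.
    print(build_ruleset(RULES))
```

With credentials for a GovCloud account in place, a call such as `register_ruleset("orders_ruleset", "sales_db", "orders")` would attach the ruleset to that table. The names are placeholders to replace with your own.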
With this general availability, customers can now manage data quality in the AWS GovCloud (US) Regions. To learn more, visit AWS Glue Data Quality.