AWS Big Data Blog

Author: Philipp Schmidt

Test data quality at scale with Deequ

In this blog post, we introduce Deequ, an open-source tool developed and used at Amazon. Deequ allows you to calculate data quality metrics on your dataset, define and verify data quality constraints, and be informed about changes in the data distribution. Instead of implementing checks and verification algorithms on your own, you can focus on describing how your data should look.