Posted On: Jun 30, 2021

AWS Glue DataBrew can now write the datasets produced by jobs that run your data preparation recipes directly to the AWS Glue Data Catalog. You can store job output in Data Catalog tables backed by Amazon S3, Amazon Redshift, or Amazon RDS (Aurora, Oracle, SQL Server, MySQL, and PostgreSQL).

With this feature, you can now directly catalog your cleaned and normalized data when creating your data warehouse or data lake. The Data Catalog will contain an index to the location, schema, and runtime metrics of your data.
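As a rough illustration of what this looks like programmatically, the sketch below builds a Data Catalog output entry for an S3-backed table and passes it to a recipe job through the AWS SDK for Python (boto3). It assumes the `DataCatalogOutputs` parameter of the `databrew` client's `create_recipe_job` call. The helper names `build_catalog_output` and `create_catalog_job`, and every database, table, bucket, and role value, are hypothetical placeholders and not part of the announcement.

```python
from typing import Any, Dict


def build_catalog_output(database: str, table: str, bucket: str, key: str = "") -> Dict[str, Any]:
    """Build one entry for a recipe job's DataCatalogOutputs list (S3-backed table)."""
    location: Dict[str, str] = {"Bucket": bucket}
    if key:
        # Optional prefix inside the bucket where the job writes its files.
        location["Key"] = key
    return {
        "DatabaseName": database,   # existing Data Catalog database (placeholder)
        "TableName": table,         # Data Catalog table to write to (placeholder)
        "S3Options": {"Location": location},
        "Overwrite": True,          # assumption: replace previous output on each run
    }


def create_catalog_job(name: str, dataset: str, recipe: str, recipe_version: str,
                       role_arn: str, output: Dict[str, Any]) -> Dict[str, Any]:
    """Create a DataBrew recipe job that writes to the Data Catalog."""
    # Imported lazily so the builder above can be used without boto3 or AWS credentials.
    import boto3

    client = boto3.client("databrew")
    return client.create_recipe_job(
        Name=name,
        DatasetName=dataset,
        RecipeReference={"Name": recipe, "RecipeVersion": recipe_version},
        RoleArn=role_arn,
        DataCatalogOutputs=[output],
    )


if __name__ == "__main__":
    # Only builds and prints the request fragment; no AWS call is made here.
    print(build_catalog_output("sales_db", "orders_clean", "example-bucket", "databrew/orders/"))
```

For Amazon Redshift or Amazon RDS tables, the same entry would likely carry database-specific options in place of `S3Options`; check the DataBrew API reference for the exact fields before relying on this shape.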

To get started, visit the AWS Management Console or install the DataBrew plugin in your notebook environment, and refer to the DataBrew documentation.