Posted On: Jun 30, 2021

AWS Glue DataBrew now automatically identifies and marks advanced data types for columns as you prepare your data, making it easier to normalize columns that contain any of the following types: Social Security Number (SSN), Email Address, Phone Number, Gender, Credit Card, URL, IP Address, Date and Time, Currency, Zip Code, Country, Region, State, and City. DataBrew also visually marks columns that contain Personally Identifiable Information (PII), so you can quickly scan your dataset for PII columns and apply transformations to them. Learn more about all supported advanced data types.

To assign an advanced data type to a column, click the column. DataBrew automatically identifies the data type, generates data validity statistics, and recommends ways to normalize the data in the column. Once the type is identified, you can use DataBrew’s 250+ built-in transformations, such as removing invalid values, replacing missing values, and extracting custom values, to prepare your data without writing any code.
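
DataBrew performs these steps in its visual interface without code. Purely to illustrate the idea of type identification and validity statistics, here is a minimal Python sketch. It is not DataBrew's implementation or API: the `PATTERNS` table, the simplified regular expressions, and the `detect_advanced_type` and `remove_invalid` helpers are all hypothetical names introduced for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately simplified patterns for illustration only.
# These are NOT the rules DataBrew uses to identify advanced data types.
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "zip_code": re.compile(r"\d{5}(-\d{4})?"),
    "ip_address": re.compile(r"(\d{1,3}\.){3}\d{1,3}"),
}


def detect_advanced_type(values):
    """Return (best_matching_type, valid_fraction) for a column of strings.

    valid_fraction is the share of non-missing values that fully match
    the best-matching pattern. Returns (None, 0.0) if nothing matches.
    """
    # Ignore missing values (None or blank strings).
    non_missing = [v.strip() for v in values if v is not None and v.strip()]
    if not non_missing:
        return None, 0.0

    # Count how many values fully match each candidate pattern.
    counts = Counter()
    for value in non_missing:
        for name, pattern in PATTERNS.items():
            if pattern.fullmatch(value):
                counts[name] += 1

    if not counts:
        return None, 0.0
    best, hits = counts.most_common(1)[0]
    return best, hits / len(non_missing)


def remove_invalid(values, type_name):
    """Keep only the values that fully match the pattern for type_name."""
    pattern = PATTERNS[type_name]
    return [v for v in values if v is not None and pattern.fullmatch(v.strip())]


if __name__ == "__main__":
    column = ["a@example.com", "b@example.org", "not-an-email", None, ""]
    detected, valid_fraction = detect_advanced_type(column)
    print(detected, valid_fraction)
    print(remove_invalid(column, detected))
```

In this sketch, the validity statistic is simply the fraction of non-missing values that match the detected pattern, and "remove invalid values" drops everything else, including missing entries.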

To get started, visit the AWS Management Console or install the DataBrew plugin in your notebook environment.