Posted On: Apr 7, 2021

AWS Glue now offers missing value imputation on incomplete datasets. You can use the Fill Missing Values transform to get predicted values for blank entries in a column of your data. This feature makes it easy to clean datasets that have null or empty values, so that queries no longer need to account for those gaps.

Fill Missing Values is a new ML Transform in AWS Glue that learns patterns from the complete rows in your dataset and predicts values for missing data in a column you specify. It works on both categorical and numerical data in tabular datasets, and uses a combination of traditional and machine learning methods to generate a complete column that AWS Glue appends to your dataset. The easiest way to get started with Fill Missing Values is by choosing it from the list of transforms in AWS Glue Studio.
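
To make the idea concrete, here is a minimal Python sketch of imputing one column and appending the result as a new column. It is not the AWS Glue implementation and does not call any AWS Glue API. It substitutes a much simpler technique (the mean for numerical columns, the most frequent value for categorical ones) for the transform's learned predictions. The function name `fill_missing`, its parameters, and the sample rows are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from statistics import mean


def fill_missing(rows, column, output_column):
    """Return copies of `rows` with `output_column` appended.

    Hypothetical helper, not part of AWS Glue. Missing entries (None or
    empty string) in `column` are replaced by a value derived from the
    rows where `column` is present; existing values are copied unchanged.
    """
    def is_missing(value):
        return value is None or value == ""

    # "Learn" only from the rows in which the target column is populated.
    observed = [row[column] for row in rows if not is_missing(row.get(column))]
    if not observed:
        raise ValueError(f"no observed values in column {column!r}")

    # Numerical column: use the mean. Categorical column: use the mode.
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in observed
    )
    fill = mean(observed) if is_numeric else Counter(observed).most_common(1)[0][0]

    filled = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        value = row.get(column)
        new_row[output_column] = fill if is_missing(value) else value
        filled.append(new_row)
    return filled


# Made-up sample data with one blank categorical and one null numerical entry.
rows = [
    {"city": "Seattle", "temp": 10.0},
    {"city": "", "temp": 14.0},
    {"city": "Seattle", "temp": None},
    {"city": "Austin", "temp": 24.0},
]

with_temp = fill_missing(rows, "temp", "temp_filled")
with_city = fill_missing(rows, "city", "city_filled")
```

As in the transform described above, the original column is left as it was and the completed values land in a separate appended column, so downstream jobs can compare the two or choose which one to query.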

The Fill Missing Values transform is available in the same AWS Regions as AWS Glue.

To learn more about this feature, visit our reference documentation and AWS Glue Studio documentation.