Posted On: Jan 11, 2022

You can now use Amazon Redshift Spectrum to specify custom data validation rules for your external tables when querying the Amazon S3 data lake. This enhancement allows you to control how Redshift Spectrum processes data containing unexpected values such as unsupported UTF-8 characters or numeric overflow in your external tables.

Amazon Redshift Spectrum already provides built-in rules to handle unexpected values in your data. For example, Redshift Spectrum sets a column’s value to null when the column contains any unsupported special character, and truncates the column’s value when it is wider than the defined column width. You can now override these built-in rules and specify whether Redshift Spectrum replaces the unexpected character, fails the query, or ignores the row when it encounters such data.
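As a rough illustration of overriding the built-in rules, the sketch below renders an ALTER TABLE statement that sets data handling table properties on an external table. The property names and values (`data_cleansing_enabled`, `invalid_char_handling`, `replacement_char`, `numeric_overflow_handling`, `surplus_char_handling`) reflect our reading of the CREATE EXTERNAL TABLE documentation and should be verified in the Developer Guide. The helper `build_data_handling_sql` and the table name `spectrum_schema.sales` are hypothetical, introduced only for this example.

```python
from typing import Mapping

# Sketch only: this renders SQL text and does not connect to Redshift.
# Property names and values are assumptions based on the CREATE EXTERNAL
# TABLE documentation; confirm them before running the statement.


def build_data_handling_sql(table: str, properties: Mapping[str, str]) -> str:
    """Render an ALTER TABLE ... SET TABLE PROPERTIES statement."""
    if not properties:
        raise ValueError("at least one table property is required")
    # Quote each name/value pair, doubling any single quote inside a value.
    rendered = ", ".join(
        "'{}'='{}'".format(name, value.replace("'", "''"))
        for name, value in properties.items()
    )
    return f"ALTER TABLE {table} SET TABLE PROPERTIES ({rendered});"


# Hypothetical schema and table names.
statement = build_data_handling_sql(
    "spectrum_schema.sales",
    {
        "data_cleansing_enabled": "true",     # turn on custom handling
        "invalid_char_handling": "REPLACE",   # replace the unexpected character
        "replacement_char": "?",              # character used by REPLACE
        "numeric_overflow_handling": "FAIL",  # fail the query
        "surplus_char_handling": "DROP_ROW",  # ignore the row
    },
)
print(statement)
```

The three behaviors named above (replace, fail, ignore the row) map to the `REPLACE`, `FAIL`, and `DROP_ROW` values in this sketch. The same properties can presumably be supplied in the TABLE PROPERTIES clause when the external table is first created.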

To learn more, see "Setting data handling options with Redshift Spectrum" in the Amazon Redshift Database Developer Guide.