How do I remove invalid characters from my Amazon Redshift data?

Last updated: 2020-02-10

There are invalid characters in my Amazon Redshift data. How do I remove them?

Short Description

If your data contains non-printable ASCII characters, such as null, bell, or escape characters, you might have trouble retrieving the data or unloading the data to Amazon Simple Storage Service (Amazon S3). For example, a string that contains a null terminator, such as "abc\0def," is truncated at the null terminator, resulting in incomplete data.

Resolution

Use the TRANSLATE function to remove the invalid character. In the following example, the data contains "abc\u0000def". The TRANSLATE function removes the null character "\u0000" and replaces it with an empty value, which removes it from the string:

admin@testrs=# select a,translate(a,chr(0),'') from invalidstring;
   a    | translate 
--------+-----------
 abc    | abcdef
 abcdef | abcdef
(2 rows)

To remove specified invalid characters from all rows in a table, run the UPDATE command with the TRANSLATE function, as shown in the following example:

admin@testrs=# select * from invalidstring;
   a    
--------
 abc
 abcdef
(2 rows)

admin@testrs=# update invalidstring set a=translate(a,chr(0),'') where a ilike '%'||chr(0)||'%';
UPDATE 1
 
admin@testrs=# select * from invalidstring;
   a    
--------
 abcdef
 abcdef
(2 rows)

Did this article help you?

Anything we could improve?


Need more help?