How can I troubleshoot a failed Amazon Redshift data load?

Last updated: 2019-07-17

How can I troubleshoot a failed Amazon Redshift data load?

Short Description

Use the STL_LOAD_ERRORS table to identify errors that occurred during specific loads.

Resolution

Note: The following steps use an example dataset of cities and venues.

1.     Create a raw file named venue_pipe0000_part_00 in an Amazon Simple Storage Service (Amazon S3) bucket with the following content. For more information, see Load LISTING from a Pipe-Delimited File (Default Delimiter).

7|BMO Field|Toronto|ON|0
16|TD Garden|Boston|MA|0
23|The Palace of Auburn Hills|Auburn Hills|MI|0
28|American Airlines Arena|Miami|FL|0
37|Staples Center|Los Angeles|CA|0
42|FedExForum|Memphis|TN|0
52|PNC Arena|Raleigh|NC  ,25   |0
59|Scotiabank Saddledome|Calgary|AB|0
66|SAP Center|San Jose|CA|0
73|Heinz Field|Pittsburgh|PA|65050

2.     Create the table master_schema.venue1 that is similar to the following example:

CREATE TABLE VENUE1(
VENUEID SMALLINT,
VENUENAME VARCHAR(100),
VENUECITY VARCHAR(30),
VENUESTATE CHAR(2),
VENUESEATS INTEGER
) DISTSTYLE EVEN;

3.     Create a load view to diagnose the load failure:

create view loadview as
(select distinct tbl, trim(name) as table_name, query, starttime,
trim(filename) as input, line_number, colname, err_code,
trim(err_reason) as reason
from stl_load_errors sl, stv_tbl_perm sp
where sl.tbl = sp.id);

4.     Use the Copy command to load data. Replace these values in the following example:
your_S3_bucket: the name of your S3 bucket
arn:aws:iam::123456789012:role/redshiftcopyfroms3: the Amazon Resource Name (ARN) for an AWS Identity and Access Management (IAM) role that your cluster uses for authentication and authorization. For more information, see Parameters.

master=# copy venue1
master-# from 's3://your_S3_bucket/venue/'
master-# iam_role 'arn:aws:iam::123456789012:role/redshiftcopyfroms3' delimiter '|' ;

5.     Query the load view for the load details about the table:

master=# select * from loadview where table_name='venue1';
tbl | 265190
table_name | venue1
query | 5790
starttime | 2017-07-03 11:54:22.864584
input | s3://your_S3_bucket/venue/venue_pipe0000_part_00
line_number | 7
colname | venuestate
err_code | 1204
reason | Char length exceeds DDL length

The output from this query includes the following important information:

  • The file causing the error.
  • The column causing the error.
  • The line number in the input file.
  • The reason for the exception.

In the example, the exception is caused by the length value, which must be added to the venuestate column. This value (NC  ,25   |) is longer than the length defined in the DDL VENUESTATE CHAR(2). To resolve this, modify the file to use the correct value:

7|BMO Field|Toronto|ON|0
16|TD Garden|Boston|MA|0
23|The Palace of Auburn Hills|Auburn Hills|MI|0
28|American Airlines Arena|Miami|FL|0
37|Staples Center|Los Angeles|CA|0
42|FedExForum|Memphis|TN|0
52|PNC Arena|Raleigh|NC|0
59|Scotiabank Saddledome|Calgary|AB|0
66|SAP Center|San Jose|CA|0
73|Heinz Field|Pittsburgh|PA|65050

6.     Reload the data load:

master=# copy venue1
from 's3://your_S3_bucket/sales/'
iam_role 'arn:aws:iam::123456789012:role/redshiftcopyfroms3' delimiter '|' ;
INFO:  Load into table 'venue1' completed, 808 record(s) loaded successfully.

Did this article help you?

Anything we could improve?


Need more help?