How do I troubleshoot issues with UNLOAD in Amazon Redshift?

Last updated: 2022-10-07

I'm trying to unload data from my Amazon Redshift cluster to Amazon Simple Storage Service (Amazon S3). However, I'm getting an error. How do I troubleshoot this?

Short description

When unloading data from your Amazon Redshift cluster to your Amazon S3 bucket, you might encounter the following errors:

DB user is not authorized to assume the AWS Identity and Access Management (IAM) Role error

error:  User arn:aws:redshift:us-west-2:<account-id>:dbuser:<cluster-identifier>/<dbuser username> is not authorized to assume IAM Role arn:aws:iam::<account-id>:role/<Role name>

403 Access Denied error

[Amazon](500310) Invalid operation: S3ServiceException:Access Denied,Status 403,Error AccessDenied,

Note: Use the UNLOAD command with a SELECT statement when unloading data to your S3 bucket. Unload the text data in either a delimited or fixed-width format, regardless of the format used when the data was loaded. You can also specify whether to create compressed GZIP files.
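For example, the following command (the bucket, prefix, and role ARN are placeholders) unloads pipe-delimited, GZIP-compressed files:

unload ('select * from test_unload')
to 's3://testbucket/unload/test_unload_file1'
iam_role 'arn:aws:iam::0123456789:role/redshift_role'
delimiter '|'
gzip;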

Specified unload destination on S3 is not empty

ERROR: Specified unload destination on S3 is not empty. Consider using a different bucket / prefix, manually removing the target files in S3, or using the ALLOWOVERWRITE option.

This error might occur when you run the same UNLOAD command again, unloading files to a folder that already contains data files.

For example, you get this error if you run the following command twice:

unload ('select * from test_unload')   
to 's3://testbucket/unload/test_unload_file1' 
iam_role 'arn:aws:iam::0123456789:role/redshift_role';

Resolution

DB user is not authorized to assume the AWS IAM Role error

If the database user isn't authorized to assume the IAM role, then check the following:

  • The IAM role is associated with your Amazon Redshift cluster
  • The role's trust policy allows the redshift.amazonaws.com service principal to assume the role
  • The IAM role ARN specified in the UNLOAD command matches the ARN of the associated role, with no typos
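As a reference, the trust policy on the IAM role must allow Amazon Redshift to assume it. A minimal sketch looks like the following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "redshift.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}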

403 Access Denied error

If you receive a 403 Access Denied error from your S3 bucket, confirm that the proper permissions are granted for your S3 API operations:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:Get*",       
        "s3:List*",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::your_bucket",
        "arn:aws:s3:::your_bucket/*"
      ]
    }
  ]
}

If you're using server-side encryption with S3-managed encryption keys, then your S3 bucket encrypts each of its objects with a unique key. As an additional safeguard, the key itself is also encrypted with a root key that is regularly rotated.

To encrypt the unloaded data at rest, verify that your S3 bucket policy requires the s3:x-amz-server-side-encryption header on uploads. A Null condition such as the following, used in a Deny statement, rejects PutObject requests that don't include the header:

"Condition": {
    "Null": {
        "s3:x-amz-server-side-encryption": "true"
    }
}
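This condition is typically placed inside a Deny statement in the bucket policy. A minimal sketch of such a statement (the bucket name is a placeholder):

{
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::your_bucket/*",
    "Condition": {
        "Null": {
            "s3:x-amz-server-side-encryption": "true"
        }
    }
}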

To UNLOAD encrypted data to your S3 bucket, specify your KMS key with the KMS_KEY_ID option together with the ENCRYPTED option:

unload ('select * from a')
to 's3://mybucket/test/varchar/'
iam_role 'arn:aws:iam::xxxxxx:role/RedshiftS3Access'
ALLOWOVERWRITE CSV
KMS_KEY_ID '11111111111111111111111111'
encrypted;

If your S3 bucket is encrypted with a root key from AWS Key Management Service (AWS KMS), then add these permissions to your IAM policy:

"Action": [
    "kms:Encrypt",
    "kms:Decrypt",
    "kms:ReEncrypt*",
    "kms:GenerateDataKey*",
    "kms:DescribeKey"
]

Specified unload destination on S3 is not empty

The "Specified unload destination on S3 is not empty" error occurs when you try to UNLOAD files to a folder that already contains files with the same name.

Choose one of the following options to resolve the error message:

  • Create a new Amazon S3 bucket or use a different S3 bucket
  • Change the S3 key prefix
  • Clean up the files present in the destination Amazon S3 bucket
  • Use the ALLOWOVERWRITE option with the UNLOAD command

Create a new Amazon S3 bucket or use a different S3 bucket

1.    Create a new S3 bucket or identify a different existing S3 bucket.

2.    Update the bucket policy of your new or existing S3 bucket. Add a permissions policy similar to the following example:

{
    "Version": "2012-10-17",
    "Id": "ExamplePolicy01",
    "Statement": [
        {
            "Sid": "ExampleStatement01",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::0123456789:user/<UserName>"
            },
            "Action": [
                "s3:GetObject",
                "s3:GetBucketLocation",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::testbucket/*",
                "arn:aws:s3:::testbucket",
                "arn:aws:s3:::testbucket2/*",  --> Additional permission for new bucket
                "arn:aws:s3:::testbucket2"       
            ]
        }
    ]
}

3.    Use the new bucket path in the UNLOAD command. In the following example, s3://testbucket2/unload/test_unload_file1 is the new bucket path:

unload ('select * from test_unload')
to 's3://testbucket2/unload/test_unload_file1'
iam_role 'arn:aws:iam::0123456789:role/redshift_role';

Change the S3 key prefix

Change the S3 key prefix in the S3 path when you use the UNLOAD command. In the following example, unload2 is the changed prefix:

unload ('select * from test_unload')
to 's3://testbucket/unload2/test_unload_file1'
iam_role 'arn:aws:iam::0123456789:role/redshift_role';

For more information, see Organizing objects using prefixes.

Clean up the files present in the destination Amazon S3 bucket

Use the CLEANPATH option to remove existing files located in the Amazon S3 path specified in the TO clause before unloading files to the specified location. If you include the PARTITION BY clause, existing files are removed from only the partition folders. This allows the Amazon S3 bucket to receive new files generated by the UNLOAD operation.

Important: Files removed using the CLEANPATH option are permanently deleted and can't be recovered.

To use the CLEANPATH option, you must have the s3:DeleteObject permission on the Amazon S3 bucket. The following example policy must be added as a permissions policy:

{
    "Version": "2012-10-17",
    "Id": "ExamplePolicy01",
    "Statement": [
        {
            "Sid": "ExampleStatement01",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::0123456789:user/<UserName>"
            },
            "Action": [
                "s3:GetObject",
                "s3:GetBucketLocation",
                "s3:ListBucket",
                "s3:DeleteObject"  --> Additional Action added for delete object
            ],
            "Resource": [
                "arn:aws:s3:::testbucket/*",
                "arn:aws:s3:::testbucket"
            ]
        }
    ]
}

Note: You can't specify the CLEANPATH option if you specify the ALLOWOVERWRITE option.
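For example, assuming the same table and bucket as in the earlier examples, an UNLOAD with the CLEANPATH option looks like the following:

unload ('select * from test_unload')
to 's3://testbucket/unload/test_unload_file1'
iam_role 'arn:aws:iam::0123456789:role/redshift_role'
cleanpath;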

Use the ALLOWOVERWRITE option with the UNLOAD command

By default, the UNLOAD command fails if it finds files that it would overwrite. If ALLOWOVERWRITE is specified, UNLOAD overwrites existing files, including the manifest file. In the following example, the previous files are overwritten and the bucket is updated with the new records:

unload ('select * from test_unload')
to 's3://testbucket/unload/test_unload_file1'
iam_role 'arn:aws:iam::0123456789:role/redshift_role' allowoverwrite;

To use the ALLOWOVERWRITE option, you must have the s3:PutObject permission on the Amazon S3 bucket. The following example policy must be added as a permissions policy:

{
    "Version": "2012-10-17",
    "Id": "ExamplePolicy01",
    "Statement": [
        {
            "Sid": "ExampleStatement01",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::0123456789:user/<UserName>"
            },
            "Action": [
                "s3:GetObject",
                "s3:GetBucketLocation",
                "s3:ListBucket",
                "s3:PutObject"  --> Additional Action added for overwriting objects
            ],
            "Resource": [
                "arn:aws:s3:::testbucket/*",
                "arn:aws:s3:::testbucket"
            ]
        }
    ]
}