How can I create Amazon Redshift Spectrum cross-account access to AWS Glue and Amazon S3?

Last updated: 2020-06-18

I want to use Amazon Redshift Spectrum to access AWS Glue and Amazon Simple Storage Service (Amazon S3) in a different AWS account that's within the same AWS Region. How do I do that?

Short description

To access AWS resources that are in a different account from Amazon Redshift Spectrum, perform the following steps:

1.    In Account 1, create an AWS Identity and Access Management (IAM) role with a trust relationship to Amazon Redshift. This role should be attached to the Amazon Redshift cluster.

2.    In Account 2, create another role with access to both AWS Glue and Amazon S3. This role edits the trust relationship and allows the Amazon Redshift account to assume this role.

3.    In Account 1, edit the trust relationship for the existing role so that it assumes the role created in the AWS Glue and Amazon S3 account.

Resolution

Account 1: Create an AWS Identity and Access Management (IAM) role with a trust relationship to Amazon Redshift

1.    Open the IAM console.

2.    Choose Roles from the navigation pane, and then choose Create role.

3.    For Select type of trusted entity, choose AWS service.

4.    Choose Redshift.

5.    For Select your use case, choose Redshift - Customizable.

6.    Choose Next: Permissions, Next: Tags, and then Next: Review. You don't need to add policies or tags.

7.    Enter a Role name, and then choose Create role. In the following examples, we'll refer to this as redshift_role1.

8.    Associate redshift_role1 with the Amazon Redshift cluster. This association allows your cluster to assume the newly created role to access Amazon S3, Amazon Athena, and AWS Glue.

Account 2: Create another role with access to AWS Glue and Amazon S3

1.    Open the IAM console.

2.    Choose Policies, and then choose Create policy.

3.    Choose the JSON tab, and then enter an IAM policy similar to the following. Replace your_bucket with the name of the S3 bucket that you want to access with Amazon Redshift Spectrum.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListMultipartUploadParts",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": [
        "arn:aws:s3:::your_bucket",
        "arn:aws:s3:::your_bucket/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "glue:CreateDatabase",
        "glue:DeleteDatabase",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:UpdateDatabase",
        "glue:CreateTable",
        "glue:DeleteTable",
        "glue:BatchDeleteTable",
        "glue:UpdateTable",
        "glue:GetTable",
        "glue:GetTables",
        "glue:BatchCreatePartition",
        "glue:CreatePartition",
        "glue:DeletePartition",
        "glue:BatchDeletePartition",
        "glue:UpdatePartition",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:BatchGetPartition"
      ],
      "Resource": "*"
    }
  ]
}

Note: If the files in your S3 bucket are encrypted, be sure to grant the proper permissions to Amazon Redshift.

4.    Choose Review policy.

5.    Enter a Name for the policy, and then choose Create policy.

6.    From the navigation pane, choose Roles, and then choose Create role.

7.    For Select type of trusted entity, choose Another AWS account.

8.    For Account ID, enter the ID of the account that is using Amazon Redshift. For more information, see Finding your AWS account ID.

9.    Choose Next: Permissions.

10.   In the list of policies, select the check box next to the name of the policy that you created.

11.   Choose Next: Tags and then choose Next: Review. You don't need to add tags.

12.   Enter a Role name, and then choose Create role. In the following examples, we'll refer to this as glue_s3_role2.

13.   From the navigation pane, choose Roles.

14.   Select the name of the role that you created, and then choose the Trust Relationships tab.

15.   Choose Edit Trust Relationship.

16.   Delete the existing policy and then replace it with a policy similar to the following. Replace redshift_account1 with the ID of the account that is using Amazon Redshift. Replace redshift_role1 with the name of the first role that you created.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::redshift_account1:role/redshift_role1"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

17.   Choose Update Trust Policy.

Account 1: Edit the trust relationship for the role created in Amazon Redshift

1.    Open the IAM console.

2.    Choose Roles from the navigation pane, and then choose the name of the first role (redshift_role1) that you created earlier.

3.    Choose Add inline policy.

4.    Choose the JSON tab. Delete the existing policy and then enter an IAM policy similar to the following. Replace glue_s3_account2 with the ID of the account that is using AWS Glue and Amazon S3. Replace glue_s3_role2 with the name of the second role that you created:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1487639602000",
      "Effect": "Allow",
      "Action": [
        "sts:AssumeRole"
      ],
      "Resource": "arn:aws:iam::glue_s3_account2:role/glue_s3_role2"
    }
  ]
}

5.    Choose Review policy.

6.    Enter a name for the policy, and then choose Create policy.

7.    Connect to the Amazon Redshift cluster.

8.    Create an external schema using the roles that you created in both accounts, as shown in the following example:

create external schema spectrum_schema
from data catalog
database 'your_db'
iam_role 'arn:aws:iam::redshift_account1:role/redshift_role1,arn:aws:iam::glue_s3_account2:role/glue_s3_role2'

Replace these values:
your_db: the name of your database in AWS Glue.
redshift_account1: the ID of the Amazon Redshift account.
redshift_role1: the name of the role that you created in the Amazon Redshift account.
glue_s3_account2: the ID of the AWS Glue and Amazon S3 account.
glue_s3_role2: the name of the role that you created in the AWS Glue and Amazon S3 account.

You can now query AWS Glue tables in glue_s3_account2 using Amazon Redshift Spectrum from your Amazon Redshift cluster in redshift_account1, as long as all resources are in the same region. You don't need to recreate your external tables because Redshift Spectrum can access your existing AWS Glue tables.

The query in the following example returns the number of rows in an AWS Glue table created in the external schema:

select count(*) from spectrum_schema.glue_table;

Note: Replace spectrum_schema and glue_table with the name of your schema and AWS Glue table, respectively.


Did this article help you?

Anything we could improve?


Need more help?