Why does my Amazon EMR application fail with an HTTP 403 "Access Denied" AmazonS3Exception?

When I submit an application to an Amazon EMR cluster, the application fails with an HTTP 403 "Access Denied" AmazonS3Exception.

Resolution

If permissions aren't configured correctly, then Amazon Simple Storage Service (Amazon S3) requests from your Amazon EMR application fail with an "Access Denied" error similar to the following message:

java.io.IOException: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 8B28722038047BAA; S3 Extended Request ID: puwS77OKgMrvjd30/EY4CWlC/AuhOOSNsxfI8xQJXMd20c7sCq4ljjVKsX4AwS7iuo92C9m+GWY=), S3 Extended Request ID: puwS77OKgMrvjd30/EY4CWlC/AuhOOSNsxfI8xQJXMd20c7sCq4ljjVKsX4AwS7iuo92C9m+GWY=

First, check the credentials or role specified in your application code

Run the following command on the EMR cluster's master node. Replace s3://doc-example-bucket/abc/ with your Amazon S3 path.

aws s3 ls s3://doc-example-bucket/abc/

If this command is successful, then the credentials or role specified in your application code are causing the "Access Denied" error. Confirm that your application uses the expected credentials or assumes the expected role, and that those credentials or that role have access to the Amazon S3 path. To verify, assume the AWS Identity and Access Management (IAM) role with the AWS CLI, and then perform a sample request against the S3 path.
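For example, the following commands assume the role that the application uses and repeat the sample request with the resulting temporary credentials. This is a minimal sketch: the role ARN, account ID, and S3 path are placeholders.

# Assume the role that the application uses and capture the temporary credentials.
CREDS=$(aws sts assume-role \
  --role-arn arn:aws:iam::111122223333:role/emr-demo-application-role \
  --role-session-name s3-access-check \
  --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
  --output text)

# Export the temporary credentials so that the next AWS CLI call uses them.
export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | awk '{print $1}')
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | awk '{print $2}')
export AWS_SESSION_TOKEN=$(echo "$CREDS" | awk '{print $3}')

# Repeat the sample request against the same S3 path.
aws s3 ls s3://doc-example-bucket/abc/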

If this command fails, confirm that you're using the most recent version of the AWS Command Line Interface (AWS CLI). Then, check the following to resolve the "Access Denied" error:

Check the policy for the Amazon EC2 instance profile role

If the Amazon Elastic Compute Cloud (Amazon EC2) instance profile doesn’t have the required read and write permissions on the S3 buckets, you might get the “Access Denied” error.

Note: By default, applications inherit Amazon S3 access from the IAM role for the Amazon EC2 instance profile. Be sure that the IAM policies attached to this role allow the required S3 operations on the source and destination buckets.

To troubleshoot this issue, check if you have the required read permission by running the following command:

$ aws s3 ls s3://doc-example-bucket/myfolder/

Your output might look like the following:

An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

-or-

Run the following command:

$ hdfs dfs -ls s3://doc-example-bucket/myfolder

Your output might look like the following:

ls: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: RBT41F8SVAZ9F90B; S3 Extended Request ID: ih/UlagUkUxe/ty7iq508hYVfRVqo+pB6/xEVr5WHuvcIlfQnFf33zGTAaoP2i7cAb1ZPIWQ6Cc=; Proxy: null), S3 Extended Request ID: ih/UlagUkUxe/ty7iq508hYVfRVqo+pB6/xEVr5WHuvcIlfQnFf33zGTAaoP2i7cAb1ZPIWQ6Cc=

Be sure that the instance profile role has the required read and write permissions for the S3 buckets. For example, the S3 actions in the following IAM policy provide the required read and write access to the S3 bucket doc-example-bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListObjectsInBucket",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::doc-example-bucket"
      ]
    },
    {
      "Sid": "AllObjectActions",
      "Effect": "Allow",
      "Action": "s3:*Object*",
      "Resource": [
        "arn:aws:s3:::doc-example-bucket/*"
      ]
    }
  ]
}
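If you're not sure which role the instance profile uses, or which policies are attached to it, commands similar to the following can help. The cluster ID and role name are placeholders, and EMR_EC2_DefaultRole is only the default role name:

# Find the instance profile that the cluster's EC2 instances use.
aws emr describe-cluster --cluster-id j-XXXXXXXXXXXXX \
  --query 'Cluster.Ec2InstanceAttributes.IamInstanceProfile'

# List the managed and inline policies attached to that role.
aws iam list-attached-role-policies --role-name EMR_EC2_DefaultRole
aws iam list-role-policies --role-name EMR_EC2_DefaultRole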

Check the IAM role for the EMRFS role mapping

If you use a security configuration to specify IAM roles for EMRFS, then you’re using role mapping. Your application inherits the S3 permissions from the IAM role based on the role-mapping configuration.

The IAM policy attached to these roles must have the required S3 permissions on the source and destination buckets. To specify IAM roles for EMRFS requests to Amazon S3, see Set up a security configuration with IAM roles for EMRFS.
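As a reference point, a security configuration that maps an IAM role to an S3 prefix looks similar to the following sketch. The role ARN, account ID, and prefix are placeholders:

{
  "AuthorizationConfiguration": {
    "EmrFsConfiguration": {
      "RoleMappings": [
        {
          "Role": "arn:aws:iam::111122223333:role/allow-emrfs-access-to-doc-example-bucket",
          "IdentifierType": "Prefix",
          "Identifiers": [
            "s3://doc-example-bucket/myfolder/"
          ]
        }
      ]
    }
  }
}

You can create the security configuration with the aws emr create-security-configuration command and then reference it when you launch the cluster.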

Check the Amazon S3 VPC endpoint policy

If the EMR cluster's subnet route table has a route to an Amazon S3 VPC endpoint, then confirm that the endpoint policy allows the required Amazon S3 operations.

To check and modify the endpoint policy by using the AWS CLI:

Run the following command to review the endpoint policy. Replace vpce-xxxxxxxx with your VPC endpoint ID.

aws ec2 describe-vpc-endpoints --vpc-endpoint-ids "vpce-xxxxxxxx"

If necessary, run the following command to upload a modified endpoint policy. Replace the VPC endpoint ID and the JSON file path.

aws ec2 modify-vpc-endpoint --vpc-endpoint-id "vpce-xxxxxxxx" --policy-document file://policy.json

To check and modify the endpoint policy using the Amazon VPC console:

  1. Open the Amazon VPC console.
  2. In the navigation pane, choose Endpoints.
  3. Select the Amazon S3 endpoint (the one that's on the EMR cluster's subnet route table). Then, choose the Policy tab to review the endpoint policy.
  4. To add the required Amazon S3 actions, choose Edit Policy.
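For reference, a restrictive endpoint policy that still allows the required bucket operations might look similar to the following sketch. The bucket name is a placeholder, and the default endpoint policy (full access) also works. If you restrict the policy, keep in mind that the cluster might also need access to other buckets, such as the bucket that stores EMR logs:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::doc-example-bucket",
        "arn:aws:s3:::doc-example-bucket/*"
      ]
    }
  ]
}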

Check the S3 source and destination bucket policies

Bucket policies specify the actions that are allowed or denied for principals. The source and destination bucket policies must allow the EC2 instance profile role or the mapped IAM role to perform the required Amazon S3 operations.

To check and modify the bucket policies by using the AWS CLI:

Run the following command to review a bucket policy. Replace doc-example-bucket with the name of the source or destination bucket.

aws s3api get-bucket-policy --bucket doc-example-bucket

If necessary, run the following command to upload a modified bucket policy. Replace the bucket name and JSON file path.

aws s3api put-bucket-policy --bucket doc-example-bucket --policy file://policy.json

To check and modify the bucket policies using the Amazon S3 console:

  1. Open the Amazon S3 console.
  2. Choose the bucket.
  3. Choose the Permissions tab.
  4. Choose Bucket Policy to review and modify the bucket policy.
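For example, a bucket policy statement that allows the cluster's instance profile role to list the bucket and work with its objects might look similar to the following sketch. The account ID and role name are placeholders:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowEmrInstanceProfileRole",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/EMR_EC2_DefaultRole"
      },
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::doc-example-bucket",
        "arn:aws:s3:::doc-example-bucket/*"
      ]
    }
  ]
}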

Accessing S3 buckets in another account

Important: If your application accesses an S3 bucket that belongs to another AWS account, then the account owner must allow your IAM role on the bucket policy.

For example, the following bucket policy gives all IAM roles and users in emr-account full access to s3://doc-example-bucket/myfolder/. Replace emr-account with the AWS account ID of the account that owns the EMR cluster.

{
  "Id": "MyCustomPolicy",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowRootAndHomeListingOfCompanyBucket",
      "Principal": {
        "AWS": [
          "arn:aws:iam::emr-account:root"
        ]
      },
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::doc-example-bucket"
      ],
      "Condition": {
        "StringEquals": {
          "s3:prefix": [
            "",
            "myfolder/"
          ],
          "s3:delimiter": [
            "/"
          ]
        }
      }
    },
    {
      "Sid": "AllowListingOfUserFolder",
      "Principal": {
        "AWS": [
          "arn:aws:iam::emr-account:root"
        ]
      },
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::doc-example-bucket"
      ],
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "myfolder/*"
          ]
        }
      }
    },
    {
      "Sid": "AllowAllS3ActionsInUserFolder",
      "Principal": {
        "AWS": [
          "arn:aws:iam::emr-account:root"
        ]
      },
      "Effect": "Allow",
      "Action": [
        "s3:*"
      ],
      "Resource": [
        "arn:aws:s3:::doc-example-bucket/myfolder/*",
        "arn:aws:s3:::doc-example-bucket/myfolder*"
      ]
    }
  ]
}

Related information

Why does my Spark or Hive job on Amazon EMR fail with an HTTP 503 "Slow Down" AmazonS3Exception?

Why does my Amazon EMR application fail with an HTTP 404 "Not Found" AmazonS3Exception?

Error responses

How do I troubleshoot 403 Access Denied errors from Amazon S3?
