Why does my Amazon EMR application fail with an HTTP 403 "Access Denied" AmazonS3Exception?

Last updated: 2021-12-20

When I submit an application to an Amazon EMR cluster, the application fails with an HTTP 403 "Access Denied" AmazonS3Exception:

java.io.IOException: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 8B28722038047BAA; S3 Extended Request ID: puwS77OKgMrvjd30/EY4CWlC/AuhOOSNsxfI8xQJXMd20c7sCq4ljjVKsX4AwS7iuo92C9m+GWY=), S3 Extended Request ID: puwS77OKgMrvjd30/EY4CWlC/AuhOOSNsxfI8xQJXMd20c7sCq4ljjVKsX4AwS7iuo92C9m+GWY=

Short description

This error indicates that the application tried to perform an Amazon Simple Storage Service (Amazon S3) operation that failed because of a problem with one of the following:

  • The credentials or role specified in your application code
  • The policy attached to the Amazon Elastic Compute Cloud (Amazon EC2) instance profile role
  • The AWS Identity and Access Management (IAM) role for the EMR File System (EMRFS)
  • The Amazon S3 VPC endpoint policy
  • The Amazon S3 source and destination bucket policies

Resolution

Run the following command on the EMR cluster's master node. Replace s3://doc-example-bucket/abc/ with your Amazon S3 path.

aws s3 ls s3://doc-example-bucket/abc/

If this command is successful, then the credentials or role specified in your application code are causing the "Access Denied" error. The credentials or role must have access to the Amazon S3 path.

If this command fails, confirm that you're using the most recent version of the AWS Command Line Interface (AWS CLI). Then, check the following to resolve the "Access Denied" error:

Check the policy for the Amazon EC2 instance profile role

By default, applications inherit Amazon S3 access from the AWS Identity and Access Management (IAM) role for the Amazon EC2 instance profile. Be sure that the IAM policies attached to this role allow the required S3 operations on the source and destination buckets. You might get the "Access Denied" error if the EC2 instance profile (service role for cluster EC2 instances) doesn't have the required read and write permissions on the S3 buckets.

To troubleshoot this issue, check if you have the required read permission by running the following command:

aws s3 ls s3://doc-example-bucket/myfolder/

Your output might look like the following:

An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

-or-

Run the following command:

hdfs dfs -ls s3://doc-example-bucket/myfolder

Your output might look like the following:

ls: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: RBT41F8SVAZ9F90B; S3 Extended Request ID: ih/UlagUkUxe/ty7iq508hYVfRVqo+pB6/xEVr5WHuvcIlfQnFf33zGTAaoP2i7cAb1ZPIWQ6Cc=; Proxy: null), S3 Extended Request ID: ih/UlagUkUxe/ty7iq508hYVfRVqo+pB6/xEVr5WHuvcIlfQnFf33zGTAaoP2i7cAb1ZPIWQ6Cc=

Be sure that the instance profile role has the required read and write permissions for the S3 buckets. For example, the S3 actions in the following IAM policy provide the required read and write access to the S3 bucket doc-example-bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListObjectsInBucket",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::doc-example-bucket"
      ]
    },
    {
      "Sid": "AllObjectActions",
      "Effect": "Allow",
      "Action": "s3:*Object*",
      "Resource": [
        "arn:aws:s3:::doc-example-bucket/*"
      ]
    }
  ]
}
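To reason about whether a policy like the preceding one covers a particular request, the following Python sketch matches an action and a resource ARN against the Allow statements of a policy document. It is a simplified illustration, not the IAM evaluation logic: it ignores explicit Deny statements, conditions, NotAction and NotResource elements, and every other policy type (bucket policies, endpoint policies, permissions boundaries). The function names are hypothetical.

```python
import json
import re

# The example instance profile policy from this article.
POLICY = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListObjectsInBucket",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::doc-example-bucket"]
    },
    {
      "Sid": "AllObjectActions",
      "Effect": "Allow",
      "Action": "s3:*Object*",
      "Resource": ["arn:aws:s3:::doc-example-bucket/*"]
    }
  ]
}
""")


def _as_list(value):
    # Action and Resource may be a single string or a list of strings.
    return [value] if isinstance(value, str) else list(value)


def _wildcard_match(pattern, text, ignore_case=False):
    # Translate the * and ? wildcards into an anchored regular expression.
    regex = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.fullmatch(regex, text, flags) is not None


def allowing_sids(policy, action, resource):
    """Return the Sids of Allow statements that match the action and resource.

    Simplification: Deny statements and Condition blocks are not evaluated.
    """
    statements = policy["Statement"]
    if isinstance(statements, dict):
        statements = [statements]
    matches = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        action_ok = any(_wildcard_match(p, action, ignore_case=True)
                        for p in _as_list(stmt.get("Action", [])))
        resource_ok = any(_wildcard_match(p, resource)
                          for p in _as_list(stmt.get("Resource", [])))
        if action_ok and resource_ok:
            matches.append(stmt.get("Sid", "<no Sid>"))
    return matches


# Object reads are covered by the second statement.
print(allowing_sids(POLICY, "s3:GetObject",
                    "arn:aws:s3:::doc-example-bucket/abc/data.csv"))
# A bucket-level action that the policy does not list matches nothing.
print(allowing_sids(POLICY, "s3:GetBucketLocation",
                    "arn:aws:s3:::doc-example-bucket"))
```

An empty result for an action that your application needs suggests that the instance profile policy is missing that permission. A non-empty result does not prove access, because the other policies described in this article can still deny the request.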

Check the IAM role for the EMRFS role mapping

When you use a security configuration to specify IAM roles for EMRFS, you set up role mappings. Each role mapping specifies an IAM role that corresponds to identifiers. These identifiers determine the basis for access to Amazon S3 through EMRFS. The identifiers can be users, groups, or Amazon S3 prefixes that indicate a data location.

When EMRFS makes a request to Amazon S3, and the request matches the basis for access, then EMRFS makes the cluster EC2 instances assume the corresponding IAM role for the request. The IAM permissions attached to that role apply instead of the IAM permissions attached to the service role for cluster EC2 instances. Therefore, the IAM policies attached to the mapped roles must grant the required S3 permissions on the source and destination buckets.
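The following Python sketch illustrates how a role mapping might be selected for a request, so that you can work out which role's policy to inspect. It is a model for reasoning only, not EMRFS code. It assumes that mappings are checked in order with the first match winning, and that a request with no match falls back to the instance profile role. The role ARNs, user names, and bucket names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RoleMapping:
    role: str               # IAM role ARN to assume on a match
    identifier_type: str    # "User", "Group", or "Prefix"
    identifiers: List[str]  # user names, group names, or S3 prefixes


def select_role(mappings, user, groups, s3_path, instance_profile_role):
    """Return the role whose policy applies to this request (assumed model)."""
    for mapping in mappings:
        if mapping.identifier_type == "User":
            matched = user in mapping.identifiers
        elif mapping.identifier_type == "Group":
            matched = bool(set(groups) & set(mapping.identifiers))
        elif mapping.identifier_type == "Prefix":
            matched = any(s3_path.startswith(p) for p in mapping.identifiers)
        else:
            matched = False
        if matched:
            return mapping.role
    # Assumption: with no matching mapping, the instance profile role is used.
    return instance_profile_role


# Hypothetical mappings and role ARNs, for illustration only.
MAPPINGS = [
    RoleMapping("arn:aws:iam::111122223333:role/analyst-role",
                "User", ["analyst1"]),
    RoleMapping("arn:aws:iam::111122223333:role/raw-data-role",
                "Prefix", ["s3://doc-example-bucket/raw/"]),
]
DEFAULT_ROLE = "arn:aws:iam::111122223333:role/EMR_EC2_DefaultRole"

print(select_role(MAPPINGS, "hadoop", ["hadoop"],
                  "s3://doc-example-bucket/raw/2021/file.csv", DEFAULT_ROLE))
```

In this model, the request from the hadoop user to the raw/ prefix is evaluated against raw-data-role, so an "Access Denied" error for that request points to the policy on that role rather than to the instance profile policy.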

Check the Amazon S3 VPC endpoint policy

If the EMR cluster's subnet route table has a route to an Amazon S3 VPC endpoint, then confirm that the endpoint policy allows the required Amazon S3 operations. You can use the AWS CLI or Amazon VPC console to check and modify the endpoint policy.

AWS CLI:

Run the following command to review the endpoint policy. Replace vpce-xxxxxxxx with your VPC endpoint ID.

aws ec2 describe-vpc-endpoints --vpc-endpoint-ids "vpce-xxxxxxxx"

If necessary, run the following command to upload a modified endpoint policy. In the example, replace the VPC endpoint ID and JSON file path.

aws ec2 modify-vpc-endpoint --vpc-endpoint-id "vpce-xxxxxxxx" --policy-document file://policy.json
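The describe-vpc-endpoints output returns the endpoint policy as a JSON string in the PolicyDocument field, which is hard to read in raw form. The following Python sketch parses that field and lists the actions that the Allow statements name. The sample output is made up for illustration, and the function names are hypothetical. The sketch does not evaluate conditions, principals, or resources.

```python
import json

# Made-up sample shaped like the describe-vpc-endpoints output.
SAMPLE_OUTPUT = {
    "VpcEndpoints": [
        {
            "VpcEndpointId": "vpce-xxxxxxxx",
            "PolicyDocument": json.dumps({
                "Version": "2012-10-17",
                "Statement": [
                    {
                        "Effect": "Allow",
                        "Principal": "*",
                        "Action": ["s3:GetObject", "s3:ListBucket"],
                        "Resource": "*",
                    }
                ],
            }),
        }
    ]
}


def endpoint_policies(describe_output):
    """Map each endpoint ID to its parsed policy document (or None)."""
    policies = {}
    for endpoint in describe_output.get("VpcEndpoints", []):
        raw = endpoint.get("PolicyDocument")
        policies[endpoint["VpcEndpointId"]] = json.loads(raw) if raw else None
    return policies


def allowed_actions(policy):
    """Return the sorted action patterns named by Allow statements."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    actions = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        value = stmt.get("Action", [])
        actions.update([value] if isinstance(value, str) else value)
    return sorted(actions)


for endpoint_id, policy in endpoint_policies(SAMPLE_OUTPUT).items():
    print(endpoint_id, allowed_actions(policy))
```

To use the sketch with real data, save the command output to a file, load it with json.load, and pass the result to endpoint_policies. In the sample, the endpoint policy names only read and list actions, so a job that writes to S3 through this endpoint would be denied.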

Amazon VPC console:

  1. Open the Amazon VPC console.
  2. In the navigation pane, choose Endpoints.
  3. Select the Amazon S3 endpoint (the one that's on the EMR cluster's subnet route table). Then, choose the Policy tab to review the endpoint policy.
  4. To add the required Amazon S3 actions, choose Edit Policy.

Check the S3 bucket policies

Bucket policies specify the actions that are allowed or denied for principals. The bucket policies for the source and destination buckets must allow the EC2 instance profile role or the mapped IAM role to perform the required Amazon S3 operations. You can use the AWS CLI or Amazon S3 console to check and modify the bucket policies.

AWS CLI:

Run the following command to review a bucket policy. Replace doc-example-bucket with the name of the source or destination bucket.

aws s3api get-bucket-policy --bucket doc-example-bucket

If necessary, run the following command to upload a modified bucket policy. In the example, replace the bucket name and JSON file path.

aws s3api put-bucket-policy --bucket doc-example-bucket --policy file://policy.json

Note: If you receive errors when running AWS CLI commands, make sure that you’re using the most recent version of the AWS CLI.
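Because an explicit Deny in a bucket policy overrides any Allow, it can help to list the Deny statements first. The get-bucket-policy output returns the policy as a JSON string in the Policy field. The following Python sketch parses that field and returns the Deny statements. The sample policy, its Sid values, and the function name are hypothetical.

```python
import json

# Made-up sample shaped like the get-bucket-policy output.
SAMPLE_OUTPUT = {
    "Policy": json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowEmrRole",
                "Effect": "Allow",
                "Principal": {"AWS": "arn:aws:iam::111122223333:role/EMR_EC2_DefaultRole"},
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::doc-example-bucket/*",
            },
            {
                "Sid": "DenyOutsideEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": "arn:aws:s3:::doc-example-bucket/*",
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": "vpce-xxxxxxxx"}
                },
            },
        ],
    })
}


def deny_statements(get_bucket_policy_output):
    """Return the statements whose Effect is Deny."""
    policy = json.loads(get_bucket_policy_output["Policy"])
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    return [stmt for stmt in statements if stmt.get("Effect") == "Deny"]


for stmt in deny_statements(SAMPLE_OUTPUT):
    # Print each Deny statement with its conditions for manual review.
    print(stmt.get("Sid"), stmt.get("Action"), stmt.get("Condition"))
```

In the sample, the Deny statement applies to every request that does not arrive through the named VPC endpoint, so a cluster in a subnet without a route to that endpoint would receive "Access Denied" even though its role is allowed.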

Amazon S3 console:

  1. Open the Amazon S3 console.
  2. Choose the bucket.
  3. Choose the Permissions tab.
  4. Choose Bucket Policy to review and modify the bucket policy.

Accessing S3 buckets in another account

If your application accesses an S3 bucket that belongs to another AWS account, then the bucket owner must grant your IAM role access in the bucket policy. The IAM policy attached to your role must also allow the same S3 actions. For example, the following bucket policy gives all IAM roles and users in emr-account full access to s3://doc-example-bucket/myfolder/.

{
  "Id": "MyCustomPolicy",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowRootAndHomeListingOfCompanyBucket",
      "Principal": {
        "AWS": [
          "arn:aws:iam::emr-account:root"
        ]
      },
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::doc-example-bucket"
      ],
      "Condition": {
        "StringEquals": {
          "s3:prefix": [
            "",
            "myfolder/"
          ],
          "s3:delimiter": [
            "/"
          ]
        }
      }
    },
    {
      "Sid": "AllowListingOfUserFolder",
      "Principal": {
        "AWS": [
          "arn:aws:iam::emr-account:root"
        ]
      },
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::doc-example-bucket"
      ],
      "Condition": {
        "StringLike": {
          "s3:prefix": [
            "myfolder/*"
          ]
        }
      }
    },
    {
      "Sid": "AllowAllS3ActionsInUserFolder",
      "Principal": {
        "AWS": [
          "arn:aws:iam::emr-account:root"
        ]
      },
      "Effect": "Allow",
      "Action": [
        "s3:*"
      ],
      "Resource": [
        "arn:aws:s3:::doc-example-bucket/myfolder/*",
        "arn:aws:s3:::doc-example-bucket/myfolder*"
      ]
    }
  ]
}
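The two s3:ListBucket statements in the preceding policy differ only in their conditions. The following Python sketch shows which of them would match a list request, given the prefix and delimiter that the request carries. It is a simplified model that implements only the StringEquals and StringLike operators and treats a key that is missing from the request as a non-match. The request values and the function names are hypothetical.

```python
import re

# The conditions of the two s3:ListBucket statements in the bucket policy.
LIST_STATEMENTS = [
    {
        "Sid": "AllowRootAndHomeListingOfCompanyBucket",
        "Condition": {
            "StringEquals": {
                "s3:prefix": ["", "myfolder/"],
                "s3:delimiter": ["/"],
            }
        },
    },
    {
        "Sid": "AllowListingOfUserFolder",
        "Condition": {"StringLike": {"s3:prefix": ["myfolder/*"]}},
    },
]


def _string_like(pattern, value):
    # StringLike supports the * and ? wildcards and is case-sensitive.
    regex = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    return re.fullmatch(regex, value, re.DOTALL) is not None


OPERATORS = {
    "StringEquals": lambda pattern, value: pattern == value,
    "StringLike": _string_like,
}


def condition_matches(condition, context):
    """All keys must match (AND); any listed value may match (OR)."""
    for operator, keys in condition.items():
        test = OPERATORS[operator]
        for key, allowed_values in keys.items():
            if key not in context:
                return False
            if not any(test(p, context[key]) for p in allowed_values):
                return False
    return True


def matching_list_statements(context):
    """Return the Sids of the list statements whose condition matches."""
    return [stmt["Sid"] for stmt in LIST_STATEMENTS
            if condition_matches(stmt["Condition"], context)]


# A listing inside a subfolder matches only the StringLike statement.
print(matching_list_statements(
    {"s3:prefix": "myfolder/sub/", "s3:delimiter": "/"}))
# A listing of a different folder matches neither statement.
print(matching_list_statements(
    {"s3:prefix": "other/", "s3:delimiter": "/"}))
```

If a listing returns "Access Denied" while object reads succeed, comparing the requested prefix against these conditions can show whether the path falls outside myfolder/.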