How can I use AWS DataSync to transfer data to or from a cross-account Amazon S3 location?

Last updated: 2022-04-26

I want to use AWS DataSync to transfer data to or from a cross-account Amazon Simple Storage Service (Amazon S3) bucket.

Short description

To use DataSync for cross-account data transfer, do the following:

  1. Use the AWS Command Line Interface (AWS CLI) or an AWS SDK to create a cross-account Amazon S3 location in DataSync.
  2. Create a DataSync task that transfers data from the source bucket to the destination bucket.

Keep in mind the following limitations when using DataSync to transfer data between buckets owned by different AWS accounts:

  • DataSync doesn't apply the bucket-owner-full-control access control list (ACL) when transferring data to a cross-account destination bucket, leading to object ownership issues in the destination bucket.
  • For a cross-account S3 location, only a cross-account bucket in the same Region is supported. If the S3 location is both cross-account and cross-Region, then you receive a GetBucketLocation or Unable to connect to S3 endpoint error.
  • You can't use the cross-account pass role to access the cross-account S3 location.

To work around the preceding limitations, configure the DataSync task in the destination account to pull data from the source.

Resolution

Perform the required checks

Suppose that the source account has the cross-account source S3 bucket and the destination account has the destination S3 bucket and the DataSync task. Perform the following checks:

AWS Identity and Access Management (IAM) user or role: Check that the IAM user or role that you're using to create the cross-account S3 location and the IAM role that you assigned to the S3 location both have the required permissions.
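
As a rough illustration of what the role assigned to the S3 location might need, the following Python sketch builds two policy documents: a trust policy that lets the DataSync service assume the role, and an identity-based policy that mirrors the S3 actions used in this article's bucket policy. The function names are hypothetical, and the exact set of actions that your transfer needs might be smaller (for example, a read-only source).

```python
import json


def build_trust_policy():
    """Trust policy that lets the DataSync service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "datasync.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_role_s3_policy(bucket_name):
    """Identity-based policy for the role, scoped to one bucket.

    The action lists mirror the bucket policy shown in this article;
    trim them if your transfer only reads from the bucket.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:ListBucketMultipartUploads",
                ],
                "Resource": bucket_arn,
            },
            {
                "Effect": "Allow",
                "Action": [
                    "s3:AbortMultipartUpload",
                    "s3:DeleteObject",
                    "s3:GetObject",
                    "s3:ListMultipartUploadParts",
                    "s3:PutObjectTagging",
                    "s3:GetObjectTagging",
                    "s3:PutObject",
                ],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # example-source-bucket is the placeholder name used in this article.
    print(json.dumps(build_trust_policy(), indent=2))
    print(json.dumps(build_role_s3_policy("example-source-bucket"), indent=2))
```

You can compare the printed documents with the trust relationship and the attached policies of datasync-transfer-role in the IAM console.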

Source bucket policy: Be sure that the source bucket policy allows both IAM users/roles in the destination account to access the bucket. The following example policy grants both IAM users/roles access to the source bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::1111222233334444:role/datasync-config-role",
          "arn:aws:iam::1111222233334444:role/datasync-transfer-role"
        ]
      },
      "Action": [
        "s3:GetBucketLocation",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": [
        "arn:aws:s3:::example-source-bucket"
      ]
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::1111222233334444:role/datasync-config-role",
          "arn:aws:iam::1111222233334444:role/datasync-transfer-role"
        ]
      },
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:ListMultipartUploadParts",
        "s3:PutObjectTagging",
        "s3:GetObjectTagging",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::example-source-bucket/*"
      ]
    }
  ]
}

Be sure to replace the following values in the preceding policy:

  • example-source-bucket with the name of the source bucket
  • 1111222233334444 with the account ID of the destination account
  • datasync-config-role with the IAM role that's used for DataSync configuration (for example, to create the source S3 location and the task in DataSync)
    Note: You might also use an IAM user. This article considers the use of the IAM role.
  • datasync-transfer-role with the IAM role that's assigned when creating the source S3 location
    Note: DataSync uses this role to access the cross-account data.
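
If you prefer to generate the bucket policy instead of editing the placeholders by hand, the substitution can be sketched in Python as follows. The helper names are hypothetical, the account ID in the example is a placeholder, and apply_policy assumes that boto3 is installed and that you run it with credentials from the source account.

```python
import json

BUCKET_ACTIONS = [
    "s3:GetBucketLocation",
    "s3:ListBucket",
    "s3:ListBucketMultipartUploads",
]
OBJECT_ACTIONS = [
    "s3:AbortMultipartUpload",
    "s3:DeleteObject",
    "s3:GetObject",
    "s3:ListMultipartUploadParts",
    "s3:PutObjectTagging",
    "s3:GetObjectTagging",
    "s3:PutObject",
]


def build_source_bucket_policy(bucket_name, destination_account_id, role_names):
    """Return the example bucket policy with the placeholders filled in."""
    principals = [
        f"arn:aws:iam::{destination_account_id}:role/{name}" for name in role_names
    ]
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                "Action": BUCKET_ACTIONS,
                "Resource": [bucket_arn],
            },
            {
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                "Action": OBJECT_ACTIONS,
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


def apply_policy(bucket_name, policy):
    """Attach the policy to the bucket (run with source-account credentials)."""
    import boto3  # imported lazily so the builder works without the SDK

    boto3.client("s3").put_bucket_policy(
        Bucket=bucket_name, Policy=json.dumps(policy)
    )


if __name__ == "__main__":
    policy = build_source_bucket_policy(
        "example-source-bucket",
        "111122223333",  # placeholder destination account ID
        ["datasync-config-role", "datasync-transfer-role"],
    )
    print(json.dumps(policy, indent=2))
```

Note that put_bucket_policy replaces the whole bucket policy, so merge these statements into any policy that the bucket already has before you apply it.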

Destination S3 location: Be sure that the destination S3 location is created according to the instructions in Creating a location for Amazon S3.

Use AWS CLI or SDK to create a cross-account source S3 location in DataSync

Note: Creating a cross-account S3 location is not supported in the AWS Management Console.

You can create the cross-account S3 location using either of the following methods:

  • Use a configuration JSON file.
  • Use the options in the AWS CLI command.

Use a configuration JSON file

1.    Create a configuration JSON file named input.template for the cross-account S3 location with the following parameters:

{
  "Subdirectory": "",
  "S3BucketArn": "arn:aws:s3:::example-source-bucket",
  "S3StorageClass": "STANDARD",
  "S3Config": {
    "BucketAccessRoleArn": "arn:aws:iam::1111222233334444:role/datasync-transfer-role"
  }
}

2.    Create an S3 location by running the following AWS CLI command:

aws datasync create-location-s3 --cli-input-json file://input.template --region example-DataSync-Region

Note: If you receive errors when running AWS CLI commands, make sure that you’re using the most recent version of the AWS CLI.

For more information, see create-location-s3.

When the S3 location is created, you see the following output:

{
  "LocationArn": "arn:aws:datasync:example-Region:1111222233334444:location/loc-0f8xxxxxxxxe4821"
}
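
The two steps above can also be scripted. The following Python sketch writes a file with the same shape as input.template and splits the returned LocationArn into its parts so that you can confirm the Region and account. The function names, the account ID, and the location ID in the example are hypothetical placeholders.

```python
import json
import tempfile
from pathlib import Path


def build_location_config(source_bucket, role_arn, subdirectory="",
                          storage_class="STANDARD"):
    """Return a dict with the same keys as the input.template file."""
    return {
        "Subdirectory": subdirectory,
        "S3BucketArn": f"arn:aws:s3:::{source_bucket}",
        "S3StorageClass": storage_class,
        "S3Config": {"BucketAccessRoleArn": role_arn},
    }


def write_template(config, path):
    """Write the configuration as JSON and return the file path."""
    path = Path(path)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path


def parse_location_arn(location_arn):
    """Split a DataSync location ARN into Region, account ID, and location ID."""
    parts = location_arn.split(":", 5)
    if (len(parts) != 6 or parts[2] != "datasync"
            or not parts[5].startswith("location/")):
        raise ValueError(f"not a DataSync location ARN: {location_arn}")
    return {
        "region": parts[3],
        "account_id": parts[4],
        "location_id": parts[5].split("/", 1)[1],
    }


if __name__ == "__main__":
    config = build_location_config(
        "example-source-bucket",
        "arn:aws:iam::111122223333:role/datasync-transfer-role",  # placeholder
    )
    # A temporary directory keeps the demo from overwriting a real file.
    with tempfile.TemporaryDirectory() as tmp:
        template = write_template(config, Path(tmp) / "input.template")
        print(template.read_text(encoding="utf-8"))
```

Pass the generated file to the create-location-s3 command with the --cli-input-json option, as shown in step 2.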

Use the options in the AWS CLI command

Run the following AWS CLI command with appropriate options:

aws datasync create-location-s3 --s3-bucket-arn arn:aws:s3:::example-source-bucket --s3-storage-class STANDARD --s3-config BucketAccessRoleArn="arn:aws:iam::1111222233334444:role/datasync-transfer-role" --region example-DataSync-Region

Be sure to replace the following values in the command:

  • example-source-bucket with the name of the source bucket
  • 1111222233334444 with the account ID of the destination account (the account that owns datasync-transfer-role)
  • example-DataSync-Region with the Region where you'll be creating the DataSync task
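
If you use an AWS SDK instead of the AWS CLI, the same request can be sketched with the AWS SDK for Python (Boto3), which exposes the operation as create_location_s3 on the DataSync client. The helper names below are hypothetical, the sketch assumes that boto3 is installed and that credentials for the account that owns datasync-transfer-role are configured, and the Region and account ID in the example are placeholders.

```python
def build_create_location_kwargs(source_bucket, role_account_id, role_name,
                                 subdirectory="", storage_class="STANDARD"):
    """Build the keyword arguments for DataSync create_location_s3."""
    kwargs = {
        "S3BucketArn": f"arn:aws:s3:::{source_bucket}",
        "S3StorageClass": storage_class,
        "S3Config": {
            "BucketAccessRoleArn": f"arn:aws:iam::{role_account_id}:role/{role_name}"
        },
    }
    if subdirectory:
        # Only send Subdirectory when a prefix is actually wanted.
        kwargs["Subdirectory"] = subdirectory
    return kwargs


def create_cross_account_location(region, kwargs):
    """Create the S3 location and return its ARN."""
    import boto3  # imported lazily so the builder works without the SDK

    client = boto3.client("datasync", region_name=region)
    response = client.create_location_s3(**kwargs)
    return response["LocationArn"]


if __name__ == "__main__":
    kwargs = build_create_location_kwargs(
        "example-source-bucket",
        "111122223333",  # placeholder ID of the account that owns the role
        "datasync-transfer-role",
    )
    print(kwargs)
    # Uncomment to call AWS; "us-east-1" is only an example Region.
    # print(create_cross_account_location("us-east-1", kwargs))
```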

Create a DataSync task

Configure the DataSync task, and start the task from the DataSync console. For more information, see Creating a task.
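
As an alternative to the console, the task can be created and started with the AWS SDK for Python (Boto3). The following is a minimal sketch under a few assumptions: both location ARNs already exist, boto3 is installed, and credentials for the destination account are configured. The helper names, the task name, and the ARNs in the example are hypothetical placeholders.

```python
def _check_location_arn(arn):
    """Raise ValueError unless the string looks like a DataSync location ARN."""
    parts = arn.split(":", 5)
    if (len(parts) != 6 or parts[2] != "datasync"
            or not parts[5].startswith("location/")):
        raise ValueError(f"not a DataSync location ARN: {arn}")


def build_task_request(source_location_arn, destination_location_arn, name):
    """Build the keyword arguments for DataSync create_task."""
    _check_location_arn(source_location_arn)
    _check_location_arn(destination_location_arn)
    return {
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        "Name": name,
    }


def create_and_start_task(region, request):
    """Create the task, start one execution, and return both ARNs."""
    import boto3  # imported lazily so the builder works without the SDK

    client = boto3.client("datasync", region_name=region)
    task_arn = client.create_task(**request)["TaskArn"]
    execution = client.start_task_execution(TaskArn=task_arn)
    return task_arn, execution["TaskExecutionArn"]


if __name__ == "__main__":
    request = build_task_request(
        # Placeholder ARNs; use the LocationArn values returned for your locations.
        "arn:aws:datasync:us-east-1:111122223333:location/loc-0123456789abcdef0",
        "arn:aws:datasync:us-east-1:111122223333:location/loc-0fedcba9876543210",
        "cross-account-pull",
    )
    print(request)
    # Uncomment to call AWS:
    # print(create_and_start_task("us-east-1", request))
```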

Known errors and resolutions

Error: error creating DataSync Location S3: InvalidRequestException: Please provide a bucket in the xxx region where DataSync is currently used

If you receive this error, then confirm that the bucket and IAM policies include the following required permissions:

"Action": [
  "s3:GetBucketLocation",
  "s3:ListBucket",
  "s3:ListBucketMultipartUploads"
]

If you get this error when using a cross-account bucket, then be sure that the bucket is in the same Region as your DataSync task.
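
One way to confirm the Region before you create the location is to compare the bucket's location with the DataSync Region. The following Python sketch does this with the S3 GetBucketLocation operation, which returns an empty LocationConstraint for buckets in us-east-1 and can return the legacy value EU for some buckets in eu-west-1. The helper names are hypothetical, and check_bucket_region assumes that boto3 is installed and that your credentials have s3:GetBucketLocation on the bucket.

```python
def region_from_location_constraint(location_constraint):
    """Map an S3 LocationConstraint value to a Region name."""
    if not location_constraint:
        # S3 reports no constraint for buckets in us-east-1.
        return "us-east-1"
    if location_constraint == "EU":
        # Legacy alias that S3 can return for eu-west-1.
        return "eu-west-1"
    return location_constraint


def bucket_matches_datasync_region(location_constraint, datasync_region):
    """Return True when the bucket lives in the DataSync Region."""
    return region_from_location_constraint(location_constraint) == datasync_region


def check_bucket_region(bucket_name, datasync_region):
    """Look up the bucket's location and compare it with the DataSync Region."""
    import boto3  # imported lazily so the pure helpers work without the SDK

    s3 = boto3.client("s3")
    constraint = s3.get_bucket_location(Bucket=bucket_name).get("LocationConstraint")
    return bucket_matches_datasync_region(constraint, datasync_region)


if __name__ == "__main__":
    print(bucket_matches_datasync_region(None, "us-east-1"))
    print(bucket_matches_datasync_region("eu-central-1", "us-east-1"))
```

If check_bucket_region raises an AccessDenied error, that also points to a missing s3:GetBucketLocation permission in the bucket or IAM policy.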

S3 object ownership issues

DataSync doesn't support using a cross-account bucket as the destination location, so you can't apply the bucket-owner-full-control ACL. If the DataSync task runs in the source bucket account, then objects uploaded to the destination bucket might not be owned by the destination account. To resolve this issue, consider disabling ACLs on the destination bucket if none of its objects rely on ACLs. For more information, see Controlling ownership of objects and disabling ACLs for your bucket. Otherwise, it's a best practice to configure the DataSync task in the destination account to pull data from the source.
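
Disabling ACLs corresponds to setting the bucket's Object Ownership to BucketOwnerEnforced. The following Python sketch shows that setting, together with a small helper that inspects an ACL response (in the shape returned by get_bucket_acl or get_object_acl) to see whether it grants anything beyond full control to the owner. The helper names are hypothetical, and disable_acls assumes that boto3 is installed and that you run it with credentials from the destination account.

```python
def build_ownership_controls():
    """Ownership controls that disable ACLs on a bucket."""
    return {"Rules": [{"ObjectOwnership": "BucketOwnerEnforced"}]}


def acl_is_owner_only(acl):
    """Return True when every grant gives FULL_CONTROL to the owner only."""
    owner_id = acl["Owner"]["ID"]
    return all(
        grant["Grantee"].get("ID") == owner_id
        and grant["Permission"] == "FULL_CONTROL"
        for grant in acl.get("Grants", [])
    )


def disable_acls(bucket_name):
    """Apply BucketOwnerEnforced to the bucket (destination-account credentials)."""
    import boto3  # imported lazily so the pure helpers work without the SDK

    boto3.client("s3").put_bucket_ownership_controls(
        Bucket=bucket_name, OwnershipControls=build_ownership_controls()
    )


if __name__ == "__main__":
    # Hand-written sample in the shape of an S3 ACL response.
    sample_acl = {
        "Owner": {"ID": "owner-canonical-id"},
        "Grants": [
            {
                "Grantee": {"ID": "owner-canonical-id", "Type": "CanonicalUser"},
                "Permission": "FULL_CONTROL",
            }
        ],
    }
    print(acl_is_owner_only(sample_acl))
    print(build_ownership_controls())
```

Review the bucket ACL and any workloads that depend on object ACLs before you call disable_acls, because the linked documentation describes prerequisites for this setting.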

