How to meet business data resiliency with Amazon S3 cross-Region replication
With Amazon Simple Storage Service (Amazon S3), customers get a high level of availability and durability for their data in every AWS Region. Data stored in any Amazon S3 storage class, except for S3 One Zone-IA, is always stored across a minimum of three Availability Zones, each separated by miles within a Region. For this reason, many AWS customers choose S3 to store their business-critical and application-critical data.
Even though Amazon S3 provides regional data resiliency, customers often have compliance and business requirements to replicate their data to a second Region that is hundreds (or even thousands) of miles away from their primary location.
Amazon S3 replication provides an automatic mechanism to make identical copies of your objects in a destination Region of your choice. Replication enables automatic, asynchronous copying of objects across S3 buckets. The same AWS account or different accounts can own buckets configured for object replication. You can copy objects between different Regions or within the same Region.
S3 replication methods
With S3 Cross-Region replication (CRR), you can copy objects across Amazon S3 buckets in different Regions. With S3 Same-Region replication (SRR), you can copy objects across Amazon S3 buckets in the same AWS Region. Additionally, Amazon S3 replication time control (S3 RTC) helps you meet compliance or business requirements for data replication and provides visibility into Amazon S3 replication activity. S3 RTC replicates most objects that you upload to Amazon S3 in seconds, and 99.99 percent of those objects within 15 minutes. A service level agreement (SLA) backs S3 RTC for the replication of 99.9 percent of objects within 15 minutes during any billing month.
In this post, I provide instructions on how to configure S3 cross-Region replication with the S3 RTC feature. I also walk through how to configure event notifications for S3 replication events and how to configure Amazon CloudWatch alarms for the replication metrics.
How S3 replication works
How to configure S3 CRR with S3 RTC
1. Sign in to the AWS Management Console and open the Amazon S3 console. Create a source bucket in your primary Region and a destination bucket in your disaster recovery (DR) Region. Make sure to enable bucket versioning on both buckets (and optionally enable encryption with key type SSE-S3 or SSE-KMS) during bucket creation.
2. Create a replication rule on the source bucket.
- Under buckets, search for the name of the source bucket and click the source bucket name.
- Open the management tab of the bucket. Under the replication rules section, click create replication rule. This is where you define which objects are replicated and where they are replicated to.
- Create replication rule:
- Under the replication rule configuration section, provide a rule name and select the status as enabled to automatically enable the replication rule when created.
- Source bucket section: The Source Bucket name and Source Region are pre-selected for you. Select rule scope (all objects or filters). You can filter objects by prefix, object tags, or a combination of both.
- Destination section: Select a target bucket in a different Region (for CRR). Based on your DR strategy, choose whether the destination bucket is in the same account or in a different account.
- Note: S3 replication gives you the ability to replicate data from one source bucket to multiple destination buckets in the same, or different, Regions by creating multiple replication rules for the same source bucket. S3 replication (multi-destination) is intended for customers that want to create and maintain multiple copies of their data in one or more Regions.
- IAM role section: Select create new role or optionally select a preexisting role for S3 replication. I recommend that you choose create new role to have Amazon S3 create a new IAM role for you. When you save the rule, it generates a new policy for the IAM role that matches the source and destination buckets that you choose. The name of the generated role is based on the bucket names and uses the following naming convention: <replication_role_for_source-bucket_to_destination-bucket>. Alternatively, you can choose to use an existing IAM role. If you do, you must choose a role that grants Amazon S3 the necessary permissions for replication. Replication fails if this role does not grant Amazon S3 sufficient permissions to follow your replication rule.
- Encryption section: You have the option to select replication of objects encrypted using AWS Key Management Service (AWS KMS) by providing the appropriate AWS KMS keys to decrypt the source objects and to encrypt the destination objects.
- Destination storage class section: By default, the storage class of the replicated objects is the same as the source objects. You have the option to override this and select storage class for replicated objects.
- Additional replication options section: S3 RTC helps you meet compliance or business requirements for data replication and provides visibility into Amazon S3 replication times. S3 RTC replicates most objects that you upload to Amazon S3 in seconds, and 99.99 percent of those objects within 15 minutes. Make sure to select the RTC option if you need replication to complete within a predictable 15-minute window. You can also choose whether to replicate delete markers between the source and destination buckets; this setting is enabled or disabled separately for each replication rule. This is critical for customers that have an active-active architecture across different Regions.
3. Click Save.
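The console steps above can also be expressed programmatically. The following is a minimal sketch, assuming you use the AWS SDK for Python (boto3) and already have an IAM role that S3 can assume for replication. The role ARN, bucket names, rule ID, and helper function names are hypothetical placeholders, not values from this post. The helper only builds the rule definition; the call to AWS happens in a separate function so you can inspect the configuration first.

```python
import json


def build_replication_config(role_arn, dest_bucket_arn, rule_id="crr-with-rtc",
                             prefix="", replicate_delete_markers=True,
                             storage_class=None):
    """Build a ReplicationConfiguration dict for one CRR rule with S3 RTC."""
    destination = {
        "Bucket": dest_bucket_arn,
        # S3 RTC: the replication time target is 15 minutes.
        "ReplicationTime": {"Status": "Enabled", "Time": {"Minutes": 15}},
        # Replication metrics are enabled together with S3 RTC.
        "Metrics": {"Status": "Enabled", "EventThreshold": {"Minutes": 15}},
    }
    if storage_class:
        # Optional override; by default replicas keep the source storage class.
        destination["StorageClass"] = storage_class
    rule = {
        "ID": rule_id,
        "Priority": 1,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},  # an empty prefix applies the rule to all objects
        "DeleteMarkerReplication": {
            "Status": "Enabled" if replicate_delete_markers else "Disabled"
        },
        "Destination": destination,
    }
    return {"Role": role_arn, "Rules": [rule]}


def apply_replication(source_bucket, config, s3_client=None):
    """Enable versioning on the source bucket, then attach the replication rule.

    The destination bucket must also have versioning enabled; this sketch
    assumes that was done when the bucket was created.
    """
    if s3_client is None:
        import boto3  # imported lazily so the builder works without boto3 installed
        s3_client = boto3.client("s3")
    s3_client.put_bucket_versioning(
        Bucket=source_bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    s3_client.put_bucket_replication(
        Bucket=source_bucket,
        ReplicationConfiguration=config,
    )


if __name__ == "__main__":
    # Hypothetical ARNs, shown only to illustrate the shape of the configuration.
    example = build_replication_config(
        role_arn="arn:aws:iam::111122223333:role/example-replication-role",
        dest_bucket_arn="arn:aws:s3:::example-destination-bucket",
    )
    print(json.dumps(example, indent=2))
```

Running the script prints the rule as JSON without contacting AWS. Calling `apply_replication("example-source-bucket", example)` would then apply it, provided your credentials allow it and the role grants S3 the replication permissions described in the IAM role step.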
Configuring S3 replication events
The Amazon S3 notification feature enables you to receive notifications when certain events happen in your bucket. To enable notifications, you must first add a notification configuration that identifies the events you want S3 to publish and the destinations where you want S3 to send the notifications.
Amazon S3 can publish replication events (among others). Amazon S3 sends event notifications for replication configurations that have S3 RTC enabled. It sends these notifications when an object fails replication, exceeds the 15-minute threshold, replicates after the 15-minute threshold, or is no longer tracked by replication metrics. It publishes a second event when that object replicates to the destination Region.
|Event type|Description|
|s3:Replication:OperationFailedReplication|You receive this notification event when an object that was eligible for replication using S3 RTC failed to replicate.|
|s3:Replication:OperationMissedThreshold|You receive this notification event when an object that was eligible for replication using S3 RTC exceeded the 15-minute threshold for replication.|
|s3:Replication:OperationReplicatedAfterThreshold|You receive this notification event when an object that was eligible for replication using S3 RTC replicated after the 15-minute threshold.|
|s3:Replication:OperationNotTracked|You receive this notification event when an object that was eligible for replication using S3 RTC is no longer tracked by replication metrics.|
1. Select the source bucket from the Amazon S3 management console. On the properties tab, under the event notifications section, click create event notification. Provide an event name and, under event type, select replication events. Finally, select the notification destination: an Amazon Simple Notification Service (Amazon SNS) topic, an Amazon Simple Queue Service (Amazon SQS) queue, or an AWS Lambda function. Then click save changes.
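As a rough programmatic equivalent of the step above, the sketch below builds a notification configuration that sends the four replication event types to an SNS topic. It assumes boto3; the topic ARN, configuration ID, bucket name, and helper names are hypothetical. Note that `put_bucket_notification_configuration` replaces the bucket's entire notification configuration, so merge in any existing entries before applying, and the SNS topic policy must allow Amazon S3 to publish to the topic.

```python
# The four replication event types that S3 publishes for rules with S3 RTC enabled.
REPLICATION_EVENTS = [
    "s3:Replication:OperationFailedReplication",
    "s3:Replication:OperationMissedThreshold",
    "s3:Replication:OperationReplicatedAfterThreshold",
    "s3:Replication:OperationNotTracked",
]


def build_notification_config(topic_arn, config_id="replication-events", events=None):
    """Build a NotificationConfiguration dict that routes replication events to SNS."""
    return {
        "TopicConfigurations": [
            {
                "Id": config_id,
                "TopicArn": topic_arn,
                "Events": list(events or REPLICATION_EVENTS),
            }
        ]
    }


def apply_notification_config(source_bucket, config, s3_client=None):
    """Attach the notification configuration to the source bucket.

    This overwrites any notification configuration already on the bucket.
    """
    if s3_client is None:
        import boto3  # imported lazily so the builder works without boto3 installed
        s3_client = boto3.client("s3")
    s3_client.put_bucket_notification_configuration(
        Bucket=source_bucket,
        NotificationConfiguration=config,
    )
```

For an SQS queue or a Lambda function, the same structure applies with `QueueConfigurations` (`QueueArn`) or `LambdaFunctionConfigurations` (`LambdaFunctionArn`) in place of the topic entry.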
Viewing S3 replication metrics
There are three types of CloudWatch metrics for Amazon S3: storage metrics, request metrics, and replication metrics. Replication metrics turn on automatically when you enable replication with S3 RTC using the AWS Management Console or the Amazon S3 API. Replication metrics are available 15 minutes after you enable a replication rule with S3 RTC.
To view replication metrics
1. Sign in to the AWS Management Console and open the Amazon S3 console.
2. In the buckets list, choose the name of the bucket that contains the objects you want replication metrics for.
3. Choose the metrics tab.
4. Under replication metrics, choose replication rules, and select display charts.
5. Amazon S3 displays replication latency in seconds and operations pending replication in charts. To view all replication metrics, including bytes pending replication, replication latency, and operations pending replication together on a separate page, choose view one more chart.
CloudWatch begins reporting replication metrics 15 minutes after you enable S3 RTC on the respective replication rule. You can view replication metrics on the S3 or CloudWatch console.
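If you prefer to read these metrics from a script rather than the console, the following sketch builds the arguments for the CloudWatch `get_metric_statistics` call in boto3. The bucket names, rule ID, and helper name are hypothetical placeholders. It assumes the replication metrics are published in the `AWS/S3` namespace with the `SourceBucket`, `DestinationBucket`, and `RuleId` dimensions.

```python
from datetime import datetime, timedelta, timezone

# Metric names for the three replication metrics described above.
REPLICATION_METRICS = (
    "ReplicationLatency",
    "BytesPendingReplication",
    "OperationsPendingReplication",
)


def replication_metric_query(source_bucket, dest_bucket, rule_id,
                             metric_name="ReplicationLatency", hours=1, now=None):
    """Build keyword arguments for cloudwatch.get_metric_statistics."""
    if metric_name not in REPLICATION_METRICS:
        raise ValueError(f"unknown replication metric: {metric_name}")
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/S3",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "SourceBucket", "Value": source_bucket},
            {"Name": "DestinationBucket", "Value": dest_bucket},
            {"Name": "RuleId", "Value": rule_id},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 60,                # one datapoint per minute
        "Statistics": ["Maximum"],   # worst-case value within each period
    }
```

With a CloudWatch client, for example `boto3.client("cloudwatch")`, you would call `client.get_metric_statistics(**replication_metric_query(...))` and read the returned `Datapoints` list.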
You can set alarms in CloudWatch on any of the three S3 replication metrics. For instance, you could trigger an alert when the replication latency is greater than 15 minutes (which is the SLA threshold). To do this, click the set alarm in CloudWatch link on the ReplicationLatency graph in the S3 replication metrics view.
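The same alarm can be sketched with boto3. The example below builds the arguments for the CloudWatch `put_metric_alarm` call, with a threshold of 900 seconds (15 minutes) on `ReplicationLatency`. The alarm name, bucket names, rule ID, SNS topic ARN, and the choices of period, evaluation window, and missing-data handling are illustrative assumptions that you should tune to your own requirements.

```python
def latency_alarm_params(source_bucket, dest_bucket, rule_id,
                         threshold_seconds=900, sns_topic_arn=None):
    """Build keyword arguments for cloudwatch.put_metric_alarm."""
    params = {
        "AlarmName": f"s3-replication-latency-{rule_id}",
        "AlarmDescription": "Replication latency exceeded the configured threshold",
        "Namespace": "AWS/S3",
        "MetricName": "ReplicationLatency",
        "Dimensions": [
            {"Name": "SourceBucket", "Value": source_bucket},
            {"Name": "DestinationBucket", "Value": dest_bucket},
            {"Name": "RuleId", "Value": rule_id},
        ],
        "Statistic": "Maximum",
        "Period": 60,                 # evaluate one-minute datapoints
        "EvaluationPeriods": 1,       # alarm on the first breaching datapoint
        "Threshold": float(threshold_seconds),
        "ComparisonOperator": "GreaterThanThreshold",
        # Assumption: a quiet bucket that reports no datapoints should not alarm.
        "TreatMissingData": "notBreaching",
    }
    if sns_topic_arn:
        # Optional: notify an SNS topic when the alarm fires.
        params["AlarmActions"] = [sns_topic_arn]
    return params
```

You would apply it with `boto3.client("cloudwatch").put_metric_alarm(**latency_alarm_params(...))`. Raising `EvaluationPeriods` makes the alarm less sensitive to a single slow datapoint.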
If you have followed along and created S3 resources for testing purposes, you can delete the buckets, objects, and any replication rules that you configured.
In this post, I reviewed how Amazon S3 replication is an elastic, fully managed, low-cost feature that replicates objects between buckets. Customers can use this solution to build a data redundancy capability that meets regulatory compliance, business continuity, and disaster recovery requirements. With S3 replication, you can configure S3 to automatically replicate objects across different Regions by using CRR, or between buckets in the same AWS Region by using SRR. Customers needing a predictable replication time backed by a service level agreement (SLA) can use S3 RTC, which replicates most objects in seconds and 99.99 percent of objects within 15 minutes.