AWS Storage Blog

Automatic monitoring of actions taken on objects in Amazon S3

Administrators may need to monitor and audit actions, like uploads, updates, and deletes, taken on files and other data to comply with regulations or company policies. A scalable and reliable method of tracking and saving actions taken on files can reduce manual work and operational overhead while helping to ensure compliance.

An event-based fanout architecture can help record file activity and perform actions, like triggering archival operations or automated workflows, to help align with compliance requirements. For data stored in Amazon S3, you can build such an architecture with other AWS services like AWS Lambda and Amazon SQS.

In this post, we show you how to send S3 Event Notifications, triggered by actions taken on objects in an S3 bucket, to an Amazon SNS topic that fans out to an Amazon Kinesis Data Firehose delivery stream, which in turn delivers the notifications to another S3 bucket for storage. This event-driven architecture enables automatic processing and monitoring of actions taken on objects in Amazon S3, and it is scalable, reliable, and customizable for your use case.

Overview

In the solution depicted in the following diagram, AWS services (Amazon S3, Amazon SNS, and Amazon Kinesis Data Firehose) are integrated to send S3 Event Notifications whenever an activity occurs on objects stored in an S3 bucket.

Figure 1 Flow diagram for automatic monitoring of actions taken on documents in Amazon S3

  1. An object event is initiated in the S3 bucket. An S3 Event Notification is published to the SNS topic.
  2. The SNS topic receives the S3 Event Notification initiated for the object(s).
  3. SNS will fan out the notification to Amazon Kinesis Data Firehose.
  4. Amazon Kinesis Data Firehose streams the event notification to another S3 bucket.

Refer to the “Cost considerations” section for pricing information relevant to this solution.

Prerequisites

To implement this solution, you must have the following resources:

  • An S3 bucket to store the objects with event notifications enabled.
  • An SNS topic.
  • An S3 bucket to store the event notifications of the objects.

To create the S3 buckets, refer to creating your first S3 bucket in the S3 User Guide. To create an SNS topic, refer to creating an Amazon SNS Topic in the SNS User Guide.
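
If you prefer to script the prerequisites, the following sketch shows one way to create the two buckets and the SNS topic with the AWS SDK for Python (boto3). The bucket and topic names are placeholders, and us-west-2 is assumed to match the example output later in this post; substitute your own names and Region.

import boto3

region = "us-west-2"
s3 = boto3.client("s3", region_name=region)
sns = boto3.client("sns", region_name=region)

# Bucket that stores the objects you want to monitor (placeholder name).
s3.create_bucket(
    Bucket="my-monitored-objects-bucket",
    CreateBucketConfiguration={"LocationConstraint": region},
)

# Bucket that will archive the S3 Event Notifications (placeholder name).
s3.create_bucket(
    Bucket="my-event-notification-archive-bucket",
    CreateBucketConfiguration={"LocationConstraint": region},
)

# SNS topic that fans out the notifications.
topic_arn = sns.create_topic(Name="s3complianceaudit")["TopicArn"]
print(topic_arn)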

Solution walkthrough

This section contains detailed steps on creating and integrating resources required to implement the event driven workflow presented in this blog. This includes:

Step 1: Creating an Amazon Kinesis Data Firehose delivery stream
Step 2: Subscribing the delivery stream to your SNS topic configured to receive S3 Event Notifications
Step 3: Configuring S3 Event Notifications on your S3 bucket
Step 4: Testing compliance and auditing for actions taken on objects stored in Amazon S3

Step 1: Creating an Amazon Kinesis Data Firehose delivery stream

  1. Navigate to the Amazon Kinesis console.
  2. Choose Data Firehose in the navigation panel.
  3. Choose Create delivery stream.
  4. Enter values for the following fields:

a. Source: Direct PUT

b. Delivery stream destination: Select Amazon S3

c. Delivery stream name: The name of your Kinesis Data Firehose delivery stream

Figure 2 Kinesis Data Firehose Console Choosing Source and Destination

5. Scroll down to Destination settings:

a. Under S3 bucket, select Browse.

b. Select an S3 bucket that you intend to use for archiving the object activity events and then select Choose.

Figure 3 Kinesis Data Firehose Console Destination Settings

6. Select Create delivery stream.

Figure 4 Kinesis Data Firehose Create Delivery Stream
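
If you would rather create the delivery stream programmatically, here is a minimal boto3 sketch of the same configuration. The stream name, destination bucket, and IAM role ARN are placeholders; the role must allow Kinesis Data Firehose to write to the destination bucket.

import boto3

firehose = boto3.client("firehose", region_name="us-west-2")

firehose.create_delivery_stream(
    DeliveryStreamName="s3-compliance-audit-stream",  # placeholder name
    DeliveryStreamType="DirectPut",                   # Source: Direct PUT
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",  # placeholder
        "BucketARN": "arn:aws:s3:::my-event-notification-archive-bucket",    # placeholder
        # Buffer up to 5 MiB or 300 seconds before writing a file to S3.
        "BufferingHints": {"SizeInMBs": 5, "IntervalInSeconds": 300},
    },
)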

Step 2: Subscribing the delivery stream to your SNS topic configured to receive S3 Event Notifications

  1. Complete the prerequisites to create an IAM role that Amazon SNS can use to push records to Amazon Kinesis Data Firehose.
  2. Complete the steps to subscribe the Kinesis Data Firehose delivery stream to the Amazon SNS topic that will fan out object activity events pushed from Amazon S3 (a programmatic sketch follows the considerations below).

Here are some considerations to note:

  • For throttling errors with the Kinesis Data Firehose protocol, Amazon SNS uses the same delivery policy as for customer managed endpoints.
  • You can add further application-to-application (A2A) subscribers to the topic, such as AWS Lambda functions or Amazon SQS queues, to trigger automated workflows that process the published events simultaneously.
  • To monitor delivery activity from the SNS topic, we recommend enabling delivery status logging.
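
As a rough equivalent of the console steps above, the following boto3 sketch subscribes the delivery stream to the topic. The topic ARN, delivery stream ARN, and subscription role ARN are placeholders; the role is the one created in the prerequisites for this step.

import boto3

sns = boto3.client("sns", region_name="us-west-2")

sns.subscribe(
    TopicArn="arn:aws:sns:us-west-2:123456789012:s3complianceaudit",  # placeholder
    Protocol="firehose",
    Endpoint="arn:aws:firehose:us-west-2:123456789012:deliverystream/s3-compliance-audit-stream",  # placeholder
    Attributes={
        # IAM role that allows Amazon SNS to put records into the delivery stream.
        "SubscriptionRoleArn": "arn:aws:iam::123456789012:role/sns-firehose-subscription-role",  # placeholder
        # Uncomment to store only the raw S3 event instead of the full SNS envelope:
        # "RawMessageDelivery": "true",
    },
    ReturnSubscriptionArn=True,
)

With the default (non-raw) delivery shown here, each archived record is wrapped in the SNS envelope, which matches the example message later in this post.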

Step 3: Configuring S3 Event Notifications on your S3 bucket

  1. On the Amazon S3 console, navigate to the S3 bucket that stores your objects.
  2. Choose the Properties tab and scroll to the Event notifications section.
  3. Choose Create event notification. For more information, refer to the documentation on enabling and configuring event notifications using the Amazon S3 console.

Figure 5 S3 Console Creating Event Notification

4. In the General configuration section, enter an Event name, and optionally enter a Prefix and a Suffix to filter the objects that trigger notifications.

5. In the Event types section, select All object create events, All object removal events, and All restore object events.

Figure 6 S3 Event Notification Event Type Section

6. In the Destination section, choose SNS topic and specify the SNS topic created earlier. Verify that its access policy allows Amazon S3 to publish S3 Event Notifications to it. For more information, refer to the documentation on creating an Amazon SNS topic.

Figure 7 S3 Event Notification Destination

After you select Save changes, Amazon S3 sends a test message to the S3 Event Notification destination.
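
The same event notification configuration can also be applied programmatically. The following boto3 sketch uses placeholder bucket and topic names; as in the console steps, the topic’s access policy must already allow Amazon S3 to publish to it.

import boto3

s3 = boto3.client("s3", region_name="us-west-2")

s3.put_bucket_notification_configuration(
    Bucket="my-monitored-objects-bucket",  # placeholder
    NotificationConfiguration={
        "TopicConfigurations": [
            {
                "Id": "snstopic",
                "TopicArn": "arn:aws:sns:us-west-2:123456789012:s3complianceaudit",  # placeholder
                # All object create, removal, and restore events, as selected in the console.
                "Events": [
                    "s3:ObjectCreated:*",
                    "s3:ObjectRemoved:*",
                    "s3:ObjectRestore:*",
                ],
            }
        ]
    },
)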

Step 4: Testing compliance and auditing for actions taken on objects stored in Amazon S3

First, upload an object to your S3 bucket to generate object activity for compliance and auditing (a scripted alternative is shown after the console steps below).

  1. On the Amazon S3 console, select your S3 bucket.
  2. Navigate to the Objects tab.
  3. Choose Upload.
  4. In the Upload window, do one of the following:

a) Drag and drop files into the Upload window.
b) Choose Add file, choose the files to upload, and choose Open.

5. To upload the listed files, at the bottom of the page, choose Upload.
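
If you want to generate test activity from a script instead of the console, a boto3 upload such as the following works as well (the file and bucket names are placeholders).

import boto3

s3 = boto3.client("s3", region_name="us-west-2")
# Upload a local test file to the monitored bucket (placeholder names).
s3.upload_file(
    Filename="example-document.pdf",
    Bucket="my-monitored-objects-bucket",
    Key="example-document.pdf",
)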

Now, we’ll review the compliance and auditing event notifications generated for object activity in your S3 bucket.

1. An S3 Event Notification will be published to the SNS topic. You can confirm this by viewing the CloudWatch metrics for Amazon SNS and filtering for NumberOfMessagesPublished, NumberOfNotificationsDelivered, and NumberOfNotificationsFailed. You can find more information in the documentation on viewing CloudWatch metrics for Amazon SNS. Note that CloudWatch is a distributed system, and it can take some time for metrics to show up.

2. The event notification is fanned out to the Kinesis Data Firehose delivery stream. To confirm successful delivery, you can view CloudWatch metrics in the following ways:

a) On the Kinesis console, navigate to Data Firehose. Choose your delivery stream, navigate to the Monitoring tab, and view the Delivery stream metrics. Check the Delivery to Amazon S3 success metric to confirm successful delivery to your S3 bucket.

b) On the CloudWatch console, choose Metrics in the navigation panel and select All metrics. Choose the Firehose namespace and select Delivery Stream Metrics. Search for the Delivery to Amazon S3 success metric and select it. Verify that the statistic, period, and duration are set to appropriate values. You can find more information in the documentation on data delivery CloudWatch metrics and on accessing CloudWatch metrics for Kinesis Data Firehose.

3. To view the object event notifications, navigate to the S3 console and select the S3 bucket that stores the event notifications. A folder should have been created based on the current year (for example, 2023). If the folder hasn’t been created yet, wait a few minutes.

4. After selecting the year folder, select the next folder, which indicates the month in two-digit format (for example, 10 for October). The folders that follow indicate the timestamp in UTC.

5. A file should be stored in this folder in the format DeliveryStreamName-DeliveryStreamVersion-YYYY-MM-dd-HH-MM-SS-RandomString. More information can be found in the documentation on Amazon S3 object name format.

6. When you download or open the file, the format should appear as follows:

{
  "Type": "Notification",
  "MessageId": "509780d4-adb4-5150-9e15-dd4304bc659b",
  "TopicArn": "arn:aws:sns:us-west-2:1234567890:s3complianceaudit",
  "Subject": "Amazon S3 Notification",
  "Message": "{\"Records\":[{\"eventVersion\":\"2.1\",\"eventSource\":\"aws:s3\",\"awsRegion\":\"us-west-2\",\"eventTime\":\"2023-10-04T17:54:15.662Z\",\"eventName\":\"ObjectCreated:Put\",\"userIdentity\":{\"principalId\":\"AWS:XXXXXXXX:XXXXXXX-XXXXXX\"},\"requestParameters\":{\"sourceIPAddress\":\"XXXXXXXX\"},\"responseElements\":{\"x-amz-request-id\":\"N6PDT6NX774FDFC3\",\"x-amz-id-2\":\"HlCevKZnkENGegZOVihFDxHiGmKns97aLBytRH0urm2MXX92RfROUuCkDf5eBfpS5vnImi7Q1Q0yUzRgA1RmbnP9KsUyHL3B\"},\"s3\":{\"s3SchemaVersion\":\"1.0\",\"configurationId\":\"snstopic\",\"bucket\":{\"name\":\"s3bucketcomplianceaudit\",\"ownerIdentity\":{\"principalId\":\"A2SE7YNEUS8CN0\"},\"arn\":\"arn:aws:s3:::s3bucketcomplianceaudit\"},\"object\":{\"key\":\"avis-budget-inbound-charges-explained.pdf\",\"size\":421642,\"eTag\":\"e08e4a434b672d206f551df3efd31428\",\"sequencer\":\"00651DA6C799C2748F\"}}}]}",
  "Timestamp": "2023-10-04T17:54:16.180Z",
  "UnsubscribeURL": "https://sns.us-west-2.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=arn:aws:sns:us-west-2:1234567890:s3complianceaudit:41dde2df-85bd-4284-bc09-38ee6dfdd7b5"
}

You can find more information in the documentation on the event message structure. You can query data stored in S3 using Amazon Athena; for a brief overview on using Athena with S3, refer to these blog posts or the S3 User Guide section on querying S3 Inventory reports with Amazon Athena. You can also use Amazon S3 Select to filter and retrieve data.
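
Before reaching for Athena or S3 Select, you can also inspect the archived notifications directly. The following boto3 sketch lists the files that Kinesis Data Firehose delivered under a year/month prefix and prints a simple audit line per record; the bucket name and prefix are placeholders, and the parsing assumes the default (non-raw) delivery format shown above, where Firehose may concatenate several SNS envelopes into one file.

import json
import boto3

s3 = boto3.client("s3", region_name="us-west-2")
archive_bucket = "my-event-notification-archive-bucket"  # placeholder

decoder = json.JSONDecoder()
listing = s3.list_objects_v2(Bucket=archive_bucket, Prefix="2023/10/")
for item in listing.get("Contents", []):
    text = s3.get_object(Bucket=archive_bucket, Key=item["Key"])["Body"].read().decode("utf-8")
    pos = 0
    while pos < len(text):
        # Decode one SNS envelope at a time; Firehose may concatenate several per file.
        envelope, pos = decoder.raw_decode(text, pos)
        while pos < len(text) and text[pos].isspace():
            pos += 1
        event = json.loads(envelope["Message"])
        # The s3:TestEvent message has no Records, so use .get() to skip it.
        for record in event.get("Records", []):
            print(
                record["eventTime"],
                record["eventName"],
                record["s3"]["bucket"]["name"],
                record["s3"]["object"]["key"],
            )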

Once data is stored here, you can use S3 Lifecycle to configure a set of rules to manage your objects cost effectively.
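
As a minimal example, the following sketch applies a lifecycle rule to the notification archive bucket; the bucket name, transition timing, and expiration are placeholders that you should align with your retention requirements.

import boto3

s3 = boto3.client("s3", region_name="us-west-2")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-event-notification-archive-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-audit-events",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to every archived notification
                # Move notifications to a colder storage class after 90 days...
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
                # ...and expire them after roughly seven years (placeholder retention).
                "Expiration": {"Days": 2555},
            }
        ]
    },
)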

Cost considerations

Amazon SNS and Kinesis Data Firehose have no upfront costs or minimum fees. Amazon Kinesis Data Firehose uses simple pay-as-you-go pricing, and you pay only for the resources you use. For Amazon SNS, you pay based on the number of messages that you publish, the number of notifications that you deliver, and any additional API calls for managing topics and subscriptions.

The cost of this solution is $51.75 USD per month ($621 USD per year) based on the following specifications:

  • Amazon Kinesis Data Firehose: 10 records per second ingested via Direct PUT.
  • Amazon Simple Notification Service (SNS): 100,000 requests and 100,000 Amazon Kinesis Data Firehose deliveries per month.
  • Amazon S3 (objects stored): 1 TB of storage and 100,000 PUT requests per month for the files stored.
  • Amazon S3 (compliance and auditing S3 Event Notifications stored): 1 TB of storage and 100,000 PUT requests per month for the S3 Event Notifications stored.

Use the AWS Pricing Calculator to get an estimation for your workload.

Cleaning up

Clean up the S3 bucket containing the compliance and auditing events created during the previous steps to make sure you do not incur storage charges. You can do this by following the steps for emptying and deleting a bucket in the S3 User Guide.

Similarly, you should delete the S3 Event Notification configuration on your object storage bucket so that no further document activity events are generated. Then, following the recommended best practice, delete all subscriptions associated with the SNS topic used in this setup before deleting the topic itself. Lastly, remove any now-unused subscribers, such as the Kinesis Data Firehose delivery stream.
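
If you scripted the setup, a cleanup sketch along these lines removes the same resources with boto3. All names and ARNs are placeholders, and the order mirrors the recommendations above.

import boto3

region = "us-west-2"
s3 = boto3.client("s3", region_name=region)
sns = boto3.client("sns", region_name=region)
firehose = boto3.client("firehose", region_name=region)

archive_bucket = "my-event-notification-archive-bucket"  # placeholder
source_bucket = "my-monitored-objects-bucket"            # placeholder
topic_arn = "arn:aws:sns:us-west-2:123456789012:s3complianceaudit"  # placeholder

# 1. Empty and delete the bucket holding the archived event notifications.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=archive_bucket):
    keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
    if keys:
        s3.delete_objects(Bucket=archive_bucket, Delete={"Objects": keys})
s3.delete_bucket(Bucket=archive_bucket)

# 2. Remove the event notification configuration from the source bucket.
s3.put_bucket_notification_configuration(
    Bucket=source_bucket, NotificationConfiguration={}
)

# 3. Delete the topic's subscriptions, then the topic itself.
for sub in sns.list_subscriptions_by_topic(TopicArn=topic_arn)["Subscriptions"]:
    sns.unsubscribe(SubscriptionArn=sub["SubscriptionArn"])
sns.delete_topic(TopicArn=topic_arn)

# 4. Delete the now-unused delivery stream.
firehose.delete_delivery_stream(DeliveryStreamName="s3-compliance-audit-stream")  # placeholder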

Conclusion

In this post, we demonstrated an event driven solution for automatic monitoring of actions taken on objects stored in Amazon S3.

The solution presented helps administrators who need to monitor and audit actions taken on files and other data to comply with regulations or company policies. Monitoring in this way avoids storing redundant or unnecessary data, which makes auditing more efficient and data storage less expensive.

Extend the solution by including additional application-to-application (A2A) subscribers on your SNS topic. This allows you to trigger automated workflows that process the document activity events, complementing the archival capability provided by Amazon Kinesis Data Firehose.