Guidance for Enterprise Search and Audit for Amazon S3

Overview

This Guidance helps customers create a single aggregation point for Amazon Simple Storage Service (Amazon S3) object data, whether that data is hosted across an enterprise or across a disparate collection of AWS accounts. Currently, customers cannot view object-level metadata across an entire organization or search for objects across S3 buckets or accounts. This architecture aggregates object PUT, DELETE, and GET calls into a searchable interface, so customers can search by object tag, account, bucket name, and prefix. With this search functionality, customers can identify which objects are not encrypted, find S3 buckets that have been inactive for a long period, search object tags, and see read requests at the object level.
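
To make the search capability concrete, here is a minimal sketch of how a query for such an interface might be assembled using the OpenSearch query DSL. The index field names (`account_id`, `bucket`, `key`, `tags`, `encryption`) and the function name are assumptions for illustration only; this Guidance does not specify the index schema.

```python
from typing import Optional, Tuple


def build_object_search(
    account_id: Optional[str] = None,
    bucket: Optional[str] = None,
    prefix: Optional[str] = None,
    tag: Optional[Tuple[str, str]] = None,
    unencrypted_only: bool = False,
    size: int = 50,
) -> dict:
    """Build an OpenSearch query body from optional search criteria.

    All field names are hypothetical and assume keyword mappings.
    """
    filters = []
    if account_id:
        filters.append({"term": {"account_id": account_id}})
    if bucket:
        filters.append({"term": {"bucket": bucket}})
    if prefix:
        # Match object keys that start with the given prefix.
        filters.append({"prefix": {"key": prefix}})
    if tag:
        tag_key, tag_value = tag
        filters.append({"term": {f"tags.{tag_key}": tag_value}})

    must_not = []
    if unencrypted_only:
        # Assumption: unencrypted objects have no "encryption" field indexed.
        must_not.append({"exists": {"field": "encryption"}})

    if not filters and not must_not:
        return {"size": size, "query": {"match_all": {}}}

    bool_query = {"filter": filters}
    if must_not:
        bool_query["must_not"] = must_not
    return {"size": size, "query": {"bool": bool_query}}
```

For example, `build_object_search(bucket="example-bucket", prefix="logs/", unencrypted_only=True)` produces a request body that could be sent to a search endpoint to list objects under `logs/` that have no encryption recorded.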

How it works

These technical details feature an architecture diagram to illustrate how to effectively use this solution. The architecture diagram shows the key components and their interactions, providing an overview of the architecture's structure and functionality step-by-step.

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

You can set up Amazon CloudWatch metrics and alarms for OpenSearch Service to monitor CPU, memory, and storage. All other services in this architecture are serverless and managed by AWS.
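
As a rough illustration of the monitoring described above, the following sketch builds CloudWatch alarm definitions for an OpenSearch Service domain. It assumes the `AWS/ES` namespace with the `CPUUtilization`, `JVMMemoryPressure`, and `FreeStorageSpace` metrics and the `DomainName` and `ClientId` dimensions. The thresholds, alarm names, and function name are placeholders chosen for this example, not values prescribed by this Guidance.

```python
from typing import Dict, List


def opensearch_alarm_definitions(
    domain_name: str, account_id: str, sns_topic_arn: str
) -> List[Dict]:
    """Return keyword-argument dicts shaped for CloudWatch PutMetricAlarm.

    Thresholds below are illustrative assumptions; tune them per workload.
    """
    dimensions = [
        {"Name": "DomainName", "Value": domain_name},
        {"Name": "ClientId", "Value": account_id},
    ]

    def alarm(suffix: str, metric: str, statistic: str,
              operator: str, threshold: float) -> Dict:
        return {
            "AlarmName": f"{domain_name}-{suffix}",
            "Namespace": "AWS/ES",
            "MetricName": metric,
            "Dimensions": dimensions,
            "Statistic": statistic,
            "Period": 300,            # seconds per evaluation window
            "EvaluationPeriods": 3,   # consecutive breaching windows
            "Threshold": threshold,
            "ComparisonOperator": operator,
            "AlarmActions": [sns_topic_arn],
        }

    return [
        alarm("cpu-high", "CPUUtilization", "Maximum",
              "GreaterThanOrEqualToThreshold", 80.0),
        alarm("jvm-memory-high", "JVMMemoryPressure", "Maximum",
              "GreaterThanOrEqualToThreshold", 80.0),
        # FreeStorageSpace is reported in MiB; 20480 MiB is 20 GiB.
        alarm("free-storage-low", "FreeStorageSpace", "Minimum",
              "LessThanOrEqualToThreshold", 20480.0),
    ]
```

Each returned dict could then be passed to the boto3 CloudWatch client, for example `cloudwatch.put_metric_alarm(**definition)`, in an account where the domain exists.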

Read the Operational Excellence whitepaper 

You can encrypt data in-transit with transport layer security (TLS). You can encrypt data at rest in OpenSearch Service, which is the only component in this architecture that stores data.

Read the Security whitepaper 

OpenSearch Service is the only service in this architecture that permanently stores data and would therefore require data recovery. OpenSearch Service takes hourly snapshots, which are backups of a cluster's indexes and state. For disaster recovery, you can use these snapshots to create a new OpenSearch Service cluster.

Read the Reliability whitepaper 

EventBridge offers cross-account and cross-Region event shipping. Amazon SQS is purpose-built for queueing. Lambda and API Gateway are designed to scale automatically without requiring human intervention.
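
The flow from EventBridge through Amazon SQS to Lambda can be sketched as a function that turns a batch of queued events into search documents. This is a hedged sketch, not the sample code from this Guidance: it assumes each SQS message body is an EventBridge event whose `detail` carries a CloudTrail-style S3 data event (`eventName`, `requestParameters.bucketName`, `requestParameters.key`), and the output field names are hypothetical.

```python
import json
from typing import Any, Dict, List

# S3 API calls corresponding to the PUT, GET, and DELETE activity of interest.
TRACKED_EVENTS = {"PutObject", "GetObject", "DeleteObject"}


def records_to_documents(sqs_event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Convert an SQS-triggered Lambda event into documents for indexing.

    Assumes each record body is a JSON-encoded EventBridge event.
    """
    documents = []
    for record in sqs_event.get("Records", []):
        event = json.loads(record["body"])
        detail = event.get("detail") or {}
        event_name = detail.get("eventName")
        if event_name not in TRACKED_EVENTS:
            continue  # ignore calls this sketch does not track
        params = detail.get("requestParameters") or {}
        documents.append({
            "account_id": event.get("account"),
            "region": event.get("region"),
            "event_time": event.get("time"),
            "event_name": event_name,
            "bucket": params.get("bucketName"),
            "key": params.get("key"),
        })
    return documents


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Lambda entry point; indexing into OpenSearch Service is left out."""
    documents = records_to_documents(event)
    return {"documents": len(documents)}
```

Keeping the parsing in a pure function separate from the handler makes it testable without AWS access; the step that writes the documents to OpenSearch Service is intentionally omitted here.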

Read the Performance Efficiency whitepaper 

Serverless services in this architecture, such as EventBridge, Amazon SQS, Lambda, and API Gateway, use pay-as-you-go pricing, meaning you pay only for the resources you actually use. We recommend using Amazon EC2 Reserved Instances (RIs) for OpenSearch Service, which can provide a discount of up to 72% compared to on-demand pricing. Additionally, Lambda is eligible for Compute Savings Plans, a flexible pricing model that can reduce costs by up to 66%.

Read the Cost Optimization whitepaper 

This architecture uses multiple serverless services, such as Amazon SQS, Lambda, EventBridge, and API Gateway, that offer automatic scaling based on demand. This helps ensure maximum utilization of resources. Although OpenSearch Service is a managed service, it can also be configured to scale based on changes in demand.

Read the Sustainability whitepaper 

Implementation resources

The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Open sample code on GitHub

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.

References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.