This Guidance helps customers create a single aggregation point for either an enterprise or a disparate collection of AWS accounts that host Amazon Simple Storage Service (Amazon S3) object data. Currently, customers cannot view object-level metadata across an entire organization or search for objects across S3 buckets or accounts. This architecture aggregates object PUT, DELETE, and GET calls into a searchable interface, so customers can search based on object tags, accounts, bucket names, and prefixes. With this search functionality, customers can identify which objects are not encrypted, find S3 buckets that have been inactive for a long period, search object tags, and see read requests on an object level.
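As an illustration of the kind of search described above, the following minimal sketch builds an OpenSearch query body that finds objects with no recorded encryption, optionally narrowed by account, bucket name, and prefix. The field names (`account`, `bucket`, `key`, `encryption`) are hypothetical and assume keyword-mapped fields; the index schema used by the Guidance itself may differ.

```python
def build_unencrypted_query(account_id=None, bucket=None, prefix=None):
    """Build an OpenSearch query DSL body for objects lacking an encryption value.

    All field names are illustrative assumptions, not the Guidance's schema.
    """
    filters = []
    if account_id:
        filters.append({"term": {"account": account_id}})
    if bucket:
        filters.append({"term": {"bucket": bucket}})
    if prefix:
        # Matches object keys that start with the given prefix.
        filters.append({"prefix": {"key": prefix}})
    return {
        "query": {
            "bool": {
                "filter": filters,
                # Documents with no "encryption" field are treated as unencrypted.
                "must_not": [{"exists": {"field": "encryption"}}],
            }
        }
    }


query = build_unencrypted_query(bucket="example-bucket", prefix="logs/")
```

The resulting dictionary could be sent as the body of a search request against whichever index holds the aggregated object metadata.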
Any account (1, 2, or N) generates an event for Amazon Simple Storage Service (Amazon S3) operations, such as GET, PUT, DELETE, or storage tier updates.
The event gets recorded in Amazon EventBridge and in an Amazon S3 logging bucket for the specific event source bucket.
EventBridge in the individual account (1, 2, or N) sends data to EventBridge in the AWS aggregation account, and the event is then forwarded to Amazon Simple Queue Service (Amazon SQS).
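The cross-account forwarding in this step can be sketched as an EventBridge rule in each source account whose target is the event bus in the aggregation account. The sketch below only builds the arguments that would be passed to the boto3 `events` client calls `put_rule` and `put_targets`; the rule name, bus ARN, role ARN, and the chosen list of detail types are assumptions for illustration.

```python
import json

# S3 event types to forward. This list is an illustrative assumption.
S3_DETAIL_TYPES = [
    "Object Created",
    "Object Deleted",
    "Object Storage Class Changed",
]


def build_forwarding_rule(rule_name, aggregation_bus_arn, role_arn):
    """Return (put_rule kwargs, put_targets kwargs) for a forwarding rule."""
    pattern = {"source": ["aws.s3"], "detail-type": S3_DETAIL_TYPES}
    put_rule_kwargs = {
        "Name": rule_name,
        "EventPattern": json.dumps(pattern),
        "State": "ENABLED",
    }
    put_targets_kwargs = {
        "Rule": rule_name,
        "Targets": [
            {
                "Id": "aggregation-bus",  # hypothetical target ID
                "Arn": aggregation_bus_arn,
                "RoleArn": role_arn,  # role allowed to put events on the bus
            }
        ],
    }
    return put_rule_kwargs, put_targets_kwargs


rule_kwargs, target_kwargs = build_forwarding_rule(
    "forward-s3-events",
    "arn:aws:events:us-east-1:111122223333:event-bus/aggregation",
    "arn:aws:iam::444455556666:role/forward-s3-events",
)
```

With a boto3 client, these would be applied as `events.put_rule(**rule_kwargs)` followed by `events.put_targets(**target_kwargs)`. The aggregation bus also needs a resource policy that permits the source accounts to send events to it.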
AWS Lambda functions process events from Amazon SQS in batches. Lambda functions issue HEAD requests to the source bucket to get the metadata of each object. To record GET requests, Lambda functions process log files from the logging buckets.
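The batch processing in this step can be sketched as follows, assuming each SQS record body is the JSON of an EventBridge S3 event and that the function has permission to call `head_object` on the source bucket. The output field names are hypothetical, and the S3 client is passed in as a parameter so that the logic can be exercised without AWS access.

```python
import json


def extract_documents(sqs_event, s3_client):
    """Turn a batch of SQS records into metadata documents, one per S3 event."""
    docs = []
    for record in sqs_event["Records"]:
        event = json.loads(record["body"])
        detail = event["detail"]
        bucket = detail["bucket"]["name"]
        key = detail["object"]["key"]
        doc = {
            "account": event.get("account"),
            "bucket": bucket,
            "key": key,
            "event_type": event["detail-type"],
        }
        # A deleted object can no longer be inspected, so skip the HEAD request.
        if event["detail-type"] != "Object Deleted":
            head = s3_client.head_object(Bucket=bucket, Key=key)
            doc["size"] = head.get("ContentLength")
            doc["encryption"] = head.get("ServerSideEncryption")
            # S3 omits the storage class header for S3 Standard objects.
            doc["storage_class"] = head.get("StorageClass", "STANDARD")
        docs.append(doc)
    return docs


def handler(event, context):
    """Hypothetical Lambda entry point; requires boto3 in the runtime."""
    import boto3

    return extract_documents(event, boto3.client("s3"))
```

Passing the client in as an argument is a design choice for testability; a production function would also need error handling for objects that are deleted between the event and the HEAD request.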
All data is pushed to an Amazon OpenSearch Service cluster, which hosts metadata for all objects.
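One way to push many documents at once is the OpenSearch `_bulk` API, which accepts newline-delimited JSON: an action line followed by a document line for each item. The sketch below builds such a request body; the index name, the document fields, and the choice of `bucket/key` as the document ID are assumptions for illustration, not the Guidance's actual schema.

```python
import json


def build_bulk_body(docs, index_name="s3-object-metadata"):
    """Build a newline-delimited JSON body for the OpenSearch _bulk API."""
    lines = []
    for doc in docs:
        # Using bucket/key as the ID lets later events overwrite earlier ones.
        doc_id = f"{doc['bucket']}/{doc['key']}"
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body(
    [{"bucket": "example-bucket", "key": "reports/q1.csv", "size": 1024}]
)
```

The body would be sent with a `POST` to the domain's `_bulk` endpoint using the `application/x-ndjson` content type and signed or authenticated according to the domain's access policy.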
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
You can set up Amazon CloudWatch metrics and alarms for OpenSearch Service to monitor CPU, memory, and storage. All other services in this architecture are serverless and managed by AWS.
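As a minimal sketch of such monitoring, the function below builds argument sets for the boto3 CloudWatch `put_metric_alarm` call covering CPU, JVM memory pressure, and free storage on an OpenSearch Service domain. The alarm names, periods, and thresholds are illustrative assumptions and should be tuned for your workload.

```python
def build_domain_alarms(domain_name, account_id):
    """Return a list of put_metric_alarm kwargs for an OpenSearch Service domain."""
    dimensions = [
        {"Name": "DomainName", "Value": domain_name},
        {"Name": "ClientId", "Value": account_id},
    ]
    # (metric, statistic, threshold, comparison); thresholds are assumptions.
    specs = [
        ("CPUUtilization", "Maximum", 80.0, "GreaterThanOrEqualToThreshold"),
        ("JVMMemoryPressure", "Maximum", 80.0, "GreaterThanOrEqualToThreshold"),
        # FreeStorageSpace is reported in megabytes; 20480 is about 20 GiB.
        ("FreeStorageSpace", "Minimum", 20480.0, "LessThanOrEqualToThreshold"),
    ]
    alarms = []
    for metric, statistic, threshold, comparison in specs:
        alarms.append(
            {
                "AlarmName": f"{domain_name}-{metric}",
                "Namespace": "AWS/ES",
                "MetricName": metric,
                "Dimensions": dimensions,
                "Statistic": statistic,
                "Period": 300,
                "EvaluationPeriods": 3,
                "Threshold": threshold,
                "ComparisonOperator": comparison,
            }
        )
    return alarms


alarms = build_domain_alarms("s3-metadata-search", "111122223333")
```

Each entry could then be applied with `cloudwatch.put_metric_alarm(**alarm)`, typically with an `AlarmActions` entry added to notify an operator.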
You can encrypt data in transit with Transport Layer Security (TLS). You can encrypt data at rest in OpenSearch Service, which is the only component in this architecture that stores data.
OpenSearch Service is the only service in this architecture that permanently stores data and that would require data recovery. OpenSearch Service takes hourly snapshots, which are backups of a cluster’s indexes and state. For disaster recovery, you can use these snapshots to create a new OpenSearch Service cluster.
EventBridge offers cross-account and cross-Region event shipping. Amazon SQS is purpose-built for queueing. Lambda and API Gateway are designed to scale automatically without requiring human intervention.
Serverless services in this architecture, such as EventBridge, Amazon SQS, Lambda, and API Gateway, use pay-as-you-go pricing, meaning you pay only for the resources you actually use. We recommend using Amazon EC2 Reserved Instances (RIs) for OpenSearch Service, which can provide a discount of up to 72% compared to on-demand pricing. Additionally, Lambda is eligible for Compute Savings Plans, a flexible pricing model that can reduce costs by up to 66%.
This architecture uses multiple serverless services, such as Amazon SQS, Lambda, EventBridge, and API Gateway, that offer automatic scaling based on demand. This helps ensure maximum utilization of resources. Although OpenSearch Service is a managed service, it can also be configured to scale based on changes in demand.
A detailed guide is provided for you to experiment with and use within your AWS account. It walks through each stage of building the Guidance, including deployment, usage, and cleanup.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.