AWS Compute Blog

Enhancing multi-account activity monitoring with event-driven architectures

Enterprise cloud environments are growing increasingly complex as they scale, with organizations managing hundreds to thousands of Amazon Web Services (AWS) accounts across multiple business units and AWS Regions. Organizations need efficient ways to collect, transport, and analyze activity data for threat detection and compliance monitoring. This presents unique challenges for enterprise Application Security (AppSec) teams, cloud security vendors, and DevSecOps professionals, because traditional polling-based monitoring approaches struggle to provide real-time activity insights needed for modern cloud operations.

In this post, you will learn to use AWS CloudTrail and Amazon EventBridge for real-time cloud activity monitoring and automated response.

Overview

As organizations expand their cloud footprint, account activity monitoring that comprehensively tracks user actions and successfully identifies security threats becomes crucial for threat detection and compliance. Although AWS provides native tools—such as CloudTrail for API activity capture, EventBridge for real-time event routing, AWS Organizations for multi-account management, and AWS Config for resource evaluation—many enterprises struggle with the volume of activities while maintaining efficiency and controlling costs. Organizations need to carefully architect solutions to effectively use these tools as their environments scale.

Traditional polling-based techniques, which worked well for smaller environments, can become unsustainable when scaled to enterprise deployments, where the volume of activity data grows exponentially with each new account and service. API polling limitations, growing data volumes, and increasing demand for real-time analysis are pushing teams to rethink their architectural approach.

Figure 1. Poll model, periodically retrieving the latest state.

Adopting push-based event-driven architectures offers a compelling solution for AppSec teams and cloud security vendors facing these challenges. Using AWS services, such as CloudTrail and EventBridge, allows these teams and vendors to build scalable activity monitoring solutions that overcome the limitations of traditional polling-based approaches and provide real-time notifications across thousands of AWS accounts. This approach not only enables security use cases but also supports broader real-time operational monitoring, compliance reporting, and automation requirements.

“By integrating AWS CloudTrail and Amazon EventBridge, we’ve built a scalable architecture to monitor activity across thousands of AWS accounts. This provides the visibility needed to detect threats in real time and protect large, distributed AWS environments.” — Rob Solomon, Senior Cloud Solution Architect, CrowdStrike

Solution components

Enterprise AppSec teams and cloud security vendors share common requirements when building multi-account monitoring solutions. They need to efficiently collect activity data across thousands of accounts, transport it to a centralized location for analysis, and process it in real-time to detect threats and compliance violations. The solution must scale seamlessly from dozens to thousands of accounts while remaining highly-performant and cost-efficient. At its core, a scalable multi-account activity monitoring solution consists of three components: activity data collection, cross-account transport to a centralized location, and processing. In the following sections, you will learn how AppSec teams and cloud security vendors can implement each step efficiently while avoiding common pitfalls.

Figure 2. Push model. Account activity is collected at the source, and pushed to the AppSec or cloud security vendor account for further processing.

Data collection strategies

Many teams begin their cloud activity monitoring journey by polling the resource status through service management APIs. Although this approach works good for retrieving the latest resource state on-demand, its fundamental limitation is inability to detect state changes efficiently, necessitating continuous querying of all resources at fixed intervals. Consider a scenario where you’re monitoring 1,000 accounts, with 100 resources in each account. A single polling cycle would necessitate 100,000 API requests, consuming over 28 million API calls daily if running at five-minute intervals. This inefficiency compounds as environments grow, leading to throttling issues, increased costs, and scaling challenges.

AWS Config improved upon this by offering continuous resource configuration tracking without manual polling. Although this works excellent for configuration compliance and a history of changes for auditing, AWS Config reports changes on a best-effort basis and is not optimal for real-time threat detection.

To overcome this constraint, your solution can use services such as CloudTrail and EventBridge as primary data sources, complemented by intelligent on-demand targeted API polling. CloudTrail records API activity across AWS services, providing a detailed history of actions taken by users, roles, and AWS services in your accounts. Over 250 AWS services automatically report their activity and API calls to CloudTrail and EventBridge in real-time. This allows you to capture this information, providing a detailed history of actions taken in your accounts, and enabling security analysis, resource state change tracking, and compliance audit.

Figure 3. Over 250 AWS services are automatically emitting activity events to CloudTrail.

When a resource state changes, commonly as a result of a management API call, the affected service sends an event to CloudTrail and EventBridge. Your monitoring solution can examine the event payload to determine if polling for supplementary data is necessary, particularly when the initial payload lacks complete information. This provides you with comprehensive service coverage with reduced maintenance effort. This hybrid approach guarantees delivery of activity data to eliminate monitoring blind spots, while significantly reducing AWS management API quota consumption.

Cross-account data transport

Your solution should transport activity data from thousands of tenant accounts into a small number of centralized accounts, such as a regional AppSec account, for further processing and analysis. The solution must be secure, scalable, resilient, and cost-efficient while maintaining real-time delivery.

The most direct way to achieve it is to enable Amazon S3 event notifications for new objects that are created in the CloudTrail trails S3 bucket. When you receive the notification, you can retrieve and process new activities.

Figure 4. Exporting CloudTrail events into an S3 bucket and retrieving after receiving a notification.

This direct way to consume CloudTrail events has one important consideration: typically it can take an average of five minutes to deliver events to Amazon S3. Teams and vendors looking for lower mean-time-to-detect (MTTD) and mean-time-to-respond (MTTR) should evaluate transporting CloudTrail events across accounts with EventBridge, which provides close-to-real-time delivery.

Transporting events with EventBridge

EventBridge is a serverless event router that connects applications. It receives events from various sources, such as CloudTrail, and routes them to multiple targets based on defined rules.

Using EventBridge for cross-account data transport comes with several major benefits:

There are two approaches you can take for delivering cross-account events with EventBridge: direct service-to-service or service-to-API-endpoint.

The first approach uses the EventBridge direct bus-to-bus and bus-to-service delivery capabilities. This method is most suitable when you want AWS to handle data ingestion on the receiving end. The delivery target is always either an EventBridge bus, or another AWS service, such as an Amazon Simple Queue Service (Amazon SQS) queue, Amazon Kinesis Data Streams stream, or an AWS Lambda function. With support for up to 18,750 target invocations per second and native AWS Identity and Access Management (IAM) integration, this method is particularly suitable for large multi-account deployments.

The second approach uses the EventBridge API destinations feature. This method is most suitable when you have existing HTTP-based ingestion endpoints in place. Although it offers lower throughput, it provides greater flexibility for ingestion endpoint and authentication methods implementation, making it attractive for AppSec teams and cloud security vendors integrating with existing ingestion infrastructure.

Figure 5. Emitting CloudTrail events in real-time through EventBridge.

The following table summarizes two approaches for transporting events across accounts with EventBridge.

Direct bus-to-bus or bus-to-service API destinations
Data ingestion implementation effort Minimal Needed
Default target invocations per second (TPS) quotas Up to 18,750 (region dependent) Up to 300 (region dependent)
Can the TPS quota be increased Yes Yes
Authorization support Native AWS IAM, fully handled by AWS Basic, OAuth2, API Key. You’re responsible for implementing credentials validation during ingestion.
Cross-account delivery costs $1 per million events $0.20 per million events

Go to the EventBridge quotas and pricing pages for more details.

Processing architecture

Processing would commonly be done by existing products and services the AppSec team or cloud security vendor provides for activity analysis. The architecture for event processing pipeline operating at enterprise scale must consider design decisions to handle large and potentially irregular event volumes while maintaining high performance, as shown in the following figure.

Figure 6. An activity event processing pipeline, with priority-based processing.

Use the following best practices for a robust processing architecture:

  • Buffer ingested events Use services such as Amazon SQS, Amazon Kinesis Data Streams, or Amazon Managed Streaming for Apache Kafka to buffer incoming events, handle traffic surges, and make sure of reliable processing.
  • Use serverless services that scale automatically, or invest in automated scaling mechanisms that adjust processing capacity based on event volume
  • Minimize polling: Resort to intelligent on-demand polling, only poll when you need additional data that is not available in the CloudTrail event payload.
  • Routing and classification: Rather than processing all events equally, implement intelligent classification and routing early in your pipeline. Security-related events such as IAM changes or security group modifications should take priority over routine activities or data events. This approach helps to control costs while maintaining rapid detection of important security events.
  • Cost optimization: At the enterprise scale, cost optimization becomes crucial. Use EventBridge rules in source accounts to filter out irrelevant events before they enter your processing pipeline. Consider implementing regional collection points to optimize data transfer costs. When using Lambda functions for data processing, use batch processing to reduce invocation costs. Evaluate which event types must be delivered in real-time through EventBridge, which event types can be delayed and collected through S3 bucket export, and which events should be discarded.
  • Observability: Monitor the ingestion and processing throughput to react to potential slowdowns early. Detect when source accounts are approaching EventBridge quotas. Consider using AWS Service Quotas to request quota increases automatically through APIs.
  • Cross-Region considerations: Design your architecture to support efficient cross-Region event collection while respecting data sovereignty requirements. Consider implementing regional processing nodes with centralized aggregation for global security analysis.
  • Integration patterns: Modern security solutions must integrate with existing security tools and workflows. Implement standardized output formats that allow seamless integration with SIEM systems, ticketing platforms, and automation frameworks. Consider publishing security findings back to EventBridge buses to enable automated remediation workflows. If you’re a cloud security vendor, then consider integrating with EventBridge as an SaaS partner.

Conclusion

Event-driven architectures present a powerful opportunity for building scalable multi-account activity monitoring solutions. Using services such as AWS CloudTrail and Amazon EventBridge allows teams to overcome the limitations of traditional polling-based approaches while achieving close to real-time delivery.The shift to event-driven security monitoring isn’t just an architectural choice—it’s becoming a necessity for teams operating at enterprise scale. This approach enables security teams to achieve the real-time threat detection capabilities needed in today’s cloud environments while maintaining operational efficiency and cost control.