AWS Cloud Operations Blog
Building a fully automated Dow Jones Asset Tracking System on AWS
Dow Jones is a global provider of news and business information, delivering content to consumers and organizations around the world across multiple formats, including print, digital, mobile and live events. Dow Jones has produced unrivaled quality content for more than 130 years and today has one of the world’s largest news gathering operations. It produces leading publications and products including the flagship Wall Street Journal, America’s largest newspaper by paid circulation; Factiva, Barron’s, MarketWatch, Mansion Global, Financial News, Dow Jones Risk & Compliance and Dow Jones Newswires.
The Challenge
Dow Jones (DJ) manages resources across multiple accounts in multiple AWS Regions. The DJ Operations team tracks Compute and Relational Database resources in their Asset Tracking system (a configuration management database or CMDB). This helps the Operations team to effectively monitor resource health, view resource utilization, perform cost analysis and identify resource owners to troubleshoot issues in a given environment. The Operations team was looking for an automated method to maintain resource details in the Asset Tracking system which spans across multiple AWS Regions and over multiple AWS accounts whenever a resource is provisioned, terminated or modified. The resources tracked include Amazon EC2 and AWS Lambda for Compute, and Amazon RDS for databases.
Previously, the Application teams would create requests for the Asset Management team, which would then add, update, or remove resources from the system manually. Asset inventory updates would only happen during business hours. While this method was adequate for on-premises work, the dynamic and elastic nature of AWS environments made it difficult to track and manage the resource inventory manually. As a result, the Operations team could not rely on the data in the manually managed Asset Tracking system.
The Solution
To overcome these issues, the DJ Operations team developed a fully automated Asset Tracking system to track resources as soon as they are provisioned, modified or terminated. DJ has a comprehensive tagging strategy in place that keeps track of the owner, team, application stack, lifecycle policies and environment. The automated tracking system uses these resource metadata tags to build a comprehensive view of all resources. Since tracking is no longer manual, the information that the Asset Tracking system holds about AWS resources stays up to date.
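As an illustration of how a tagging strategy like this can be checked in code, here is a minimal Python sketch. It assumes tags arrive in the `Key`/`Value` list shape that AWS describe calls return. The tag key names in `REQUIRED_TAG_KEYS` and both function names are hypothetical and are not taken from DJ's implementation.

```python
# Hypothetical tag keys mirroring the categories named above (owner, team,
# application stack, lifecycle policy, environment). The real key names
# used by DJ are not given in this post.
REQUIRED_TAG_KEYS = ("owner", "team", "application", "lifecycle", "environment")


def tags_to_dict(tag_list):
    """Flatten an AWS-style [{"Key": ..., "Value": ...}] list into a dict."""
    return {tag["Key"]: tag["Value"] for tag in tag_list or []}


def missing_tags(tag_list, required=REQUIRED_TAG_KEYS):
    """Return the required keys that are absent or have an empty value."""
    present = tags_to_dict(tag_list)
    return [key for key in required if not present.get(key)]
```

A check like this could run before a resource is written to the inventory, so that untagged resources are flagged rather than silently recorded without an owner.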
The DJ Operations team decided to use serverless architecture to avoid having to manage the resources used by the Asset Tracking system. The implemented solution leverages AWS CloudTrail, Amazon S3, AWS Lambda, Amazon SNS, Amazon SQS, Amazon CloudWatch, and IAM.
The following diagram illustrates the Asset Tracking system’s seven-step workflow.
- Enable AWS CloudTrail in all existing accounts across all AWS Regions. DJ’s Landing Zone blueprint includes enablement of CloudTrail for new accounts.
- Set up AWS CloudTrail to deliver all logs to a centralized Amazon S3 bucket that exists in an isolated account with restricted access to ensure integrity of the logs.
- The Amazon S3 bucket sends a notification to the Amazon SNS topic for every `PutObject` event. The Amazon SNS topic is used in case there is a future need for a fan-out option for parallel processing.
- The S3 processor Lambda function processes all incoming CloudTrail events. It parses them and forwards specific events, such as `create resource`, `update resource` and `terminate resource`, to an Amazon SQS queue. The `Resource Describer` Lambda function is meant for one-time processing in case of bulk processing of events.
- Once relevant events are filtered by the Lambda function, they are queued and persisted in Amazon SQS for asset inventory processing. This is a loosely coupled architecture, so applications can process events as they come in and move failed events to a dead letter queue to process and analyze later.
- CloudWatch scheduled events (`CloudWatch Event Rule`) poll the Amazon SQS queue once a minute for new events. Scheduled events also help avoid API limit issues. If the queue has new messages, the Lambda function (`SQS Poller`) invokes a processor Lambda function (`CMDB pusher`) with a payload of Amazon SQS messages. One Lambda function handles only one event from the queue. The Lambda function reads the event and, depending on resource type, runs the `describe` call on the resource from Amazon DynamoDB. The `describe` call collects all the tagging details for the Asset Tracking system and delivers resource details to the Asset Tracking system as an API payload. Once all the steps have completed successfully, the same Lambda function removes the Amazon SQS message from the queue.
- The Lambda `CMDB pusher` function updates the current state of the resources in DJ’s CMDB.
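The filtering step in this workflow can be sketched in Python as follows. This is a minimal illustration, not DJ's code. It relies on two facts about CloudTrail: log files are delivered to S3 as gzipped JSON with a top-level `Records` array, and failed API calls carry an `errorCode` field. The set of tracked event names and all function names are assumptions chosen for the example.

```python
import gzip
import json

# Assumed event names standing in for the "create", "update" and
# "terminate" resource events mentioned above. The post does not list
# the exact names that DJ filters on.
TRACKED_EVENT_NAMES = {
    "RunInstances",
    "TerminateInstances",
    "CreateDBInstance",
    "ModifyDBInstance",
    "DeleteDBInstance",
}


def parse_cloudtrail_log(raw_bytes):
    """Decompress one gzipped CloudTrail log file and return its records."""
    document = json.loads(gzip.decompress(raw_bytes))
    return document.get("Records", [])


def filter_tracked_events(records, tracked=TRACKED_EVENT_NAMES):
    """Keep only successful API calls whose eventName is tracked."""
    return [
        record
        for record in records
        if record.get("eventName") in tracked and "errorCode" not in record
    ]


def to_queue_messages(records):
    """Serialize each kept record as one queue message body."""
    return [json.dumps(record, sort_keys=True) for record in records]
```

Keeping these functions free of AWS client calls makes the filter easy to test locally. A real handler would fetch the object from S3 and send each message body to the queue.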
The Lambda functions were developed in Python, the infrastructure was provisioned with HashiCorp Terraform, and deployments ran through Jenkins.
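One detail of the workflow worth illustrating is that a message is removed from the queue only after the describe call and the CMDB update have both succeeded. The Python sketch below shows that ordering for a single message. The three callables passed in are hypothetical stand-ins for the real describe call, the Asset Tracking API client and the SQS delete call. The message keys (`Body`, `ReceiptHandle`) follow the shape of messages returned by SQS.

```python
import json


def handle_message(message, describe, push_to_cmdb, delete_message):
    """Process one queued event; delete it only after every step succeeds.

    The callables are hypothetical stand-ins:
      describe(event)        -> dict of resource details, tags included
      push_to_cmdb(payload)  -> sends the payload to the asset tracking API
      delete_message(handle) -> removes the message from the queue
    """
    try:
        event = json.loads(message["Body"])
        details = describe(event)
        push_to_cmdb({"eventName": event.get("eventName"), "resource": details})
    except Exception:
        # Leave the message on the queue so it is retried and can
        # eventually move to the dead letter queue.
        return False
    delete_message(message["ReceiptHandle"])
    return True
```

Because a failed message is never deleted, it becomes visible on the queue again, which is what allows a dead letter queue to collect events that repeatedly fail.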
An Alternate Solution
You can set up your own Asset Tracking system by running an AWS CloudFormation template in your own account. This alternate solution uses serverless architecture with automated steps. It tracks create and terminate events for Amazon EC2 instances and Amazon RDS databases. You can customize it for your own use cases to add more events.
In your account, AWS CloudFormation sets up CloudTrail, along with an Amazon S3 bucket for storing CloudTrail events; two Lambda functions for processing events from Amazon S3 and Amazon SQS; an Amazon SQS queue for filtered events; a DynamoDB table for asset tracking; and the required bucket policy, roles and permissions.
The following diagram illustrates this workflow:
- When a new Amazon EC2 or Amazon RDS instance is created, it generates a CloudTrail event for tracking.
- The cross-region CloudTrail event moves to a centralized Amazon S3 bucket for event processing. The Amazon S3 bucket publishes the `s3:ObjectCreated:Put` event to Lambda by invoking the Lambda function, as specified in the bucket notification configuration. Amazon S3 can invoke the function because the function’s access permissions policy grants it permission to do so.
- Lambda executes the `CloudTrailEventProcessor` Lambda function by assuming the execution role created by AWS CloudFormation. The function reads the Amazon S3 event it receives as a parameter, determines where the CloudTrail object is, reads the CloudTrail object and processes its log records. If the log includes a record with specific `eventType` values, the function publishes the event to your Amazon SQS queue for further processing.
- Once events arrive in the Amazon SQS queue, the queue triggers the `EventQueueProcessor` Lambda function to persist the created resource details.
- The `EventQueueProcessor` Lambda function captures the event from the queue, extracts the metadata that is critical for tracking the instance and sends the payload to DynamoDB.
- DynamoDB persists the resource details for the lifetime of the instance. When the resource is terminated, the item is removed from DynamoDB.
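Two pieces of this workflow can be sketched in Python: locating the CloudTrail object from the S3 event notification, and deciding whether a CloudTrail record should add or remove an inventory item. This is a minimal sketch that covers only Amazon EC2. The function names and item attribute names (`resourceId`, `resourceType`, `region`, `createdAt`) are assumptions made for the example and are not taken from the CloudFormation template.

```python
from urllib.parse import unquote_plus


def s3_object_locations(s3_event):
    """Return (bucket, key) pairs from an S3 event notification.

    Object keys arrive URL-encoded in the notification, so decode them.
    """
    return [
        (rec["s3"]["bucket"]["name"], unquote_plus(rec["s3"]["object"]["key"]))
        for rec in s3_event.get("Records", [])
    ]


def inventory_actions(cloudtrail_record):
    """Translate one CloudTrail record into ("put" | "delete", item) pairs."""
    name = cloudtrail_record.get("eventName")
    response = cloudtrail_record.get("responseElements") or {}
    instances = (response.get("instancesSet") or {}).get("items", [])
    if name == "RunInstances":
        return [
            ("put", {
                "resourceId": inst["instanceId"],
                "resourceType": "ec2",  # hypothetical attribute value
                "region": cloudtrail_record.get("awsRegion"),
                "createdAt": cloudtrail_record.get("eventTime"),
            })
            for inst in instances
        ]
    if name == "TerminateInstances":
        return [("delete", {"resourceId": inst["instanceId"]}) for inst in instances]
    return []  # every other event is ignored in this sketch
```

A real function would turn each pair into a DynamoDB put or delete request. Amazon RDS events could follow the same pattern using their own identifier fields.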
Conclusion
Using this approach, the Dow Jones Operations team is able to track their resource inventory automatically without manual intervention. Even the Asset Tracking system’s own resources are tracked by the system itself. Newly created, terminated or modified resources are updated in the inventory system within seven minutes. Above all, the data in the Asset Tracking system is always current.
About the authors
Sacheen Shah is a Lead Engineer at Dow Jones. He lives in New Jersey and helps engineering teams at Dow Jones find the best way to deploy their solutions in the cloud using serverless technology and modern methodology. When he isn’t working, he likes playing console games, watching sci-fi movies and spending time with his family.
Utsav Joshi is a Technical Account Manager at AWS. He lives in New Jersey and enjoys working with AWS customers in solving architectural, operational, and cost optimization challenges. In his spare time, he enjoys traveling, road trips and playing with his kids.