AWS Management & Governance Blog

Category: Analytics

Using AWS Systems Manager Run Command to submit Spark/Hadoop jobs on Amazon EMR

Many customers use Amazon EMR with Apache Spark to build scalable big data pipelines. For large-scale production pipelines, a common use case is to read complex data from a variety of sources. This data must be transformed to make it useful to downstream applications, such as machine learning pipelines, analytics dashboards, and business reports. Such […]

How to self-service manage AWS Auto Scaling groups and Amazon Redshift with AWS Service Catalog Service Actions

Some of the customers I work with provide AWS Service Catalog products to their end-users to enable self-service for launching and managing Amazon Redshift, EMR clusters or web applications at scale using AWS Auto Scaling groups. These end-users would like the ability to self-manage these resources, for example, be able to take a snapshot of […]

Analyzing Amazon VPC Flow Log data with support for Amazon S3 as a destination

In a world of highly distributed applications and increasingly bespoke architectures, data monitoring tools help DevOps engineers stay abreast of ongoing system problems. This post focuses on one such feature: Amazon VPC Flow Logs. In this post, I explain how you can deliver flow log data to Amazon S3 and then use Amazon Athena to […]

How to query your AWS resource configuration states using AWS Config and Amazon Athena

Tracking and managing the states of your AWS resources can be a challenge, especially as your account grows and you integrate with more and more AWS services. AWS Config is a service that helps make tracking your resources easy by continuously monitoring and recording your AWS resource configurations and maintaining a history of configuration changes […]

Automating the discovery of unused AWS Lambda functions

In 2017, Kyle Somers explained how you can gain visibility into the execution of your AWS Lambda functions in his blog post announcing AWS CloudTrail data events for AWS Lambda. In this post, I expand on Kyle’s post to show you how you can combine CloudTrail data events for AWS Lambda with the power […]

Analyzing Bitcoin Data: AWS CloudFormation Support for AWS Glue

The AWS CloudFormation team has been busy in the last couple of months, adding support for new resource types for recently released AWS services. In this post, I take a deep dive into using AWS Glue with CloudFormation. About AWS Glue: AWS Glue was first announced at re:Invent in 2016, and was made generally available […]

AWS CloudFormation Feature Updates: Support for Amazon Athena and Coverage Updates for Amazon S3, Amazon RDS, Amazon Kinesis and Amazon CloudWatch

As one of the most widely used services in AWS, CloudFormation continues to expand its feature set by adding support for Amazon Athena, two new features to protect stacks and control rollback processes, plus several new coverage updates. CloudFormation now supports the creation of an Amazon Athena named query as a resource. Amazon Athena is a […]
