AWS Cloud Operations & Migrations Blog

Category: Management & Governance

How to Monitor Databricks with Amazon CloudWatch

This post was written by Lei Pan and Sajith Appukuttan from Databricks. In this post, we look closely at monitoring and alerting systems – both critical components of any production-level environment. We’ll start with a review of the key reasons why engineers should build a monitoring/alerting system for their environment, the benefits, as well as […]

Read More

Deploy Multi-Account Amazon CloudWatch Dashboards

Organizations building modern applications require a way to gain actionable insights into their Amazon Elastic Compute Cloud (Amazon EC2) workloads. Amazon CloudWatch is a monitoring and observability service that collects operational data from logs, metrics, and events. The service lets customers monitor your resources spread across different accounts or regions in a single view, visualize […]

Read More

Resizing volumes and instances using ServiceNow and AWS

The AWS Service Management Connector for ServiceNow enables ServiceNow end users to provision, manage, and operate AWS resources natively through ServiceNow. This lets our customers connect a technical operation with a business workflow, perhaps requiring approvals from management or other teams. The key in all of this is empowering and enabling end-users, thereby removing manual […]

Read More

Mapping Microsoft SCCM compliance checks to AWS Config

Microsoft SCCM (System Center Configuration Manager) enables the management, deployment, and security of devices and applications. Compliance settings in Configuration Manager lets you manage configuration and compliance in your organization. As customers migrate their traditional workloads, they’re also looking for an AWS native solution that provides the flexibility to manage compliance and configuration management on […]

Read More

Viewing custom metrics from statsd with Amazon Managed Service for Prometheus and Amazon Managed Grafana

Monitoring applications based on custom metrics is important for a resilient system. One of the mechanisms to generate custom metrics from applications is statsd – a NodeJs process to collect custom application performance metrics periodically. However, statsd doesn’t provide long-term storage, rich querying, visualization, or an alerting solution. Amazon Managed Service for Prometheus and Amazon […]

Read More

Viewing collectd statistics with Amazon Managed Service for Prometheus and Amazon Managed Service for Grafana

Monitoring systems are essential for a resilient solution. A popular tool to monitor Linux-based physical or virtual machines is collectd – a daemon to collect system and application performance metrics periodically. However, collectd doesn’t provide long-term storage for metrics, rich querying, visualization, or an alerting solution. The Amazon Managed Service for Prometheus is a serverless […]

Read More

Managing your application metadata using AWS Service Catalog App Registry

Customers need a way to track all of their AWS application resources in one place, and associate metadata like cost center, business unit with those resources centrally. AWS Service Catalog AppRegistry removes the need for complex tag management and allows for customers to aggregate application metadata such as cost center and business units across multiple […]

Read More

Integrating existing AWS CloudTrail configurations when launching AWS Control Tower

The customers that we work with often use multiple AWS accounts to meet their business needs. These multi-account environments are built based on the guidelines that AWS published. Customers have created custom mechanisms using AWS Organizations, AWS CloudTrail, and other AWS services to implement the guidelines. AWS Created the AWS Control Tower service as a […]

Read More

DevOps automation for backup compliance in AWS using AWS Backup Audit Manager

Backup compliance in AWS includes defining and enforcing backup policies to encrypt your backups, protect them from manual deletion, prevent changes to your backup lifecycle settings, and audit and report on backup activity from a centralized console. AWS Backup Audit Manager, a feature within the AWS Backup service, provides built-in compliance controls for these areas. […]

Read More

What is observability and Why does it matter? – Part 1

Before defining observability, consider the following example: You run an e-commerce site, and you’re interested in understanding the customer experience of the site, as well as how that translates into sales. You have identified that long page-loading times lead to poor customer experience, which in turn leads customers to abandon their carts and buy competing […]

Read More