AWS Cloud Operations Blog

Category: Advanced (300)

How to integrate Amazon Managed Service for Prometheus with Slack

Amazon Managed Service for Prometheus is a serverless, Prometheus-compatible monitoring service for securely monitoring container environments at scale. It lets you use the open-source Prometheus query language (PromQL) to monitor containerized workload performance without having to manage the underlying infrastructure required for the ingestion, storage, alerting, and querying of […]

Using Amazon Managed Service for Prometheus Alert Manager to receive alerts with PagerDuty

Many customers using Amazon Managed Service for Prometheus are transitioning from their self-managed Prometheus systems to the fully managed service. During this transition, Amazon Managed Service for Prometheus users need ways to migrate their existing Prometheus and Alert Manager configurations. PagerDuty is a receiver used by many customers to route alerts to their internal […]

Using CloudTrail data events with Athena and CloudWatch to create an audit trail for DynamoDB tables events

Highly regulated industries must maintain an audit trail of events at various levels to meet regulatory and industry compliance requirements. Data events provide visibility into the resource operations performed on or in a resource, including object-level API activities such as delete, update, and put items. You can use AWS CloudTrail to create an audit trail […]

Using Prometheus Adapter to autoscale applications running on Amazon EKS

Automated scaling is an approach to scaling workloads up or down automatically based on resource usage. In Kubernetes, the Horizontal Pod Autoscaler (HPA) can scale pods based on observed CPU utilization and memory usage. In more complex scenarios, we would account for other metrics before making a scaling decision. For example, most web and mobile backends […]

Managing the account lifecycle in account-per-tenant SaaS environments on AWS

Software as a service (SaaS) companies have many options when they implement multi-tenancy in their applications. The AWS SaaS Factory Program provides recommendations for different deployment patterns depending on factors such as cost, compliance, and end-customer requirements. You might find that silo methods like VPC-per-tenant are not sufficient. Your application might be in a highly […]

Improve your application availability with AWS observability solutions

Distributed systems are complex due to their high number of interconnected components and susceptibility to failures caused by constant updates. Legacy monolithic applications can be distributed across instances and geographic locations or microservices. These rely on thousands of resources to operate and can be updated frequently, scaled elastically, or invoked on demand. In turn, these […]

Implementing a cross-account and cross-Region AWS Config status dashboard

AWS Config helps central IT administrators monitor the compliance of multiple AWS accounts and multiple Regions in large enterprises. AWS Config uses a configuration recorder to detect changes in your resource configurations and capture these as configuration items. A separate configuration recorder exists for every Region in each AWS account. However, AWS Config recorders can […]

Query and visualize Microsoft SQL Server license utilization using Amazon Athena and Amazon QuickSight

In part 1 of this two-part series, I showed you how to deploy a solution to centrally track Microsoft SQL Server licenses in AWS Organizations across multiple AWS accounts and Regions. In this post, I will show you how to query and visualize the aggregated Inventory data using Amazon Athena and Amazon QuickSight to centrally manage your SQL Server licenses. With […]

How Ryanair governs their image distribution using EC2 Image Builder

Ryanair Holdings plc, Europe’s largest airline group, is the parent company of Buzz, Lauda, Malta Air, and Ryanair. Before the COVID-19 pandemic, it carried 149 million guests on more than 2,500 daily flights from more than 80 bases. The Ryanair Group connects over 225 destinations in 37 countries on a fleet of 450 aircraft—and there […]

Collect, aggregate, and analyze Rancher Kubernetes Cluster logs with Amazon CloudWatch

Rancher is a popular open-source container management tool, used by many organizations, that provides an intuitive user interface for managing and deploying Kubernetes clusters on Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Compute Cloud (Amazon EC2). When Rancher deploys Kubernetes onto nodes in Amazon EC2, it uses Rancher Kubernetes Engine (RKE), which is Rancher’s […]