AWS Cloud Operations Blog

Category: Management Tools

Top Announcements for AWS Cloud Operations at re:Invent 2024

Figure 1. AWS launches new capabilities to help you transform your IT operations. At re:Invent 2024, Nandini Ramani, VP Search, Observability & Cloud Ops, provided a glimpse of how AWS is building the future of cloud operations. The four sections of this blog post cover the top AWS Cloud Operations announcements to help you transform […]

Streamlining AWS Organizations Cleanup Strategies

AWS Organizations provides capabilities for AWS customers to centrally manage accounts in their multi-account environment. As the business landscape evolves, customers may need to close multiple AWS accounts or an entire organization. This could take place during mergers and acquisitions, as part of cleanup efforts that reduce cost from unused resources, or when decommissioning a venture or […]

Monitor EBS Detailed Performance Statistics with Amazon Managed Service for Prometheus

Today we are excited to announce that you can now easily ingest Amazon EBS detailed performance statistics from your Amazon Elastic Kubernetes Service (Amazon EKS) workloads into an Amazon Managed Service for Prometheus workspace. We recently announced the availability of EBS detailed performance statistics, which gives you real-time visibility into the performance of your EBS […]

Manage AMI updates for AWS Auto Scaling groups with AWS Lambda and AWS Systems Manager

Keeping Amazon Machine Images (AMIs) up to date with the latest patches and updates is a critical task for organizations using AWS Auto Scaling groups. However, manually patching AMIs and updating Auto Scaling groups can be time-consuming and error-prone for your teams. This blog post presents a solution to automate the process of updating AMIs for […]

Leveraging existing tagging strategies for Application Operations

Customers often spend time finding and managing individual resources within their applications. They need to find various applications, manage and perform application tasks, and monitor resources during different stages of the application lifecycle. Customers usually have hundreds to thousands of resources within even a single AWS account. This requires navigating across multiple AWS service pages […]

How Cigna Implemented a Multi-Region Centralized Alerting System on AWS

This post is co-written with Nicolas Trettel, Cloud Engineering Senior Advisor at Cigna. Monitoring applications and alerting on issues is crucial for building resilient systems. Amazon CloudWatch is a service that monitors applications, responds to performance changes, optimizes resource use, and provides insights into operational health. By collecting data across AWS resources, CloudWatch gives visibility […]

Operational Best Practices for FedRAMP Compliance in AWS GovCloud with AWS Config

AWS Config is a fully managed service that provides customers with resource inventory, configuration monitoring, and configuration change notifications to support security, governance, and compliance for workloads in AWS. An AWS Config rule represents desired configurations for a resource, evaluates changes in near real time, and records the compliance history in AWS Config. Using AWS […]

Streamlining the Correction of Errors process using Amazon Bedrock

Generative AI can streamline the Correction of Errors process, saving time and resources. By applying large language models to the Correction of Errors process, businesses can expedite the identification and documentation of the cause of errors. Purpose and set-up: The purpose of this blog is […]

Scaling AWS Control Tower controls using Amazon Bedrock Agents

AWS Control Tower is the easiest way to set up and govern a secure, multi-account AWS environment. A key feature of AWS Control Tower is the ability to deploy and manage controls at scale across an entire organization in AWS Organizations. These controls are categorized based on their behavior and guidance. The behavior of each control is one of […]

How Stripe architected massive scale observability solution on AWS

This post is co-written with Cody Rioux, Staff Engineer at Stripe, and Michael Cowgill, Staff Engineer at Stripe. Stripe powers online and in-person payment processing and provides financial solutions for businesses of all sizes. Stripe operates a sophisticated microservice environment built on top of AWS. In this blog post we will cover the journey and […]