AWS Cloud Operations Blog
Category: Centralized operations management
Manage AMI updates for AWS Auto Scaling groups with AWS Lambda and AWS Systems Manager
Keeping Amazon Machine Images (AMIs) up to date with the latest patches and updates is a critical task for organizations using AWS Auto Scaling groups. However, manually patching AMIs and updating Auto Scaling groups can be time-consuming and error-prone for your teams. This blog post presents a solution to automate the process of updating AMIs for […]
Operations re:Imagined – Know Before You Go – AWS re:Invent 2024
We are so excited to see you at our annual cloud computing conference, AWS re:Invent 2024 in Las Vegas from Dec 2 to Dec 6. At this conference, you’ll have the opportunity to attend thought-provoking keynotes, dive deep into our services, and meet with fellow cloud enthusiasts! No matter your level of expertise, we’ll have sessions […]
Automate creating and onboarding applications with AWS CloudFormation tags and myApplications
Customers operate hundreds of applications, and those applications often consist of hundreds to thousands of resources. Monitoring and managing individual resources, and identifying which resources are tied to an application, can become complex and overwhelming while also making sure the applications are available, secure, cost-optimized, and performing optimally. The underlying concept of applications […]
Leveraging existing tagging strategies for Application Operations
Customers often spend time finding and managing individual resources within their applications. They need to find various applications, manage and perform application tasks, and monitor resources during different stages of the application lifecycle. Customers usually have hundreds to thousands of resources within even a single AWS account. This requires navigating across multiple AWS service pages […]
How Cigna Implemented a Multi-Region Centralized Alerting System on AWS
This post is co-written with Nicolas Trettel, Cloud Engineering Senior Advisor at Cigna. Monitoring applications and alerting on issues is crucial for building resilient systems. Amazon CloudWatch is a service that monitors applications, responds to performance changes, optimizes resource use, and provides insights into operational health. By collecting data across AWS resources, CloudWatch gives visibility […]
Streamlining the Correction of Errors process using Amazon Bedrock
Generative AI can streamline the Correction of Errors process. By applying large language models to the Correction of Errors process, businesses can expedite the identification and documentation of the cause of errors, saving time and resources. Purpose and set-up: The purpose of this blog is […]
Centralized monitoring and alerting for AWS Systems Manager Agent status on managed nodes across AWS Organization
Has the AWS Systems Manager Agent (SSM Agent) running on your critical servers, on premises or on Amazon Elastic Compute Cloud (Amazon EC2), ever lost its healthy connection to AWS Systems Manager (SSM), and did you want to be proactively notified when this happens? Do you wish to improve observability of your SSM Agent status and […]
Use AWS Systems Manager Automation runbooks to resolve Elastic Block Store related operational tasks
Customers have been using various forms of automation for years to define a sequence of actions on Amazon Elastic Block Store (EBS). Where customers previously faced operational overhead related to EBS tasks, AWS Systems Manager (SSM) Automations can now be used to meet a wide variety of customer use cases. In this blog post, a […]
Introducing Parameter Store cross-account sharing
Earlier this year, AWS Systems Manager Parameter Store launched a feature that allows you to share advanced parameters with other AWS accounts, enabling you to centrally manage your configuration data in a multi-account environment. Today, many customers have workloads in multiple AWS accounts that require shared, synchronized configuration data. Now, you can maintain a […]
Get Disk Utilization of Your Fleet Using AWS Systems Manager Custom Inventory Types
Some of my customers need assistance while operating their Amazon Elastic Compute Cloud (Amazon EC2) infrastructure. They need to: Review the disk usage of various volumes/disks within an EC2 instance. To do it in a scalable way, one does not need to access the instance either through a Remote Desktop Session (RDP) or use […]