AWS Cloud Operations & Migrations Blog

Tag: Management and Governance

Creating a correction of errors document

This blog post will walk you through an example of creating a Correction of Errors (COE) document. At Amazon, operational excellence is in our DNA. One best practice that we have learned at Amazon is to have a standard mechanism for post-incident analysis. The COE process facilitates learning from an event to avoid reoccurrences in […]

Using Tag-Based Filtering to Manage AWS Health Monitoring and Alerting at Scale

AWS provides customers regular updates of service notifications and planned activities via e-mail to the root account owners or the operational, security and billing contacts. AWS also provides granular notifications to customers via AWS Health allowing them to fine-tune their alerts on issues relating directly to them. Alongside Health Dashboard’s monitoring capabilities, customers can also […]

Monitor IoT device health at scale with Amazon Managed Grafana

Businesses today employ IoT devices to monitor the health of their equipment, ranging from machines on a factory floor to inventory tracking sensor locations. Insights from these IoT device fleets make them part of critical business infrastructure; however, deriving meaningful insights from these fleets at scale is a common challenge customers face. IT […]

AWS Health Events Intelligence Dashboards & Insights

Organizations operating mission-critical workloads on AWS need the ability to analyze and respond to AWS service events in a timely manner to maintain operational excellence. AWS Health sends AWS Health events on behalf of other AWS services in three main categories: notifications on account administration and security, operational issues that affect AWS services, and scheduled […]

Automate insights for your EC2 fleets across AWS accounts and regions

Gaining insights into and managing a large Amazon Elastic Compute Cloud (Amazon EC2) fleet that is spread across multiple accounts and regions can be a challenging task. It’s crucial to have a quick and efficient method to identify which instances are managed by AWS Systems Manager (SSM) and gather detailed information about the instances that are […]

Centralize AWS Cost Anomaly Detection using Amazon Managed Grafana

AWS Cost Anomaly Detection uses advanced machine learning to identify anomalous spend and root causes, empowering customers to take action quickly. Currently, viewing AWS cost anomalies in AWS Cost Explorer requires the user to have IAM user access privileges on the AWS Management Console. The ability to centrally monitor and […]

Setup memory metrics for Amazon EC2 instances using AWS Systems Manager

Amazon Elastic Compute Cloud (Amazon EC2) emits several metrics for your EC2 instance to Amazon CloudWatch. However, memory utilization isn’t one of the default metrics provided by Amazon EC2. Several memory-heavy applications, such as big data analytics, in-memory databases, and real-time streaming, require you to monitor memory utilization on the instances for operational visibility. These applications […]

Automated Evidence Collection for Life Sciences continuous compliance solutions using AWS Audit Manager

In the first post of this two-part series, we highlighted how Life Sciences customers can implement a controlled change management process using AWS Systems Manager Change Manager and AWS Config. The solution in our first post highlighted how you can follow your Standard Operating Procedures (SOPs) by implementing approval steps in order to make […]

Automating organizational policies with custom AWS Config Rules and evidence collection in AWS Audit Manager

AWS Config is a service that allows you to evaluate your AWS resources against a desired configuration state using AWS Config Rules. Two types of rules exist: managed rules, which are meant to be used out of the box, and custom rules, for which you define your desired configuration state via code. AWS Audit Manager can help you […]

Best practices for applying controls with AWS Control Tower

Enabling effective governance in a multi-account environment and aligning with AWS best practices and common compliance frameworks can be a complex endeavor. Many customers, particularly those operating in regulated industries, face the challenge of investing time and resources in identifying risks and developing their own controls to address service relationships and dependencies. This process can […]