AWS Cloud Operations Blog
Category: Management & Governance
Monitoring AWS Lambda errors using Amazon CloudWatch
When we troubleshoot failed invocations from our Lambda functions, we often must identify the invocations that failed (from among all of the invocations), identify the root cause, and reduce mean time to resolution (MTTR). In this post, we will demonstrate how to utilize Amazon CloudWatch to identify failed AWS Lambda invocations. Likewise, we will show how […]
Visualize Amazon EC2 based VPN metrics with Amazon CloudWatch Logs
Organizations have many options for connecting to on-premises networks or third parties, including AWS Site-to-Site VPN. However, some organizations still need to use an Amazon Elastic Compute Cloud (Amazon EC2) instance running VPN software, such as strongSwan. Gaining insight into Amazon EC2-based VPN metrics can be challenging when compared to AWS native VPN services that […]
Create metrics and alarms for specific web pages with Amazon CloudWatch RUM
Amazon CloudWatch RUM makes it easy for AWS customers to access real-world performance metrics from web applications, thereby giving insights into the end-user experience. These user experiences are quantified into discrete metrics that you can then create alarms for. But what if you must have different load time alarms for certain pages? Or you’re testing […]
Proactive autoscaling of Kubernetes workloads with KEDA using metrics ingested into Amazon Managed Service for Prometheus
UPDATE: This blog post has been published to include information about the recently added support for KEDA with the Amazon Managed Service for Prometheus (AMP).” Orchestration platforms such as Amazon EKS and Amazon ECS have simplified the process of building, securing, operating, and maintaining container-based applications, thereby helping organizations focus on building applications. We simplified this further […]
How to fix SSH issues on EC2 Linux instances using AWS Systems Manager
In a previous blog post, we provided a walkthrough of how to fix unreachable Amazon EC2 Windows instances using the EC2Rescue for Windows tool. In this blog post, I will walk you through how to utilize EC2Rescue for Linux to fix unreachable Linux instances. This Knowledge Center Article describes how EC2Rescue for Linux can be used to […]
Identify operational issues quickly by using Grafana and Amazon CloudWatch Metrics Insights (Preview)
Amazon CloudWatch has recently launched Metrics Insights (Preview) – a fast, flexible, SQL-based query engine that enables you to identify trends and patterns across millions of operational metrics in real-time. With Metrics Insights, you can easily query and analyze your metrics to gain better visibility into the health and performance of your infrastructure and large scale […]
Introducing AWS AppConfig Feature Flags In Preview
Update (15 March 2022): AWS AppConfig Feature Flags are now generally available. The information below is still correct, but additional information can be found in the link at the end of this blog post. Modern DevOps practices require development teams to continuously iterate their applications based on customer feedback. These iterations are mostly comprised of […]
Monitoring Service Level Objectives (“SLOs”) Made Easier with Nobl9 and Amazon CloudWatch Metrics Insights
The updated version (June 2022) that follows is based on working backward from a customer need to understand Service Level Objectives (“SLOs”) and the benefits from monitoring SLOs. This post was originally written in Nov 2021 by Natalia Sikora-Zimna, Product Owner at Nobl9. A service can be provided by infrastructure, a platform, software, or people. […]
AWS attendee guide for Cloud Operations track at re:Invent 2021
AWS re:Invent is a learning conference hosted by Amazon Web Services (AWS) for the global cloud computing community. We are super excited to join you at the 10th annual re:Invent to share the latest from AWS leaders and discover more ways to learn and build. Let’s celebrate this milestone which will be offered in person […]
How to validate authentication using Amazon CloudWatch Synthetics – Part 2
In the second post of this two-part series, I will demonstrate how to utilize the Amazon CloudWatch Synthetics canary that uses the multiple HTTP endpoints blueprint in order to monitor an application requiring an authentication certificate. The first post Multi-step API monitoring using Amazon CloudWatch Synthetics provided steps to create an Amazon CloudWatch Synthetics script for executing a […]