AWS Cloud Operations Blog
Category: Monitoring and observability
How McAfee used Amazon CloudWatch to monitor a multi-PB data migration to Databricks on AWS
This blog post was contributed by Kanishk Mahajan@AWS; Hashem Raslan, Manager, Engineering@McAfee; Anastasia Zamyshlyaeva, Vice President, Data Engineering@McAfee McAfee, a global leader in online protection security enables home users and businesses to stay ahead of fileless attacks, viruses, malware, and other online threats. McAfee wanted to create a centralized data platform as a single source […]
Integrate Amazon CloudWatch metrics with ServiceNow using Amazon Managed Grafana
ServiceNow ITSM is a cloud-based platform designed to improve IT services, increase user satisfaction, and boost IT flexibility and agility. With ServiceNow IT Service Management, you can consolidate your legacy on-premise systems and IT tools into our single data model to transform the service experience, automate workflows, gain real-time visibility, and improve IT productivity. Amazon […]
Monitoring Amazon EMR on EKS with Amazon Managed Prometheus and Amazon Managed Grafana
Apache Spark is an open-source lightning-fast cluster computing framework built for distributed data processing. With the combination of Cloud, Spark delivers high performance for both batch and real-time data processing at a petabyte scale. Spark on Kubernetes is supported from Spark 2.3 onwards, and it gained a lot of traction among enterprises for high performance and […]
Quantify custom application metrics with Amazon CloudWatch Logs and metric filters
Customers have valuable metrics emitted to their logs. Examples include web server response times, slow queries, purchases by partners, custom application metrics, and cache hits or misses. This data has unrealized potential value for increasing observability. Consumed by Amazon CloudWatch Logs and extracted using metric filters, customers can translate this data into actual CloudWatch metrics, […]
Automate time series network visualizations for AWS PrivateLink using Amazon CloudWatch Contributor Insights
AWS PrivateLink is a highly available, scalable technology that lets you connect your Amazon Virtual Private Cloud (VPC) to supported AWS services without requiring public internet traversal. It also lets you privately connect to services hosted by other AWS accounts (VPC endpoint services) and supported AWS Marketplace partner services. Amazon CloudWatch Contributor Insights is a […]
Monitoring underlying hardware failures for EC2 instances by logging them with Amazon OpenSearch Service
With Amazon Elastic Compute Cloud (Amazon EC2) you can spin up a virtual server or instance of various sizes that run on system composed of server, storage, and network hardware. AWS uses status checks to monitor the system on which an EC2 instance runs and detects underlying problems with your instance. These checks are performed […]
How to enable Amazon CloudWatch Alarms to send repeated notifications
Amazon CloudWatch Alarms is natively integrated with Amazon CloudWatch metrics. Many AWS services send metrics to CloudWatch, and AWS also offers many approaches that let you emit your applications’ metrics as custom metrics. CloudWatch Alarms let you monitor the metrics changes when crossing a static threshold or falling out of an anomaly detection band. Furthermore, […]
Use Amazon Cloud Watch math expressions and composite alarms for detailed monitoring of AWS Elastic Load Balancers
AWS Elastic Load Balancing encompasses the following load balancers in AWS: Application Load Balancers, Network Load Balancers, Gateway Load Balancers, and Classic Load Balancers. The load balancer serves as a single contact point for clients and it distributes incoming traffic across multiple targets such as EC2 instances as well as it is crucial to monitor […]
An Observability Journey with Amazon CloudWatch RUM, Evidently, and ServiceLens
Observability means more than just monitoring. At AWS, we consider observability to be an integral component of healthy and secure operations. Two of the newest features of Amazon CloudWatch that enhance observability into your application’s health and operations are Amazon CloudWatch RUM and Amazon CloudWatch Evidently. In this post, we will take you through a […]
Extending and exploring alarm history in Amazon CloudWatch – part 1
Alarm history data can be invaluable in diagnosing trends, impacts and root causes for issues in your application. In this two-part blog series, we will demonstrate how to move beyond the standard 14 day alarm history, and turn your Amazon CloudWatch alarm state changes into logs and metrics that you can graph on your CloudWatch […]