AWS Cloud Operations Blog
Tag: Amazon CloudWatch
Use Amazon CloudWatch Contributor Insights for general analysis of Apache logs
Customers build, deploy, and maintain millions of web applications on AWS and many customers deploy these applications using the Apache web application server. Web application performance is a key metric in modern enterprise applications. On AWS customers leverage Amazon CloudWatch to monitor response times, uptime, and provide SLAs. Engineering teams that run large scale applications […]
Gain operational insights for NVIDIA GPU workloads using Amazon CloudWatch Container Insights
As machine learning models grow more advanced, they require extensive computing power to train efficiently. Many organizations are turning to GPU-accelerated Kubernetes clusters for both model training and online inference. However, properly monitoring GPU usage is critical for machine learning engineers and cluster administrators to understand model performance and to optimize infrastructure utilization. Without visibility […]
Automate CloudWatch Dashboard creation for your AWS Elemental Mediapackage and AWS Elemental Medialive
Introduction Monitoring the health and performance of your media services is critical to ensuring a seamless viewing experience for your customers. Amazon CloudWatch provides powerful monitoring capabilities for Amazon Web Services (AWS) resources. Setting up comprehensive dashboards can be a time-consuming process, especially for organizations managing large number of resources across multiple regions. The Automatic CloudWatch […]
Testing Amazon Cognito backed APIs using Amazon CloudWatch Synthetics
Customers who develop APIs can control access to them using Amazon Cognito user pools as an authorizer. Testing these APIs should take into account the additional security controls in place to effectively validate that the APIs are working, and Amazon CloudWatch Synthetics enables proactive testing of these APIs. If you are using Amazon Cognito User […]
Improve application reliability with effective SLOs
At AWS, we consider reliability as a capability of services to withstand major disruptions within acceptable degradation parameters and to recover within an acceptable timeframe. Service reliability goes beyond traditional disciplines, such as availability and performance, to achieve its goal. Components of a system or application will eventually fail over time. Like our CTO Werner Vogels […]
Respond to CloudWatch Alarms with Amazon Bedrock Insights
Overview When operating complex, distributed systems in the cloud, quickly identifying the root cause of issues and resolving incidents can be a daunting task. Troubleshooting often involves sifting through metrics, logs, and traces from multiple AWS services, making it challenging to gain a comprehensive understanding of the problem. So how can you streamline this process […]
Monitor Java apps running on Tomcat server with Amazon CloudWatch Application Signals (Preview)
Traditionally, Java web applications are packaged into Web Application Resource (WAR) files, which can be deployed on any Servlet/JSP container like Tomcat server. These applications often operate within distributed environments, involving multiple interconnected components such as databases, external APIs, and caching layers. Monitoring the performance and health of Java web applications can be challenging due […]
Testing and debugging Amazon CloudWatch Synthetics canary locally
Introduction Amazon CloudWatch Synthetics canaries are scripts that monitor your endpoints and APIs by simulating the actions of a user. These canaries run on a schedule, check the availability and latency of your applications, and alert you when there are issues. Canary scripts are written in Node.js and Python, and they run inside an AWS […]
Monitor Python apps with Amazon CloudWatch Application Signals (Preview)
AWS announced Amazon CloudWatch Application Signals during re:Invent 2023. It is a new feature to monitor and understand the health of Java applications. Today we are excited to announce that Application Signals now supports Python applications. Enabling Application Signals allows you to use AWS Distro for OpenTelemetry (ADOT) to instrument Python applications without code changes. […]
Using the unified CloudWatch Agent to send traces to AWS X-Ray
Today, applications are more distributed than ever before and they no longer run in isolation. This is especially the case when utilizing Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). A distributed workload or system is one that encompasses multiple small independent components, all working together to complete a task or job. […]