AWS Cloud Operations Blog
Tag: Monitoring
Improve Amazon Bedrock Observability with Amazon CloudWatch AppSignals
Given the pace of innovation in generative AI applications, there is increasing demand for more granular observability into applications that use large language models (LLMs). Specifically, customers want visibility into: prompt metrics such as token usage, costs, and model IDs for individual transactions and operations, in addition to service-level aggregations; output quality factors including potential toxicity, harm, truncation […]
Automate CloudWatch Dashboard creation for your AWS Elemental MediaPackage and AWS Elemental MediaLive
Monitoring the health and performance of your media services is critical to ensuring a seamless viewing experience for your customers. Amazon CloudWatch provides powerful monitoring capabilities for Amazon Web Services (AWS) resources. Setting up comprehensive dashboards can be a time-consuming process, especially for organizations managing a large number of resources across multiple Regions. The Automatic CloudWatch […]
Enhancing observability with a managed monitoring solution for Amazon EKS
Keeping a watchful eye on your Kubernetes infrastructure is crucial for ensuring optimal performance, identifying bottlenecks, and troubleshooting issues promptly. In the ever-evolving world of cloud-native applications, Amazon Elastic Kubernetes Service (EKS) has emerged as a popular choice for deploying and managing containerized workloads. However, monitoring Kubernetes clusters can be challenging due to their […]
Respond to CloudWatch Alarms with Amazon Bedrock Insights
When operating complex, distributed systems in the cloud, quickly identifying the root cause of issues and resolving incidents can be a daunting task. Troubleshooting often involves sifting through metrics, logs, and traces from multiple AWS services, making it challenging to gain a comprehensive understanding of the problem. So how can you streamline this process […]
Monitor Java apps running on Tomcat server with Amazon CloudWatch Application Signals (Preview)
Traditionally, Java web applications are packaged into Web Application Resource (WAR) files, which can be deployed on any Servlet/JSP container like Tomcat server. These applications often operate within distributed environments, involving multiple interconnected components such as databases, external APIs, and caching layers. Monitoring the performance and health of Java web applications can be challenging due […]
Testing and debugging Amazon CloudWatch Synthetics canary locally
Amazon CloudWatch Synthetics canaries are scripts that monitor your endpoints and APIs by simulating the actions of a user. These canaries run on a schedule, check the availability and latency of your applications, and alert you when there are issues. Canary scripts are written in Node.js or Python, and they run inside an AWS […]
VTEX scales to 150 million metrics using Amazon Managed Service for Prometheus
VTEX is a multi-tenant platform with a distributed engineering operation. Efficiently observing hundreds of services in real time is a technical challenge for the business. In this blog post, we will show how VTEX created a resilient open source-based architecture aligned with a sharding strategy, using Amazon Managed Service for Prometheus (AMP) to […]
Monitor your AWS resources on your mobile device with AWS Console Mobile Application
AWS customers are increasingly relying on AWS User Notifications to monitor and get real-time notifications about the AWS resources that are most important to them. The AWS Console Mobile Application can be configured as a notification delivery channel, where users can monitor AWS resources, get detailed resource notifications, diagnose issues, and take remedial actions, from […]
Optimize AWS Resource Management with Tag Inventory Reports leveraging AWS Resource Explorer
Customers are increasingly seeking an efficient solution to manage their expanding AWS resources, spanning AWS accounts and Regions, amidst changes like mergers, acquisitions, and cloud migrations. AWS Tags offer an effective solution for organizing, identifying, and filtering resources by categorizing them based on criteria such as purpose, owner, or environment. AWS customers would like to […]
Multi-tenant monitoring across accounts and regions using Amazon Managed Service for Prometheus
In this guest blog post, Nauman Noor (Managing Director), Fabio Dias (Cloud Developer), and Dylan Alibay (Cloud Developer) from the platform engineering team at State Street discuss their use of Amazon Managed Service for Prometheus and AWS Distro for OpenTelemetry to enable monitoring in a multi-tenant, multi-account, and multi-region environment. In the ever-evolving financial services landscape, State […]