AWS Cloud Operations Blog

Arun Chandapillai

Author: Arun Chandapillai

Arun Chandapillai is a Senior Cloud Architect and a diversity and inclusion champion. He is passionate about helping his customers accelerate IT modernization through business-first cloud adoption strategies and successfully build, deploy, and manage applications and infrastructure in the cloud. Arun is an automotive enthusiast, an avid speaker, and a philanthropist who believes in ‘you get (back) what you give’.

Amazon CloudWatch Application Signals new enhancements for application monitoring

Today, we’re excited to announce new enhancements to Amazon CloudWatch Application Signals that simplify how you monitor large-scale distributed applications. The improved CloudWatch Application Signals application map automatically discovers and organizes services into groups based on their relationships, with support for custom grouping that aligns with your business perspective. You can now view the […]

Using Amazon Bedrock and Amazon Nova for AI-Powered Incident Response

In today’s cloud-native world, incident response teams face overwhelming challenges. When critical applications fail, engineers must sift through mountains of observability data across multiple services, all while under intense pressure to restore service quickly. This manual correlation process is time-consuming, error-prone, and often delays resolution, resulting in extended outages and frustrated customers. Traditional monitoring tools […]

Visualizing Amazon DynamoDB data with Amazon OpenSearch Service and Amazon Managed Grafana

High-performance applications with unlimited throughput capabilities pose significant monitoring challenges, especially when tracking real-time metrics, utilization, and throttling events across distributed database workloads. Near real-time visibility into metrics is crucial for application performance and cost optimization. AWS allows you to seamlessly integrate multiple services to tackle these operational complexities. With Amazon DynamoDB, you can build […]

Centralize observability with Amazon Managed Grafana Enterprise plugins

Observability is a critical aspect of maintaining the health and performance of any distributed system. Organizations rely on data from diverse sources, including AWS services as well as third-party independent software vendors (ISVs), to gain insights into their system’s health. Establishing secure connections to these diverse data sources enables visualization and analysis of observability data […]

Improve application reliability with effective SLOs

At AWS, we consider reliability to be the capability of a service to withstand major disruptions within acceptable degradation parameters and to recover within an acceptable timeframe. Service reliability goes beyond traditional disciplines, such as availability and performance, to achieve its goal. Components of a system or application will eventually fail over time. Like our CTO Werner Vogels […]

Extend your Amazon Managed Grafana experience with Grafana community plugins

Today, Amazon Managed Grafana announces a new self-service plugin management experience for Grafana community plugins that enables you to unify data from a wider variety of data sources with visualizations tailored to analyze your unique datasets. Grafana community plugins provide an expansive array of tailor-made solutions to address diverse visualization use cases. With this release, […]

Announcing Amazon Managed Grafana workspace version selection with version 9.4 support

Many customers who use Amazon Managed Grafana have requested the ability to choose a Grafana version with the latest product features, including navigation, dashboards, and visualizations. Today, we are announcing Amazon Managed Grafana workspace version selection with version 9.4 support. Since the product was launched, Amazon Managed Grafana maintained a single version offering globally. […]

Announcing inbound network access control in Amazon Managed Grafana

Many customers who use Amazon Managed Grafana need to restrict public access to the Grafana workspace and have fine-grained control over which traffic sources can reach it. Today, we are announcing Amazon Managed Grafana’s new feature that supports inbound network access control. This enables you to secure Grafana workspaces using VPC […]

Implement AWS resource tagging strategy using AWS Tag Policies and Service Control Policies (SCPs)

AWS lets us assign metadata to AWS resources in the form of tags. Each tag is a simple label consisting of a customer-defined key and a value that makes it easier to manage, search for, and filter AWS resources. Tagging can be an effective scaling mechanism for implementing cloud management and governance strategies. Tags […]

How to integrate Amazon Managed Service for Prometheus with Slack

Amazon Managed Service for Prometheus is a serverless, Prometheus-compatible monitoring service for metrics that lets you securely monitor container environments at scale. Amazon Managed Service for Prometheus lets you use the open source Prometheus query language (PromQL) to monitor containerized workload performance without having to manage the underlying infrastructure required for the ingestion, storage, alerting, and querying of […]