AWS Cloud Operations Blog

Category: *Post Types

Analyzing your custom metrics spend contributors in Amazon CloudWatch

With an ever-growing volume of custom metrics in Amazon CloudWatch, customers often find it difficult to understand and manage their spend on this service. One of the most common questions they have is how to identify which metrics contribute the most to their spend in CloudWatch. This blog post introduces a solution that lets you […]

Enhanced dashboard, latency suggestions in Amazon CloudWatch Internet Monitor

Amazon CloudWatch Internet Monitor provides near-continuous internet measurements for your internet traffic, including availability and performance metrics, tailored to your specific workload footprint on AWS. With Internet Monitor, you can get insights into average internet performance metrics over time, as well as get alerts for issues (health events). You’re notified about events that impact your end […]

Centrally detect and investigate security findings with AWS Organizations integrations

Detecting security risks and investigating the corresponding findings is essential for protecting your AWS environment from potential threats, ensuring the confidentiality, integrity, and availability of your data and resources for your business needs. As shown in Image 1, effective incident response follows a systematic approach of identifying, detecting, investigating, prioritizing, and resolving security findings. By analyzing […]

Automating metrics collection on Amazon EKS with Amazon Managed Service for Prometheus managed scrapers

Managing and operating monitoring systems for containerized applications, including metrics collection, can be a significant operational burden for customers. As container environments scale, customers have to split metric collection across multiple collectors, right-size the collectors to handle peak loads, and continuously manage, patch, secure, and operationalize these collectors. This overhead can detract from an […]

Ingesting administrative logs from Microsoft Azure to AWS CloudTrail Lake

In January 2023, AWS announced support for ingesting activity events from non-AWS sources using CloudTrail Lake, making CloudTrail Lake a single location of immutable user and API activity events for auditing and security investigations. AWS CloudTrail Lake is a managed data lake for capturing, storing, accessing, and analyzing user and API activity on […]

Enable cloud operations workflows with generative AI using Agents for Amazon Bedrock and Amazon CloudWatch Logs

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible […]

Serverless Governance of Software Deployed with AWS Service Catalog

AWS Service Catalog (Service Catalog) is a powerful tool that empowers organizations to manage and govern approved services and resources. It significantly benefits platform engineering by standardizing environments, accelerating service delivery, and enhancing security. With its automated provisioning and resource management, Service Catalog supports infrastructure as code, enabling scalable, reliable deployments. Platform engineering teams are […]

How Amazon CloudWatch Logs Data Protection can help detect and protect sensitive log data

Customer applications running on Amazon Web Services (AWS) often require handling sensitive data such as personally identifiable information (PII) or protected health information (PHI). As a result, sensitive log data can be intentionally or unintentionally logged as part of an application’s observability data. While comprehensive logging is important for application troubleshooting, monitoring and forensics, any […]

Leveraging AWS CloudTrail Insights for Proactive API Monitoring and Cost Optimization

AWS CloudTrail Insights is a powerful feature within AWS CloudTrail that helps organizations identify and respond to unusual operational activity in their AWS accounts. This includes identifying spikes in resource provisioning, bursts of IAM actions, or gaps in periodic maintenance activity. CloudTrail Insights continuously analyzes CloudTrail management events from trails and event data stores, establishing […]

Assess Resilience at Scale by using Amazon QuickSight and Amazon Resilience Hub

AWS Resilience Hub helps you to manage and improve the resilience posture of your applications on AWS. It enables you to define your resilience goals, assess your resilience posture against those goals, and implement recommendations for improvement based on the AWS Well-Architected Framework. This benefits individual teams that want to assess their applications. However, for […]