AWS Cloud Operations Blog
Category: Technical How-to
Enhanced dashboard, latency suggestions in Amazon CloudWatch Internet Monitor
Amazon CloudWatch Internet Monitor provides near-continuous internet measurements for your internet traffic, including availability and performance metrics, tailored to your specific workload footprint on AWS. With Internet Monitor, you can get insights into average internet performance metrics over time, as well as get alerts for issues (health events). You’re notified about events that impact your end […]
Centrally detect and investigate security findings with AWS Organizations integrations
Detecting security risks and investigating the corresponding findings is essential for protecting your AWS environment from potential threats, ensuring the confidentiality, integrity, and availability of your data and resources for your business needs. AWS provides a range of governance and security services such as AWS Organizations, AWS Control Tower, and AWS Config along with many others, […]
Automating metrics collection on Amazon EKS with Amazon Managed Service for Prometheus managed scrapers
Managing and operating monitoring systems for containerized applications can be a significant operational burden for customers such as metrics collection. As container environments scale, customers have to split metric collection across multiple collectors, right-size the collectors to handle peak loads, and continuously manage, patch, secure, and operationalize these collectors. This overhead can detract from an […]
Ingesting administrative logs from Microsoft Azure to AWS CloudTrail Lake
In January 2023, AWS announced the support of ingestion for activity events from non-AWS sources using CloudTrail Lake. Making CloudTrail Lake a single location of immutable user and API activity events for auditing and security investigations. AWS CloudTrail Lake is a managed data lake for capturing, storing, accessing, and analyzing user and API activity on […]
Enable cloud operations workflows with generative AI using Agents for Amazon Bedrock and Amazon CloudWatch Logs
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible […]
Serverless Governance of Software Deployed with AWS Service Catalog
AWS Service Catalog (Service Catalog) is a powerful tool that empowers organizations to manage and govern approved services and resources. It significantly benefits platform engineering by standardizing environments, accelerating service delivery, and enhancing security. With its automated provisioning and resource management, Service Catalog supports infrastructure as code, enabling scalable, reliable deployments. Platform engineering teams are […]
How Amazon CloudWatch Logs Data Protection can help detect and protect sensitive log data
Customer applications running on Amazon Web Services (AWS) often require handling sensitive data such as personally identifiable information (PII) or protected health information (PHI). As a result, sensitive log data can be intentionally or unintentionally logged as part of an application’s observability data. While comprehensive logging is important for application troubleshooting, monitoring and forensics, any […]
Assess Resilience at Scale by using Amazon QuickSight and Amazon Resilience Hub
AWS Resilience Hub helps you to manage and improve the resilience posture of your applications on AWS. It enables you to define your resilience goals, assess your resilience posture against those goals, and implement recommendations for improvement based on the AWS Well-Architected Framework. This benefits individual teams that want to assess their applications. However, for […]
Using Generative AI to Gain Insights into CloudWatch Logs
Have you ever been investigating a problem and opened up a log file and thought “I have no idea what I am looking at. If only I could get a summary of the data.” Observability and log data play an important role in maintaining operational excellence and ensuring the reliability of your applications and services. […]
Deploy AWS Systems Manager Quick Setup programmatically across your AWS Organization
AWS Systems Manager Quick Setup simplifies setting up AWS services, including Systems Manager, by automating common or recommended tasks in your AWS Organization across AWS accounts and Regions. These tasks include, creating required AWS Identity and Access Management (IAM) instance profile roles and setting up operational best practices, such as periodic patch scans and inventory […]