AWS Cloud Operations Blog

Category: Technical How-to

Using AWS CloudTrail data events to audit your Amazon SNS and Amazon SQS workloads

Customers in highly regulated industries, such as Financial Services or Healthcare and Life Sciences, often need to audit every action made in environments with sensitive data. Regulations like HIPAA or FFIEC, and industry frameworks like the PCI DSS, require granular log entries that record user and administrative actions within an environment containing sensitive data, and […]

Elevating Your AWS Observability: Unlocking the Power of Amazon CloudWatch Alarms

Organizations commonly leverage AWS services to enhance the observability and operational excellence of their workloads. However, it is often unclear what actions teams should take when observability metrics are delivered to them, and it can be difficult to understand which metrics require remediation and which are simply noise. For example, if an […]

Automate your Multicloud operations with AWS Systems Manager and AWS Lambda

A multicloud strategy presents various challenges, including observing and managing applications and infrastructure across multiple cloud platforms. Maintaining consistent tooling for visualizing operational data and automating actions helps organizations address this challenge. Amazon CloudWatch and AWS Systems Manager are two services that provide unified monitoring, observability, and automation capabilities for workloads deployed on AWS, on-premises, […]

Developing an AWS Service Catalog self-managed engine for governance

AWS Service Catalog lets you centrally manage your cloud resources and govern your Infrastructure as Code (IaC) templates at scale. AWS Service Catalog supports AWS CloudFormation natively and allows customers to use other IaC tools, such as Terraform Community and Terraform Cloud, via the Service Catalog reference engine. We often hear customers asking how to […]

How to perform Failover and Failback using AWS Elastic Disaster Recovery (AWS DRS) between VMware and AWS environments

Enterprises face a variety of threats, such as natural disasters, cyber-attacks, and technology failures, that could severely disrupt operations. A comprehensive disaster recovery plan is crucial to respond to and recover from these events quickly. In this blog post, we’ll show how to plan and implement a comprehensive disaster recovery solution between your VMware on-premises environment […]

Introducing Parameter Store cross-account sharing

Earlier this year, AWS Systems Manager Parameter Store launched a feature that now allows you to share advanced parameters with other AWS accounts, enabling you to centrally manage your configuration data in a multi-account environment. Today, many customers have workloads in multiple AWS accounts that require shared, synchronized configuration data. Now, you can maintain a […]

Getting started with myApplications for Terraform-managed applications

AWS customers often operate hundreds of applications and have to monitor and manage individual resources to make sure their applications are available, secure, cost-optimized, and performing optimally. In this blog post, we will walk through how to use Terraform to create an application for use with myApplications, add resources to new and existing applications, and apply strategies for scaling application management using Terraform.

Assess secure Windows Servers for TCO analysis using Migration Evaluator

Summary: In this blog post, we explore an approach that leverages Windows operating system tools to extract critical metric data directly from Windows Servers. At Amazon Web Services (AWS), we offer the Migration Evaluator agentless collector and AWS Application Discovery Service to facilitate workload discovery. However, some customers run highly secure workloads where deploying assessment tools, enabling […]

Centralize observability with Amazon Managed Grafana Enterprise plugins

Observability is critical to maintaining the health and performance of any distributed system. Organizations rely on data from diverse sources, including AWS services as well as third-party independent software vendors (ISVs), to gain insights into their system’s health. Establishing secure connections to these diverse data sources enables visualization and analysis of observability data […]

Understanding AWS High Availability and Replication for vSphere Administrators

Introduction: vSphere HA is a fundamental and frequently used feature of vSphere. If any of several failure scenarios occurs, it restarts a virtual machine. The failure scenarios range from VM or host crashes to unresponsive hosts (for example, due to network isolation or outage). Translating vSphere High Availability (HA) to the public cloud can be […]