AWS Cloud Operations & Migrations Blog

Category: Learning Levels

Reversing Technical Debt with Cloud

This blog post covers best practices to manage and reverse technical debt by prudently leveraging and operating cloud services. Technical debt is a metaphor coined by Ward Cunningham, to deal with the cost of making tradeoffs in software development to meet near-term business needs. In the case of financial debt, you take a loan to […]

Delegate AWS Organizations policy management in a multi-account environment

AWS Organizations helps you centrally manage and govern multiple AWS accounts within AWS. You can manage organization structure, add and remove accounts, define configuration using policies, handle consolidated billing, and control multi-account features of integrated AWS services. As your environment grows, your administrators have to manage more accounts and policies which often requires coordination between […]

Using Amazon CloudWatch metrics to monitor time to expiration for Reserved Instances | Amazon Web Services

This post shows you how to monitor the days remaining for Amazon EC2 Reserved Instances. The solution uses a custom Amazon CloudWatch metric published via an AWS Lambda function. It creates a CloudWatch alarm and an Amazon Simple Notification Service (Amazon SNS) topic for notification when the alarm exceeds the user-defined threshold. CloudWatch allows you […]

Automate AWS Config reporting for noncompliant resources that have been non-compliant for a period of time

AWS Config evaluates the configuration settings of your AWS resources. You do this by creating AWS Config rules, which represent your ideal configuration settings. AWS Config provides customizable, predefined rules called AWS Managed Rules to help you get started. While AWS Config continuously tracks the configuration changes that occur among your resources, it checks whether […]

How Capgemini used AWS Systems Manager and AWS cloud native observability to provide self-service monitoring

This post was written in collaboration with David Wansell, an Enterprise Cloud Architect at Capgemini with over 20 years of experience across multiple enterprise domains. He designs and builds automation and solutions that enable customers to deliver on their desired outcomes in their cloud adoption journey. Customers need a way to automatically create alarms that […]

How Capgemini used AWS Systems Manager and AWS cloud native observability to provide self-service logging and analytics

This post was written in collaboration with David Wansell, an Enterprise Cloud Architect at Capgemini with over 20 years of experience across multiple enterprise domains. He designs and builds automation and solutions that enable customers to deliver on their desired outcomes in their cloud adoption journey. Log analysis helps customers to manage infrastructure and applications […]

Using AWS Cost Explorer and cost allocation tags to view Amazon S3 costs by bucket

AWS customers often have many users and groups within their organization utilizing Amazon Simple Storage Service (Amazon S3) buckets. In addition, customers often need a way to accurately understand the costs on a per-bucket basis for cost observability and charge back mechanisms. This is also important if a customer is entering the AWS Migration Acceleration […]

Monitoring Amazon RDS and Amazon Aurora using Amazon Managed Grafana

Organizations running critical applications on AWS using fully managed database services such as Amazon Relational Database Service (Amazon RDS) and Amazon Aurora rely on robust monitoring to ensure that their databases are performant, and cause no service disruptions to their customers. Amazon Managed Grafana is a fully managed and secure data visualization service that you […]

How to Automate Incident Response with PagerDuty and AWS Systems Manager Incident Manager

Incident response is a core operations capability for organizations to develop, and a core element in the AWS Cloud Adoption Framework (AWS CAF). Responding to operations incidents quickly is important to minimize their impacts. Automating incident response helps you scale your capabilities, rapidly reduce the recovery time, and reduce repetitive work by your cloud operations teams. […]

Build Cloud Operations Skills Using the New Getting Started with AWS Audit Manager Training

Are you responsible for your organization’s compliance? Do you want to simplify and automate audit activities? Do you want to make sure your organization is compliant with internal control frameworks and industry standards? If you need to simplify your risk and compliance assessments while automating evidence collection in your AWS cloud environment, then getting started […]