AWS Cloud Operations & Migrations Blog

How StormForge reduces complexity and ensures scalability with Amazon Managed Service for Prometheus

This blog post was co-written by Brent Eager, Senior Software Engineer, StormForge. StormForge is the creator of Optimize Live, a Kubernetes vertical rightsizing solution that is compatible with the Kubernetes HorizontalPodAutoscaler (HPA). Using cluster-based agents, machine learning, and Amazon Managed Service for Prometheus, Optimize Live is able to continuously calculate and apply optimal resource requests, […]

Create a data-driven Migration Business Case using AWS Cloud Value Framework

AWS customers realize more than a 5:1 ratio of benefits to investment costs over five years with breakeven on their investment occurring in an average of 10 months (source: “The Business Value of Amazon Web Services”, an IDC whitepaper). This blog aims to help Information Technology (IT) teams calculate this value using the tools needed […]

Using the unified CloudWatch Agent to send traces to AWS X-Ray

Today, applications are more distributed than ever before and they no longer run in isolation. This is especially the case when utilizing Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). A distributed workload or system is one that encompasses multiple small independent components, all working together to complete a task or job. […]

Unlock Faster Releases with AWS AppConfig: The Secret Weapon for Your CI/CD Strategy

Striking a Balance Between Reliability and Agility in Cloud Operations: The IT operations team of an enterprise serves as the first line of defense against potential business disruptions. The team operates 24/7, acts as a hub, and continuously monitors and manages the IT environment. The operations team handles and prioritizes critical IT incidents to minimize downtime and […]

How to monitor AWS WAF logging centrally using Amazon Managed Grafana

It is important for cloud security operations teams to maintain a high level of cloud security and detect and respond to malicious web activity in near real-time. AWS WAF helps protect web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources. However, as your cloud environment scales with […]

Distribute an Amazon Machine Image to another AWS Account using AWS Application Migration Service Post-launch automation

Many customers migrating their workloads to AWS using AWS Application Migration Service want to use different AWS accounts to support their company’s governance and security needs. Customers may also choose to use Infrastructure As Code (IaC) templates using AWS CloudFormation or Terraform with Application Migration Service to deploy source servers to different AWS Accounts. To […]

Unlocking Insights: Turning Application Logs into Actionable Metrics

Modern software development teams understand the importance of observability as a critical aspect of building reliable and resilient applications. By implementing observability practices, software teams can proactively identify issues, uncover performance bottlenecks, and enhance system reliability. However, observability is a fairly recent trend and still lacks industry-wide adoption. As organizations standardize on containers, they often […]

Track your workload’s risks with the new AWS Well-Architected Tool Connector for Jira

The AWS Well-Architected Framework is a collection of best practices that helps customers build and operate secure, high-performing, resilient, and cost-effective workloads on the AWS Cloud. With the AWS Well-Architected Tool (AWS WA Tool), you can review the state of your applications and workloads against architectural best practices, identify opportunities for improvement, and track progress […]

Autoscaling Kubernetes workloads with KEDA using Amazon Managed Service for Prometheus metrics

Introduction: With the rising popularity of applications hosted on Amazon Elastic Kubernetes Service (Amazon EKS), a key challenge is handling increases in traffic and load efficiently. Traditionally, you would have to manually scale out your applications by adding more instances – an approach that’s time-consuming, inefficient, and prone to over- or under-provisioning. A better […]

Automate incident reports from AWS Systems Manager Incident Manager

Effective incident management is essential for maintaining system reliability and ensuring quick responses to unexpected incidents. Incident Manager, a capability of AWS Systems Manager, helps to mitigate and recover from these incidents by enabling automated responses. In a previous blog post about Incident Manager, we talked about setting up escalation mechanisms, creating response plans and […]