AWS Cloud Operations Blog

Tag: Resilience

Introducing AWS Fault Injection Service Actions to Inject Chaos in Lambda functions

Usage of serverless technology in regulated industries like financial services is growing. This growth demands robust resilience validation. Chaos engineering for serverless has become crucial for ensuring reliable and available serverless applications. By purposefully injecting failures and stresses into serverless components, teams can uncover hidden weaknesses and validate the fault tolerance of their systems. Previously, […]

Strengthen application resilience with myApplications and AWS Resilience Hub

Today, organizations prioritize managing their applications over infrastructure, focusing on business outcomes while leveraging automation and cloud services to handle the underlying infrastructure. They seek to consolidate key application metrics like health, security, cost, and performance from AWS services such as AWS Security Hub or Amazon CloudWatch. These organizations also need to ensure their […]

Bootstrap your chaos engineering journey with AWS Fault Injection Service Scenarios Library

Ensuring the reliability and resilience of applications is crucial for maintaining business continuity, delivering a superior customer experience, and staying compliant with industry regulations. As defined in the AWS Well-Architected Framework Reliability Pillar, testing reliability plays an important role in ensuring reliability. Chaos engineering is a powerful way to not only test how your systems […]

How to perform Failover and Failback using AWS Elastic Disaster Recovery (AWS DRS) between VMware and AWS environments

Enterprises face a variety of threats such as natural disasters, cyber-attacks and technology failures that could severely disrupt operations. A comprehensive disaster recovery plan is crucial to quickly respond and recover from these events. In this blog post, we’ll show how to plan and implement a comprehensive disaster recovery solution between your VMware on-premises environment […]

Using Permissions to Unlock Resilience with AWS Resilience Hub

AWS customers come to AWS Resilience Hub for the ability to assess their application against their Recovery Time Objectives (RTO), the maximum acceptable time an application can be in a disrupted state, and Recovery Point Objectives (RPO), the maximum amount of data that can be lost due to disruption. Although customers come for the assessment […]

Resiliency Journey : exploring how AWS Resilience Hub and Migration Acceleration Program come together

In today’s rapidly evolving digital landscape, the cloud has become the backbone of innovation, scalability, and efficiency for businesses worldwide. As customers embark on their cloud migration journeys, whether the migration has been motivated by the intention of accelerating innovation, reducing operational and infrastructure costs, or exiting your on-prem datacenter, migrating to the cloud presents […]

Leverage AWS Resilience Lifecycle Framework to assess and improve the resilience of application using AWS Resilience Hub

As more customers advance in their cloud adoption journey, they recognize that simply migrating applications to the cloud does not automatically ensure resilience. To ensure resilience, applications need to be designed to withstand disruptions from infrastructure, dependent services, misconfiguration and intermittent network connectivity issues. While many organizations understand the importance of building resilient applications, some […]

Using the Fault Tolerance Analyser Tool to Identify Potential Issues

Ensuring resilience, the ability for a system to recover from a failure induced by load, attacks, and other issues, is a shared responsibility that underpins the reliability of your workloads. While AWS provides the resilient underlying cloud infrastructure, customers are tasked with maintaining the resilience of their applications. In this landscape of joint responsibility, […]

Validating and Improving the RTO and RPO Using AWS Resilience Hub

“Everything fails, all the time” is a famous quote from Werner Vogels, VP and CTO of Amazon.com. When you design and build an application, a typical goal is to have it working; the next is to keep it running, no matter what disruptions may occur. It is crucial to achieve resiliency, but you need to consider […]