Amazon Web Services

In this AWS re:Invent 2023 session, three AWS operational leaders share lessons on building resilient systems at scale. They cover five topics: dependencies and modes, blast radius, queues, errors, and retries. The speakers argue for thinking beyond traditional availability metrics and for shortening time to mitigation when unexpected issues occur. Using real-world examples from AWS services such as Route 53 and EC2, they show how seemingly small changes can have significant impact at scale. The session offers practical advice on implementing resilience strategies, including proper error classification, thoughtful retry mechanisms, and effective queue management. It is aimed at anyone who wants their systems to recover quickly from failures in large-scale environments.

cloud-trends-and-knowledge
skills-and-how-to
resilience
arch-strategy
mgmt-govern