Total results: 33
- Featured
-
Software Delivery & Operations
Resilience lessons from the lunch rush
Author: Mike Haken
Strategies for making systems more resilient and controlling excessive load.
-
Software Delivery and Operations
Level 300
My CI/CD pipeline is my release captain
Author: Clare Liguori
Learn how Amazon continuously releases changes to production safely using practices such as trunk-based development, immutable deployment artifacts, and proactive rollbacks.
-
Software Delivery & Operations
Level 300
Using dependency isolation to contain concurrency overload
Author: David Yanacek
Containing the impact of a failing dependency so that it affects only the relevant functionality in an application.
-
Architecture
Level 300
Minimizing correlated failures in distributed systems
Author: Joe Magerramov
How to continue operating even if some servers fail, while using relatively inexpensive commodity servers.
-
Architecture
Level 300
Reliability, constant work, and a good cup of coffee
Author: Colm MacCarthaigh
Simplifying systems to deliver stability by avoiding scaling during times of stress.
-
Architecture
Level 300
Making retries safe with idempotent APIs
Author: Malcolm Featonby
Strategies for using idempotent APIs to reduce complexity and manage retries.
-
Software Delivery & Operations
Level 200
Hands-off: Automating continuous delivery pipelines at Amazon
Author: Clare Liguori
In this session, learn about Amazon’s automated approach to continuous delivery that helps release code safely and quickly, with pipelines that enable developers to focus on building solutions rather than managing deployments.
-
Software Delivery & Operations
Level 200
Amazon's approach to production services monitoring
Author: David Yanacek
This session covers the full spectrum of monitoring at Amazon, from how teams assess system health at a high level to how they zoom in to understand the details of a single request. Also, learn how Amazon thinks about percentiles, dimensionality of metrics, dashboards, log analysis, and distributed tracing.
-
Architecture
Level 400
Fairness in multi-tenant systems
Author: David Yanacek
Building fairness into multi-tenant systems to provide predictable performance and availability.
-
Architecture
Level 300
Avoiding overload in distributed systems by putting the smaller service in control
Author: Joe Magerramov
Strategies for preventing the larger service from overloading the smaller one by putting the smaller service in control of the pace of interactions.
-
Software Delivery and Operations
Level 300
Building dashboards for operational visibility
Author: John O'Shea
Building dashboards to monitor, dive deep, audit, and review distributed services and automated systems.
-
Software Delivery and Operations
Level 300
Automating safe, hands-off deployments
Author: Clare Liguori
Strategies for continuously deploying to production while balancing safety and speed.
-
Software Delivery and Operations
-
Architecture
Level 400
Architecting and operating resilient serverless systems at scale
Author: David Yanacek
In this video, we cover what AWS does to build reliable and resilient services, including avoiding modes and overload, performing bounded work, throttling at multiple layers, guarding concurrency, sending idempotent requests, applying backpressure and fairness in queueing, and performing shuffle sharding.
-
Software Delivery and Operations
Level 400
Amazon's approach to high-availability deployment
Author: Peter Ramensky
In this video, learn the continuous-delivery practices we invented to help raise the bar and prevent costly deployment failures.
-
Architecture
Level 300
Amazon’s approach to security during development
Author: Colm MacCarthaigh
In this video, learn how AWS teams both minimize security risks in our products and respond to security issues proactively.
-
Software Delivery and Operations
-
Architecture
Level 400
Beyond five 9s: Lessons from our highest available data planes
Author: Colm MacCarthaigh
In this video, hear lessons from how AWS has built and architected Amazon Route 53 and the AWS authentication system, designed to survive cataclysmic failures, enormous load increases, and more.