Amazon Web Services
In this AWS re:Invent 2023 session, Hasan Tariq from AWS and Cody Rioux from Stripe explain how Stripe architects observability at massive scale. They discuss the challenges that grow with the business: infrastructure complexity, data volume, and cost. They then present five architectural changes that improve scalability, reliability, and cost-effectiveness: sharding, aggregation, tiered storage, streaming alerts, and isolation. The speakers also emphasize building a culture of self-reliance and making it easy for users to adopt effective observability practices. Stripe's approach uses AWS services such as Amazon Managed Service for Prometheus and Amazon Managed Grafana to operate at a scale of half a billion metrics every 10 seconds across 3,000 engineers.
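To put the quoted scale in perspective, the figures can be turned into a rough ingest rate. This is a back-of-the-envelope sketch, not something presented in the session: it assumes each of the half a billion metrics yields exactly one sample per 10-second interval, and the function names are hypothetical.

```python
# Figures quoted in the session summary.
METRICS_PER_INTERVAL = 500_000_000  # "half a billion metrics"
INTERVAL_SECONDS = 10               # "every 10 seconds"
ENGINEERS = 3_000                   # "3,000 engineers"

SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_second(metrics: int, interval_s: int) -> float:
    """Average ingest rate, assuming one sample per metric per interval."""
    return metrics / interval_s


def samples_per_day(metrics: int, interval_s: int) -> float:
    """Total samples ingested in 24 hours under the same assumption."""
    return samples_per_second(metrics, interval_s) * SECONDS_PER_DAY


def metrics_per_engineer(metrics: int, engineers: int) -> float:
    """Plain ratio of metrics to engineers; not a measured per-person figure."""
    return metrics / engineers


if __name__ == "__main__":
    rate = samples_per_second(METRICS_PER_INTERVAL, INTERVAL_SECONDS)
    daily = samples_per_day(METRICS_PER_INTERVAL, INTERVAL_SECONDS)
    ratio = metrics_per_engineer(METRICS_PER_INTERVAL, ENGINEERS)
    print(f"{rate:,.0f} samples per second")
    print(f"{daily:,.0f} samples per day")
    print(f"{ratio:,.0f} metrics per engineer")
```

Under that assumption the rate works out to about 50 million samples per second, which suggests why changes such as sharding and aggregation matter at this volume.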