AWS Machine Learning Blog
Category: Amazon Managed Service for Prometheus
Open source observability for AWS Inferentia nodes within Amazon EKS clusters
This post walks you through the Open Source Observability pattern for AWS Inferentia, which shows you how to monitor the performance of ML chips in an Amazon Elastic Kubernetes Service (Amazon EKS) cluster whose data plane nodes run on Amazon Elastic Compute Cloud (Amazon EC2) Inf1 and Inf2 instances.