AWS Data Pipeline Monitoring and Debugging Features

Posted on: Aug 14, 2014

We are pleased to announce several new monitoring and debugging features for AWS Data Pipeline, Amazon's managed ETL service.

First, we introduced a new concept called 'health status' that makes it easier to monitor your pipelines. A component is HEALTHY if it finished successfully the last time it ran; if it failed or timed out, its health status is ERROR. If every component in a pipeline is HEALTHY, the pipeline itself is HEALTHY; otherwise, the pipeline's health status is ERROR. You can view health status in the console, the CLI, and the APIs. To learn more about health status, see the AWS Data Pipeline documentation.
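The roll-up rule described above can be illustrated with a minimal Python sketch. It is not the service's implementation or API: the outcome labels ("FINISHED", "FAILED", "TIMEDOUT") and the function names are hypothetical, chosen only to show how per-component health combines into pipeline health.

```python
from enum import Enum


class Health(Enum):
    HEALTHY = "HEALTHY"
    ERROR = "ERROR"


# Hypothetical labels for how a component's most recent run ended.
# The announcement only says "finished successfully", "failed", or "timed out".
SUCCESS_OUTCOMES = {"FINISHED"}
FAILURE_OUTCOMES = {"FAILED", "TIMEDOUT"}


def component_health(last_outcome: str) -> Health:
    """Health of one component, based only on its most recent run."""
    if last_outcome in SUCCESS_OUTCOMES:
        return Health.HEALTHY
    if last_outcome in FAILURE_OUTCOMES:
        return Health.ERROR
    # The announcement does not cover other cases, so this sketch refuses to guess.
    raise ValueError(f"unknown outcome: {last_outcome!r}")


def pipeline_health(last_outcomes: dict[str, str]) -> Health:
    """A pipeline is HEALTHY only if every component is HEALTHY."""
    healths = [component_health(outcome) for outcome in last_outcomes.values()]
    if all(h is Health.HEALTHY for h in healths):
        return Health.HEALTHY
    return Health.ERROR


if __name__ == "__main__":
    # Hypothetical component names, for illustration only.
    outcomes = {"CopyActivity": "FINISHED", "EmrStep": "TIMEDOUT"}
    print(pipeline_health(outcomes).value)
```

In this sketch a single failed or timed-out component is enough to mark the whole pipeline as ERROR, which mirrors the rule stated above. A pipeline with no components would come out as HEALTHY here; that is a choice of the sketch, not something the announcement specifies.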

Second, we made it easier to access logs for debugging. You can now specify a single S3 root directory to hold all logs related to the pipeline, including individual activity attempt logs. You can view these logs in the console on the new Execution Details page, or access them directly in S3, where they follow a new standardized directory structure. To learn more about logging, see the AWS Data Pipeline documentation.

Third, we made it easier to view dependencies. In the console, you can now see which components your pipeline is waiting on and quickly drill down to the root cause of a delay or failure.

We hope you find these features useful. Please submit feedback in the console to let us know what you think.