Why am I not able to see the Amazon CloudWatch metrics for my AWS Glue ETL job even after I enabled job metrics?

Last updated: 2021-08-20

I've enabled the option to create job metrics for my AWS Glue extract, transform, and load (ETL) job. However, I can't see the job metrics in Amazon CloudWatch.

Short description

AWS Glue sends metrics to CloudWatch every 30 seconds, and the CloudWatch console dashboard is configured to display them every minute. Each AWS Glue metric is a delta from the previously reported value, so the metrics dashboard aggregates the 30-second values to obtain a value for the last minute. Job metrics are enabled when a GlueContext is initialized in the job script. The metrics are updated only at the end of an Apache Spark task, and they represent the aggregate values across all completed Spark tasks.
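
To illustrate how 30-second delta values roll up into one value per minute, here is a minimal sketch. It assumes the aggregation is a plain sum, which fits counter-style metrics, and it isn't how CloudWatch is implemented. The function name and the sample values are hypothetical.

```python
from collections import defaultdict


def aggregate_per_minute(samples):
    """Sum (epoch_seconds, delta) samples into one total per minute."""
    totals = defaultdict(int)
    for timestamp, delta in samples:
        # Align each sample to the start of the minute it belongs to
        minute_start = timestamp - (timestamp % 60)
        totals[minute_start] += delta
    return dict(sorted(totals.items()))


# Hypothetical deltas reported every 30 seconds (for example, bytes read)
samples = [(0, 100), (30, 150), (60, 50), (90, 0)]
print(aggregate_per_minute(samples))  # {0: 250, 60: 50}
```

A job that ends before its first 30-second report contributes no samples at all, so there is nothing for the dashboard to aggregate.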

Resolution

Increase the run time of your AWS Glue job: AWS Glue uses the metrics from Spark, and Spark uses the Dropwizard metrics library to publish them. Because the metrics are reported to CloudWatch every 30 seconds, a job that runs for less than 30 seconds finishes before any metrics are sent. To get the AWS Glue metrics, your job must run for at least 30 seconds. Updating your job to process more data can increase its run time. As a temporary workaround to see the job metrics, you can extend the run time by adding a sleep call, such as time.sleep() in Python, at the start or end of your job script, depending on your use case.

Important: Using the time.sleep() function is not a coding best practice.

For Python:

import time
time.sleep(30)

For Scala:

Thread.sleep(30000) // Thread.sleep takes milliseconds; 30000 ms = 30 seconds
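
A fixed sleep always adds the full delay, even when the job already ran long enough. One possible variation, sketched here in Python, sleeps only for the time still needed to reach the 30-second mark. The helper names seconds_to_pad and pad_runtime are hypothetical, and the same caveat about sleep calls applies.

```python
import time

MIN_RUNTIME_SECONDS = 30  # reporting interval stated in this article


def seconds_to_pad(started_at, now, minimum=MIN_RUNTIME_SECONDS):
    """Return how many seconds are still needed to reach the minimum run time."""
    return max(0.0, minimum - (now - started_at))


def pad_runtime(started_at, sleep=time.sleep, clock=time.monotonic):
    """Sleep only if the job has run for less than the minimum run time."""
    remaining = seconds_to_pad(started_at, clock())
    if remaining > 0:
        sleep(remaining)
    return remaining
```

In a job script, you might record job_start = time.monotonic() at the top and call pad_runtime(job_start) just before the script ends.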

Be sure that the job completed the Spark tasks: Job metrics are reported only after the Spark tasks are complete. Therefore, confirm that the Spark tasks for your job completed and that the job didn't fail.

Be sure that GlueContext is initialized in the job script: The GlueContext class in your job script enables writing metrics into CloudWatch. If you're using a custom script that uses only a DataFrame and not a DynamicFrame, the GlueContext class might not be initialized. This might result in the metrics not getting written to CloudWatch. If you're using a custom script, be sure to update your job to initialize the GlueContext class.
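
For a script that works only with DataFrames, a minimal sketch of initializing GlueContext might look like the following. The function name create_glue_context is hypothetical. The imports sit inside the function only because the awsglue and pyspark modules are assumed to be available in the AWS Glue runtime and not necessarily elsewhere.

```python
def create_glue_context():
    """Return a (GlueContext, SparkSession) pair for a DataFrame-only script."""
    # Imported here because these modules are provided by the AWS Glue runtime
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    spark_context = SparkContext.getOrCreate()
    # Creating the GlueContext is the step this article says enables the metrics
    glue_context = GlueContext(spark_context)
    return glue_context, glue_context.spark_session
```

The script can call glue_context, spark = create_glue_context() once at startup and keep using spark for its existing DataFrame code.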

Be sure that the AWS Glue IAM role has the required permission: Check and confirm that the IAM role attached to the ETL job has the cloudwatch:PutMetricData permission to create metrics in CloudWatch. If you're using a custom role, then be sure that the role has the permission to write the job metrics into CloudWatch.
Note: It's a best practice to use the AWS managed policy AWSGlueServiceRole to manage permissions.
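
When reviewing a custom role, a quick first check is whether any policy statement allows the required action. The sketch below is deliberately simplified: it looks only at Allow statements and their Action patterns, and it ignores Deny statements, NotAction, conditions, and permissions boundaries, so it doesn't replace a real IAM evaluation. The function names and the sample policy are hypothetical.

```python
import fnmatch

REQUIRED_ACTION = "cloudwatch:PutMetricData"


def _as_list(value):
    """IAM allows a single string or a list for Statement and Action."""
    return value if isinstance(value, list) else [value]


def allows_put_metric_data(policy):
    """Return True if any Allow statement's Action pattern matches the action."""
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        for pattern in _as_list(statement.get("Action", [])):
            # Action names are case-insensitive and may contain * or ? wildcards
            if fnmatch.fnmatchcase(REQUIRED_ACTION.lower(), pattern.lower()):
                return True
    return False


# Hypothetical custom policy that lacks the CloudWatch permission
custom_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject", "s3:PutObject"], "Resource": "*"},
    ],
}
print(allows_put_metric_data(custom_policy))  # False

# Adding this statement grants the permission named in this article
custom_policy["Statement"].append(
    {"Effect": "Allow", "Action": "cloudwatch:PutMetricData", "Resource": "*"}
)
print(allows_put_metric_data(custom_policy))  # True
```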

