How can I view the Apache Tez web user interface on a terminated Amazon EMR cluster?

You can't access the Tez web UI after a cluster is terminated. However, before you terminate the cluster, you can copy its YARN timeline log files to another cluster that you don't plan to terminate, and then view the UI on that cluster.

Note: This resolution applies to Amazon EMR 5.x release versions.

Note: You can't use this method to view logs that are older than the retention period set by yarn.timeline-service.ttl-ms. By default, this property is set to 604800000 milliseconds (seven days).

Step 1: Collect logs from source cluster before termination

YARN timeline logs use the LevelDB storage library. They're stored in the following path on the master node of the Amazon EMR cluster. The location is controlled by the yarn.timeline-service.leveldb-timeline-store.path property in the yarn-site.xml file on the master node; the timeline server keeps its store in the leveldb-timeline-store.ldb directory under that path. For more information, see Configuring Tez.

/mnt/var/lib/hadoop/tmp/yarn/timeline/leveldb-timeline-store.ldb/
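
To confirm the location on a particular cluster, you can read the property from yarn-site.xml. The following is a minimal sketch, not an official tool: the helper name get_hadoop_property and the /etc/hadoop/conf/yarn-site.xml location are assumptions, and the parser handles only plain <name> and <value> elements. If the property isn't set in the file, nothing is printed and the Hadoop default applies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the <value> of a named property
# from a Hadoop-style XML configuration file.
get_hadoop_property() {
  local name="$1" file="$2"
  awk -v name="$name" '
    /<name>/ {
      n = $0
      sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n)
      found = (n == name)      # remember whether this is the property we want
    }
    found && /<value>/ {
      v = $0
      sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
      print v
      exit
    }
  ' "$file"
}

# Assumed location of yarn-site.xml on the master node; adjust if yours differs.
YARN_SITE="${YARN_SITE:-/etc/hadoop/conf/yarn-site.xml}"
if [ -r "$YARN_SITE" ]; then
  # In stock Hadoop, the leveldb-timeline-store.ldb directory is created under
  # the configured path, so the printed value may be the parent directory.
  get_hadoop_property yarn.timeline-service.leveldb-timeline-store.path "$YARN_SITE"
fi
```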

1.    Run a command similar to the following to list the files:

$ sudo ls -al /mnt/var/lib/hadoop/tmp/yarn/timeline/leveldb-timeline-store.ldb/
total 1032
drwx------ 2 yarn yarn     80 Dec  9 10:51 .
drwxr-xr-x 3 yarn yarn     39 Dec  9 10:51 ..
-rw-r--r-- 1 yarn yarn 983040 Dec  9 11:28 000003.log
-rw-r--r-- 1 yarn yarn     16 Dec  9 11:28 CURRENT
-rw-r--r-- 1 yarn yarn      0 Dec  9 11:28 LOCK
-rw-r--r-- 1 yarn yarn     57 Dec  9 11:28 LOG
-rw-r--r-- 1 yarn yarn  65536 Dec  9 11:28 MANIFEST-000002

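2.    Zip the files from the previous step, as shown in the following example. The zip command stores the path without the leading slash, which is why the copy command in Step 2 uses a relative source path.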

$ sudo zip -r emrleveldb.zip /mnt/var/lib/hadoop/tmp/yarn/timeline/leveldb-timeline-store.ldb/

3.    Upload the zipped file to an Amazon Simple Storage Service (Amazon S3) bucket. Example:

$ aws s3 cp emrleveldb.zip s3://<bucket-name>/path/to/upload/
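
Before you zip and upload the directory, it can help to confirm that it actually contains a LevelDB store. The following sketch checks for the CURRENT file and a MANIFEST-* file, both of which appear in the listing above. The function name looks_like_leveldb is hypothetical, and the check is a sanity test only; it doesn't validate the contents of the store.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the directory looks like a LevelDB store,
# that is, it contains a CURRENT file and at least one MANIFEST-* file.
looks_like_leveldb() {
  local dir="$1" manifest
  [ -f "$dir/CURRENT" ] || return 1
  for manifest in "$dir"/MANIFEST-*; do
    # If the glob matches nothing, $manifest is the literal pattern and -f fails.
    [ -f "$manifest" ] && return 0
  done
  return 1
}

STORE="/mnt/var/lib/hadoop/tmp/yarn/timeline/leveldb-timeline-store.ldb"
if looks_like_leveldb "$STORE"; then
  echo "LevelDB store found: $STORE"
else
  echo "No readable LevelDB store at: $STORE" >&2
fi
```

Because the directory is readable only by the yarn user, run the script as root (for example, with sudo bash) so that the check can see the files.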

Step 2: Deploy logs to a new Amazon EMR cluster

1.    On the master node of the second cluster, stop hadoop-yarn-timelineserver, as shown in the following example. For more information, see How do I restart a service in Amazon EMR?

$ sudo initctl stop hadoop-yarn-timelineserver

2.    Download the zipped file from Amazon S3 to the master node of the second cluster, as shown in the following example:

$ aws s3 cp s3://<bucket-name>/path/to/upload/emrleveldb.zip .

3.    Run commands similar to the following to unzip the archive and copy the logs from the first cluster into the timeline store directory on the second cluster, which you don't plan to terminate. You'll use the second cluster to view the archived Tez web UI after you terminate the first cluster.

$ unzip emrleveldb.zip
$ sudo cp -r mnt/var/lib/hadoop/tmp/yarn/timeline/leveldb-timeline-store.ldb/* /mnt/var/lib/hadoop/tmp/yarn/timeline/leveldb-timeline-store.ldb/

4.    Start hadoop-yarn-timelineserver, as shown in the following example:

$ sudo initctl start hadoop-yarn-timelineserver

5.    Verify that the Tez web UI on the second cluster shows the applications from the first cluster.
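
If the UI doesn't show the expected applications, two quick checks from the command line might help narrow things down. The following sketch is illustrative only. All three function names are hypothetical. Port 8188 is the stock Hadoop default for the YARN Timeline Server web address and is assumed here. The ownership check reflects the fact that files newly created by sudo cp belong to root, which may or may not matter on your cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list entries under a directory that are NOT owned
# by the given user. Usage: files_not_owned_by <user> <directory>
files_not_owned_by() {
  find "$2" ! -user "$1"
}

# Hypothetical helper: URL that lists Tez DAG entities through the
# YARN Timeline Server v1 REST API. Host and port default to localhost:8188
# (an assumption; check yarn.timeline-service.webapp.address on your cluster).
timeline_dag_url() {
  echo "http://${1:-localhost}:${2:-8188}/ws/v1/timeline/TEZ_DAG_ID?limit=5"
}

# Hypothetical helper: fetch the DAG list. The exit status is non-zero
# if the server can't be reached or returns an HTTP error.
check_timeline() {
  curl --silent --fail "$(timeline_dag_url "$@")"
}
```

On the master node of the second cluster, running files_not_owned_by yarn /mnt/var/lib/hadoop/tmp/yarn/timeline/leveldb-timeline-store.ldb as root prints nothing when every file belongs to yarn. If files are listed and the timeline server can't open the store, changing their owner to yarn is one possible fix. Running check_timeline should then return JSON that includes DAG entities from the first cluster.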

Troubleshooting deleted logs

By default, logs older than seven days are deleted when you start the YARN Timeline Service on the second cluster. When this happens, the logs for yarn-timelineserver contain entries like this:  

Starting deletion thread with ttl 604800000 and cycle interval 300000
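
To see which retention period the timeline server actually started with, you can extract the ttl value from that log entry and convert it to days. The following is a minimal sketch; the function name ttl_days_from_log is hypothetical, and it reads log lines from standard input, so you need to supply the log file yourself (on Amazon EMR it is typically somewhere under /var/log/hadoop-yarn/, which is an assumption).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read log lines on stdin and print the TTL, in whole days,
# from the first "Starting deletion thread with ttl <ms>" entry.
ttl_days_from_log() {
  local ms
  ms="$(sed -n 's/.*Starting deletion thread with ttl \([0-9][0-9]*\).*/\1/p' | head -n 1)"
  [ -n "$ms" ] || return 1          # no matching entry found
  echo $(( ms / 86400000 ))         # 86400000 ms = 1 day
}

echo "Starting deletion thread with ttl 604800000 and cycle interval 300000" | ttl_days_from_log
# prints 7
```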

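To resolve this problem, set yarn.timeline-service.ttl-ms in the second cluster's yarn-site configuration to a value that's appropriate for your use case. Set the value before you start the YARN Timeline Service with the copied logs so that they aren't deleted on startup. The following example configuration object sets the retention period to 1209600000 milliseconds (14 days).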

[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.timeline-service.ttl-ms": "1209600000"
    }
  }
]
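
To produce the same configuration object for a different retention period, you can compute the millisecond value from a number of days. The following is a sketch under stated assumptions: the function name timeline_ttl_config is hypothetical, and the output mirrors the example above with only the value changed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a yarn-site configuration object that keeps
# timeline data for the given number of days.
timeline_ttl_config() {
  local days="$1"
  local ms=$(( days * 24 * 60 * 60 * 1000 ))   # days -> milliseconds
  cat <<EOF
[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.timeline-service.ttl-ms": "${ms}"
    }
  }
]
EOF
}

timeline_ttl_config 14
```

You can redirect the output to a file and supply it when you create the second cluster, for example with the --configurations option of aws emr create-cluster.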


Published: 2018-12-10