How can I access Spark driver logs on an Amazon EMR cluster?

Last updated: 2020-04-13

I need to troubleshoot an Apache Spark application. How do I access Spark driver logs on an Amazon EMR cluster?

Short Description

On Amazon EMR, Spark runs as a YARN application and supports two deployment modes:

  • Client mode: The default deployment mode. In client mode, the Spark driver runs on the host where the spark-submit command is executed.
  • Cluster mode: The Spark driver runs in the application master. The application master is the first container that runs when the Spark job executes.

Resolution

Client mode jobs

When you submit a Spark application by running spark-submit with --deploy-mode client on the master node, the driver logs are displayed in the terminal window. Amazon EMR doesn't archive these logs by default. To capture the logs, save the output of the spark-submit command to a file. Example:

$ spark-submit [--deploy-mode client] ... 1>output.log 2>error.log

When you submit a Spark application using an Amazon EMR step, the driver logs are archived to the stderr.gz file on Amazon Simple Storage Service (Amazon S3). The file path looks like this:

s3://aws-logs-111111111111-us-east-1/elasticmapreduce/j-35PUYZBQVIJNM/steps/s-2M809TD67U2IA/stderr.gz
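The step log location follows the default EMR log URI layout, so you can assemble it from the account ID, Region, cluster ID, and step ID. Here's a minimal Python sketch (the function name and arguments are illustrative, not part of any AWS SDK):

```python
def step_stderr_path(account_id, region, cluster_id, step_id):
    """Build the S3 path of a step's archived stderr log, following the
    default EMR log layout:
    s3://aws-logs-<account>-<region>/elasticmapreduce/<cluster>/steps/<step>/stderr.gz
    """
    return (
        f"s3://aws-logs-{account_id}-{region}/elasticmapreduce/"
        f"{cluster_id}/steps/{step_id}/stderr.gz"
    )
```

For the example IDs above, `step_stderr_path("111111111111", "us-east-1", "j-35PUYZBQVIJNM", "s-2M809TD67U2IA")` returns the stderr.gz path shown.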

For more information, see View Log Files.

Download the step logs to an Amazon Elastic Compute Cloud (Amazon EC2) instance and then search for warnings and errors:

1.    Download the step logs:

aws s3 sync s3://aws-logs-111111111111-us-east-1/elasticmapreduce/j-35PUYZBQVIJNM/steps/s-2M809TD67U2IA/ s-2M809TD67U2IA/

2.    Open the step log folder:

cd s-2M809TD67U2IA/

3.    Uncompress the log file:

find . -type f -exec gunzip {} \;

4.    Get the YARN application ID from the client mode log:

grep "Client: Application report for" * | tail -n 1

5.    Find errors and warnings in the client mode log:

egrep "WARN|ERROR" *
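If you prefer to script the search, the decompress-and-grep part of the steps above (steps 3 and 5) can be sketched in Python. This is only a sketch, assuming the step logs are already downloaded to a local folder; the function name is illustrative:

```python
import gzip
import os
import re

def scan_step_logs(log_dir):
    """Walk log_dir, read every log file (decompressing .gz files on the fly),
    and return (path, line) pairs for lines containing WARN or ERROR --
    mirroring the gunzip + egrep steps above."""
    hits = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            opener = gzip.open if name.endswith(".gz") else open
            with opener(path, "rt", errors="replace") as f:
                for line in f:
                    if re.search(r"WARN|ERROR", line):
                        hits.append((path, line.rstrip()))
    return hits
```

For example, `scan_step_logs("s-2M809TD67U2IA/")` lists every warning and error line across the downloaded step logs, along with the file each line came from.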

You can also submit a Spark application through a tool such as JupyterHub, Apache Livy, or Apache Zeppelin. The tool then acts as the client that submits the Spark application to the cluster. In this scenario, the driver logs are stored in the corresponding application's logs, in the /mnt/var/log/ folder on the master node. You can also find the compressed logs at the following Amazon S3 path:

s3://awsexamplebucket/JOBFLOW_ID/node/MASTER_ID/applications/

For example, if you're using Zeppelin, you can find the Spark driver logs in /mnt/var/log/zeppelin/zeppelin-interpreter-spark-xxxxxxxxxx.log.

Note: For Jupyter, the driver logs are stored in the Livy logs: /mnt/var/log/livy/livy-livy-server.out.

For more information on accessing application-specific logs, see View Log Files.

Cluster mode jobs

When you submit a Spark application in cluster mode, the driver process runs in the application master container, which is the first container that runs when the Spark application executes. The client logs the YARN application report, which includes the application ID. To get the driver logs:

1.    Get the application ID from the client logs. In the following example, application_1572839353552_0008 is the application ID.

19/11/04 05:24:42 INFO Client: Application report for application_1572839353552_0008 (state: ACCEPTED)

2.    Identify the application master container logs. The following is an example list of Spark application logs. In this list, container_1572839353552_0008_01_000001 is the first container, which means that it's the application master container.

s3://aws-logs-111111111111-us-east-1/elasticmapreduce/j-35PUYZBQVIJNM/containers/application_1572839353552_0008/container_1572839353552_0008_01_000001/stderr.gz

s3://aws-logs-111111111111-us-east-1/elasticmapreduce/j-35PUYZBQVIJNM/containers/application_1572839353552_0008/container_1572839353552_0008_01_000001/stdout.gz

s3://aws-logs-111111111111-us-east-1/elasticmapreduce/j-35PUYZBQVIJNM/containers/application_1572839353552_0008/container_1572839353552_0008_01_000002/stderr.gz

s3://aws-logs-111111111111-us-east-1/elasticmapreduce/j-35PUYZBQVIJNM/containers/application_1572839353552_0008/container_1572839353552_0008_01_000002/stdout.gz

3.    Download the application master container logs to an EC2 instance:

aws s3 sync s3://aws-logs-111111111111-us-east-1/elasticmapreduce/j-35PUYZBQVIJNM/containers/application_1572839353552_0008/ application_1572839353552_0008/

4.    Open the Spark application log folder:

cd application_1572839353552_0008/

5.    Uncompress the log file:

find . -type f -exec gunzip {} \;

6.    Search all container logs for errors and warnings:

egrep -Ril "ERROR|WARN" . | xargs egrep "WARN|ERROR"

7.    Open the container logs that are returned in the output of the previous command.
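The log-parsing parts of the steps above (extracting the application ID and identifying the application master container's logs) can be sketched in Python. This is an illustrative sketch, not an AWS API; it relies only on the conventions shown above, namely that the application master is the first container, so its container ID ends in `_01_000001` for the first application attempt:

```python
import re

APP_REPORT = re.compile(r"Application report for (application_\d+_\d+)")

def application_id(client_log_lines):
    """Return the last application ID reported in the client log, mirroring:
    grep "Client: Application report for" * | tail -n 1"""
    app_id = None
    for line in client_log_lines:
        m = APP_REPORT.search(line)
        if m:
            app_id = m.group(1)
    return app_id

def master_container_logs(paths, app_id):
    """Filter a list of container log paths down to the application master's
    logs. The container ID is the application ID with the container_ prefix,
    and the application master is container _000001 of attempt _01."""
    prefix = app_id.replace("application_", "container_")
    return [p for p in paths if f"{prefix}_01_000001" in p]
```

For example, feeding the `Application report` line from step 1 into `application_id` returns `application_1572839353552_0008`, and `master_container_logs` keeps only the stderr.gz and stdout.gz paths under `container_1572839353552_0008_01_000001`.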

On a running cluster, you can use the YARN CLI to get the YARN application container logs. For a Spark application submitted in cluster mode, you can access the Spark driver logs by pulling the application master container logs like this:

# 1. Get the address of the node that the application master container ran on
$ yarn logs -applicationId application_1585844683621_0001 | grep 'Container: container_1585844683621_0001_01_000001'

20/04/02 19:15:09 INFO client.RMProxy: Connecting to ResourceManager at ip-xxx-xx-xx-xx.us-west-2.compute.internal/xxx.xx.xx.xx:8032
Container: container_1585844683621_0001_01_000001 on ip-xxx-xx-xx-xx.us-west-2.compute.internal_8041

# 2. Use the node address to pull the container logs
$ yarn logs -applicationId application_1585844683621_0001 -containerId container_1585844683621_0001_01_000001 -nodeAddress ip-xxx-xx-xx-xx.us-west-2.compute.internal
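The first command's output ends the container line with `host_port`, and the `-nodeAddress` option in the second command takes only the host portion. Extracting it can be sketched in Python (the hostname in the example is illustrative):

```python
import re

# Matches lines like:
# Container: container_..._01_000001 on ip-a-b-c-d.region.compute.internal_8041
CONTAINER_LINE = re.compile(r"Container: (container_\S+) on (\S+)_(\d+)$")

def node_address(line):
    """Return the node address (host without the trailing _port) from a
    'Container: ... on host_port' line of yarn logs output, or None."""
    m = CONTAINER_LINE.search(line)
    return m.group(2) if m else None
```

You can then pass the returned host to `yarn logs ... -nodeAddress` as in step 2 above.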
