Why does the YARN application still use resources after the Spark job that I ran on Amazon EMR is completed?
Last updated: 2021-06-24
I'm running a Jupyter or Zeppelin notebook on my Amazon EMR cluster. The YARN application continues to run even after the Apache Spark job that I submitted is completed.
When you run a Spark notebook in Zeppelin or Jupyter, Spark starts an interpreter. The interpreter creates a YARN application. This application is the Spark driver that shows up when you list applications. The driver doesn't terminate when you finish running a job from the notebook. By design, the Spark driver stays active so that it can request application containers for on-the-fly code runs. The downside is that the YARN application might be using resources that other jobs need. To resolve this issue, you can manually stop the YARN application. Alternatively, you can set a timeout value that automatically stops the application.
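To see which applications are holding resources before you stop anything, you can query the YARN ResourceManager REST API. The following is a minimal sketch, not a definitive tool. It assumes that the ResourceManager web service is reachable on port 8088 of the master node; the function names and the URL constant are illustrative.

```python
import json
from urllib.request import urlopen

# Assumption: the ResourceManager web service listens on port 8088 of the master node.
RM_APPS_URL = "http://localhost:8088/ws/v1/cluster/apps?states=RUNNING"


def summarize_running_apps(payload):
    """Return (id, name, allocated MB, allocated vCores) for each application in the payload."""
    # The "apps" value is null (None) when no application matches the query.
    apps = (payload.get("apps") or {}).get("app", [])
    return [
        (app["id"], app["name"], app.get("allocatedMB", 0), app.get("allocatedVCores", 0))
        for app in apps
    ]


def fetch_running_apps(url=RM_APPS_URL):
    """Query the ResourceManager and summarize the running applications."""
    with urlopen(url) as response:
        return summarize_running_apps(json.load(response))
```

Calling fetch_running_apps() on the master node returns one tuple per running application. A notebook driver that still holds memory and vCores after your job is completed appears in this list.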
Zeppelin
Option 1: Restart the Spark interpreter
Before you begin, be sure that you have permissions to restart the interpreter in Zeppelin.
1. Open Zeppelin.
2. From the dropdown list next to the user name, choose Interpreter.
3. Find the Spark interpreter, and then choose restart. Zeppelin terminates the YARN job when the interpreter restarts.
Option 2: Stop the YARN job manually
Before you begin, be sure of the following:
- You have SSH access to the Amazon EMR cluster.
- You have the permission to run YARN commands.
Use the -kill option of the yarn application command to terminate the application. To find the application ID, run yarn application -list. In the following example, replace application_id with your application ID.
yarn application -kill application_id
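If you need to stop several applications, the lookup and the kill can be scripted. The following is a rough sketch under stated assumptions: it parses the text output of yarn application -list, treats any row whose first field starts with application_ as a data row, and matches on a name substring that you supply. The function names and the name_filter parameter are hypothetical.

```python
import subprocess


def parse_application_ids(listing, name_filter=""):
    """Extract application IDs from `yarn application -list` output."""
    ids = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with the application ID; header and log lines don't.
        if fields and fields[0].startswith("application_") and name_filter in line:
            ids.append(fields[0])
    return ids


def list_running_applications():
    """Run `yarn application -list` and return its text output."""
    result = subprocess.run(
        ["yarn", "application", "-list", "-appStates", "RUNNING"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def kill_applications(app_ids):
    """Run `yarn application -kill` for each application ID."""
    for app_id in app_ids:
        subprocess.run(["yarn", "application", "-kill", app_id], check=True)
```

For example, kill_applications(parse_application_ids(list_running_applications(), "Zeppelin")) stops every running application whose listing row contains the text Zeppelin. Review the parsed IDs before you pass them to kill_applications, because the substring match can select more applications than you intend.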
Option 3: Set an interpreter timeout value
Zeppelin versions 0.8.0 and later (available in Amazon EMR versions 5.18.0 and later) include a lifecycle manager for interpreters. Use the TimeoutLifecycleManager setting to terminate interpreters after a specified idle timeout period:
1. Create an /etc/zeppelin/conf/zeppelin-site.xml file with the following content. In this example, the timeout period is set to 120,000 milliseconds (2 minutes), and Zeppelin checks for idle interpreters every 60,000 milliseconds (1 minute). Choose a timeout value that's appropriate for your environment.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>zeppelin.interpreter.lifecyclemanager.class</name>
    <value>org.apache.zeppelin.interpreter.lifecycle.TimeoutLifecycleManager</value>
    <description>This is the LifecycleManager class for managing the lifecycle of interpreters. The interpreter terminates after the idle timeout period.</description>
  </property>
  <property>
    <name>zeppelin.interpreter.lifecyclemanager.timeout.checkinterval</name>
    <value>60000</value>
    <description>The interval for checking whether the interpreter has timed out, in milliseconds.</description>
  </property>
  <property>
    <name>zeppelin.interpreter.lifecyclemanager.timeout.threshold</name>
    <value>120000</value>
    <description>The idle timeout limit, in milliseconds.</description>
  </property>
</configuration>
2. Run the following commands to restart Zeppelin:
sudo stop zeppelin
sudo start zeppelin
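Because both timeout properties are expressed in milliseconds, it's easy to be off by a factor of 1,000. The following sketch generates the configuration from a timeout in minutes and a check interval in seconds. It's one possible approach, not part of Zeppelin; the function name and parameters are illustrative, and the output omits the optional description elements and the stylesheet line.

```python
import xml.etree.ElementTree as ET

LIFECYCLE_CLASS = "org.apache.zeppelin.interpreter.lifecycle.TimeoutLifecycleManager"


def build_zeppelin_site(idle_minutes, check_seconds=60):
    """Return zeppelin-site.xml content that enables the idle timeout."""
    # Both timeout properties take milliseconds.
    settings = {
        "zeppelin.interpreter.lifecyclemanager.class": LIFECYCLE_CLASS,
        "zeppelin.interpreter.lifecyclemanager.timeout.checkinterval": str(check_seconds * 1000),
        "zeppelin.interpreter.lifecyclemanager.timeout.threshold": str(idle_minutes * 60 * 1000),
    }
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")
```

For example, build_zeppelin_site(2) produces the same threshold (120000) and check interval (60000) as the file shown in step 1. If the cluster already has a zeppelin-site.xml file with other properties, merge these three properties into it instead of overwriting the file.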
Jupyter
Option 1: Manually shut down the notebook
After the job is completed, use one of the following methods to stop the kernel in the Jupyter user interface:
- In the Jupyter notebook interface, open the File menu, and then choose Close and Halt.
- On the Jupyter dashboard, open the Running tab. Choose Shutdown for the notebook that you want to stop.
Option 2: Manually shut down the kernel
From the Jupyter notebook interface, open the Kernel menu, and then choose Shutdown.
Option 3: Configure the timeout attribute
If you close the notebook tab or browser window before shutting down the kernel, the YARN job continues to run. To prevent this from happening, configure the NotebookApp.shutdown_no_activity_timeout attribute. This attribute terminates the YARN job after a specified idle timeout period, even if you close the tab or browser window.
Do the following to configure the NotebookApp.shutdown_no_activity_timeout attribute:
1. Open the /etc/jupyter/jupyter_notebook_config.py file on the master node, and then add an entry similar to the following. In this example, the timeout attribute is set to 120 seconds. Choose a timeout value that's appropriate for your environment.
c.NotebookApp.shutdown_no_activity_timeout = 120
2. Run the following commands to restart the JupyterHub container:
sudo docker stop jupyterhub
sudo docker start jupyterhub
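If you apply this setting from a script, for example during cluster setup, it helps to make the edit repeatable so that rerunning the script doesn't add duplicate entries. The following is a minimal sketch that assumes the configuration file is plain text with one setting per line. The function name is hypothetical, and the sketch only transforms text; reading and writing the file is left to you.

```python
import re

SETTING = "c.NotebookApp.shutdown_no_activity_timeout"


def set_idle_timeout(config_text, seconds):
    """Return config_text with the timeout entry set, replacing any existing entry."""
    entry = f"{SETTING} = {seconds}"
    pattern = re.compile(rf"^{re.escape(SETTING)}\s*=.*$", re.MULTILINE)
    if pattern.search(config_text):
        # An entry already exists, so update it in place.
        return pattern.sub(entry, config_text)
    # Otherwise append the entry, adding a newline first if the file lacks one.
    separator = "" if config_text.endswith("\n") or not config_text else "\n"
    return f"{config_text}{separator}{entry}\n"
```

Pass the current content of jupyter_notebook_config.py to set_idle_timeout, write the result back to the file, and then restart the container as described in step 2.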