How can I prevent a Hadoop or Spark job's user cache from consuming too much disk space in Amazon EMR?

Last updated: 2022-01-31

The user cache for my Apache Hadoop or Apache Spark job is taking up all the disk space on the partition. The Amazon EMR job is failing or the HDFS NameNode service is in safe mode.

Short description

On an Amazon EMR cluster, YARN is configured to allow jobs to write cache data to /mnt/yarn/usercache. When you process a large amount of data or run multiple concurrent jobs, the /mnt file system can fill up. This causes node manager failures on some nodes, which then causes the job to freeze or fail.

Use one of the following methods to resolve this problem:

  • Adjust the user cache retention settings for YARN NodeManager. Choose this option if you don't have long-running jobs or streaming jobs.
  • Scale up the Amazon Elastic Block Store (Amazon EBS) volumes. Choose this option if you have long-running jobs or streaming jobs.


Option 1: Adjust the user cache retention settings for NodeManager

The following attributes define the cache cleanup settings:

  • yarn.nodemanager.localizer.cache.cleanup.interval-ms: This is the cache cleanup interval. The default value is 600,000 milliseconds. After this interval—and if the cache size exceeds the value set in—files that aren't in use by running containers are deleted.
  • This is the maximum disk space allowed for the cache. The default value is 10,240 MB. When the cache disk size exceeds this value, files that aren't in use by running containers are deleted on the interval set in yarn.nodemanager.localizer.cache.cleanup.interval-ms.

To set the cleanup interval and maximum disk space size on a running cluster:

1.    Open /etc/hadoop/conf/yarn-site.xml on each core and task node, and then reduce the values for yarn.nodemanager.localizer.cache.cleanup.interval and For example:

sudo vim /etc/hadoop/conf/yarn-site.xml

yarn.nodemanager.localizer.cache.cleanup.interval-ms 400000 5120

2.    To restart the NodeManager service, run the following commands on each core and task node:

sudo stop hadoop-yarn-nodemanager
sudo start hadoop-yarn-nodemanager

Note: In Amazon EMR release versions 5.21.0 and later, you can also use a configuration object, similar to the following, to override the cluster configuration or specify additional configuration classifications for a running cluster. For more information, see Reconfigure an instance group in a running cluster.

To set the cleanup interval and maximum disk space size on a new cluster, add a configuration object similar to the following when you launch the cluster:

      "Classification": "yarn-site",
     "Properties": {
       "yarn.nodemanager.localizer.cache.cleanup.interval-ms": "400000",
       "": "5120"

Remember that the deletion service doesn't complete on running containers. This means that even after you adjust the user cache retention settings, data might still be spilling to the following path and filling up the file system:

{'yarn.nodemanager.local-dirs'}/usercache/user/appcache/application_id ,

Option 2: Scale up the EBS volumes on the EMR cluster nodes

To scale up storage on a running EMR cluster, see Dynamically scale up storage on Amazon EMR clusters.

To scale up storage on a new EMR cluster, specify a larger volume size when you create the EMR cluster. You can also do this when you add nodes to an existing cluster:

  • Amazon EMR release version 5.22.0 and later: The default amount of EBS storage increases based on the size of the Amazon Elastic Compute Cloud (Amazon EC2) instance. For more information about the amount of storage and number of volumes allocated by default for each instance type, see Default Amazon EBS storage for instances.
  • Amazon EMR release versions 5.21 and earlier: The default EBS volume size is 32 GB. Of this amount, 27 GB is reserved for the /mnt partition. HDFS, YARN, the user cache, and all applications use the /mnt partition. Increase the size of your EBS volume as needed (for example, 100-500 GB or more). You can also specify multiple EBS volumes. Multiple EBS volumes will be mounted as /mnt1, /mnt2, and so on.

For Spark streaming jobs, you can also perform an unpersist ( RDD.unpersist()) when processing is done and the data is no longer needed. Or, explicitly call System.gc() in Scala ( sc._jvm.System.gc() in Python) to start JVM garbage collection and remove the intermediate shuffle files.

Did this article help?

Do you need billing or technical support?