How do I resolve "OutOfMemoryError" Hive Java heap space exceptions on Amazon EMR that occur when Hive outputs the query results?

8 minute read
0

I'm running an Apache Hive query on Amazon EMR. Hive throws an OutOfMemoryError exception while outputting the query results.

Short description

The OutOfMemoryError exception usually happens during INSERT OVERWRITE commands when there's not enough heap space on hive-server2, the Hive metastore, or the client side. To resolve this issue, increase the maximum memory allocation for the JVM or increase HADOOP_HEAPSIZE.

Resolution

Use one or more of the following solutions to resolve OutOfMemoryError exceptions.

Note: These solutions don't cover OutOfMemoryError exceptions that occur during Apache Tez container memory tuning.

Increase the maximum memory allocation for the JVM

When you launch a Hive shell, 1 GB of memory is allocated by default. The maximum memory allocation is defined by the -Xmx parameter. If your process attempts to use more than the maximum value, Hive kills the process and throws the OutOfMemoryError exception. To resolve this issue, increase the -Xmx value in the Hive shell script (in MB), and then run your Hive query again.

Identify OutOfMemoryError in logs

Check for the following error in your log location (For example: /mnt/var/log/hive/user/hadoop/hive.log):

# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
#   Executing /bin/sh -c "kill -9 12345"...
Killed

If you see this error message, the JVM heap space is running out of memory. Increase HADOOP_HEAPSIZE for the Hive CLI service in /etc/hive/conf/hive-env.sh, as appears in the following example. You can also increase the HADOOP_HEAPSIZE using the Amazon EMR reconfiguration API for hive-env classification. The default value is 1000. Increase it as appropriate for your use case. Then, run the Hive query again.

export HADOOP_HEAPSIZE=2048

Important: This setting applies to hive-server2, the Hive metastore, and the Hive CLI after these services are restarted. Optionally, you can set separate values for each of these services.

Updating memory allocation when creating a cluster

Amazon EMR provides the configuration API operation to update the default configuration when you update the cluster using the configuration object. Update the HADOOP_HEAPSIZE value based on your use case:

{
  "Classification": "hive-env",
  "Properties": {},
  "Configurations": [
    {
      "Classification": "export",
      "Properties": {
        "HADOOP_HEAPSIZE": "2048"
      },
      "Configurations": []
    }
  ]
}

Updating memory allocation on a running cluster

You can increase the value of HADOOP_HEAPSIZE on a running cluster by either connecting to the master mode using SSH or Amazon EMR reconfiguration API. The Amazon EMR reconfiguration API is available only for Amazon EMR releases 5.21 and later. If the memory updates are made by SSH into the cluster, you must restart the hive services for the changes to take effect.

Beeline or SQL Workbench/J

1.    If you're running the same query from Beeline or SQL Workbench/J, check /mnt/var/log/hive/hive-server2.log and hive-server2.out for heap space garbage collection errors. For example:

Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://emr-analytics_master.abc.aws.> select id, name, x.* from mydb.location a, curated_admin.nxpepnd1_tpn_prvdr_pra_fclt b, curated_admin.test_table c where a.test = b._id and a._id = b._id and b._prod_id = c.prod_f_id;
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
#   Executing /bin/sh -c "kill -9 27745"...
Killed

2.    If you find errors similar to this, then increase HADOOP_HEAPSIZE for hive-server2 in /etc/hive/conf/hive-env.sh. This setting also applies to the Hive metastore and the Hive client.

export HADOOP_HEAPSIZE=2048

Optionally, use conditional statements to specify different heap sizes for hive-server2, the metastore, and the client. For example:

export HIVE_CLIENT_HEAPSIZE=1024
export HIVE_METASTORE_HEAPSIZE=2048
export HIVE_SERVER2_HEAPSIZE=3072 
if [ "$SERVICE" = "metastore" ]
then
export HADOOP_HEAPSIZE=$HIVE_METASTORE_HEAPSIZE
elif [ "$SERVICE" = "hiveserver2" ]
then
export HADOOP_HEAPSIZE=$HIVE_SERVER2_HEAPSIZE
export HADOOP_OPTS="$HADOOP_OPTS -server -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hive"
else
export HADOOP_HEAPSIZE=$HIVE_CLIENT_HEAPSIZE
fi

3.    Run the Hive query again after you update these settings. If you still get the OutOfMemoryError exception and you're running multiple clients at the same time, continue to step 4. If you still get the OutOfMemoryError exception and you aren't running multiple clients at the same time, skip to step 8.

4.    Increase the -Xmx parameters for each client, as appropriate for your use case.

5.    Choose the garbage collector that's appropriate for your use case by adding -XX:+UseParNewGC (new parallel garbage collector) or -XX:+UseConcMarkSweepGC (concurrent mark sweep garbage collector) in the HADOOP_OPTS lines, as shown in the following example. For more information about choosing a garbage collector, see Garbage collection and Java HotSpot VM Options in the Java documentation.

export HADOOP_HEAPSIZE=2048
if [ "$SERVICE" = "cli" ]; then
if [ -z "$DEBUG" ]; then
export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xmx12288m -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:+useParNewGC -XX:-useGCOverheadLimit"
else
export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xmx12288m -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-useGCOverheadLimit"
fi
fi

6.    Hive throws the OutOfMemoryError exception if garbage collection doesn't succeed in a specified amount of time. To remove the time limit, remove -XX:-UseGCOverheadLimit, or replace it with -XX:+UseGCOverheadLimit. Optionally, modify -XX:-UseGCOverheadLimit to specify a new time limit for garbage collection. For more information, see The Parallel Collector in the Java documentation.

7.    Run the Hive query again. If Hive throws a heap space error on the terminal during runtime—and if there are no errors in hive.log or hive-server2.log—then your Hive client is probably running out of memory. For example:

Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://emr-analytics_master.abc.aws.> select id, name, x.* from mydb.location a, curated_admin.nxpepnd1_tpn_prvdr_pra_fclt b, curated_admin.test_table c where a.test = b._id and a._id = b._id and b._prod_id = c.prod_f_id;
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
#   Executing /bin/sh -c "kill -9 27745"...
Killed

8.    To resolve this error, increase the client memory, and then run the query again.

9.    If Hive still throws the OutOfMemoryError exception, then complete the following steps for your client.

Beeline:

By default, Beeline attempts to buffer the entire output relation before printing it to stdout. This behavior can cause an OutOfMemoryError exception when the output relation is large. To resolve the OutOfMemoryError exception in Beeline, launch Beeline using the following command, and then retry the Hive query:

beeline --incremental=true

SQL Workbench/J:

In a 32-bit Java Runtime Environment (JRE), the application can use up to 1 GB of memory by default. In a 64-bit JRE, the application can use up to 65% of the available physical memory by default. To verify the amount of memory available to the application, choose Help and then choose About.

  • For macOS: Increase the -Xmx1024m value in the Info.plist file as needed. Info.plist is usually located in the /Applications/SQLWorkbenchJ2.app/Contents directory. For example, to double the amount of memory available to the application, change the value from -Xmx1024m to -Xmx2048m. Then, run the query again.
  • For Windows: Create an INI file and then add the vm.heapsize.preferred parameter to the INI file to increase the amount of memory that is available to the application.

If you're using shell or batch scripts, you can increase the available memory when you install SQL Workbench/J. The command in the following example creates 3 GB of available memory during installation:

java -Xmx3g -jar sqlworkbench.jar

Note: If the OutOfMemoryError exception is on the client side rather than on hive-server2 or the Hive CLI, save the output to Amazon Simple Storage Service (Amazon S3) or HDFS. Don’t use Beeline or SQL Workbench/J to view the query results.

Restarting Hive

If you modify Hive properties in hive-site.xml or hive-env.sh, you might need to restart Hive before the updated settings take effect. Restarting hive-server 2 impacts the running queries on the cluster. It's a best practice to restart the processes when no queries are running or during a planned maintenance window.

The commands that you use to restart hive processes vary depending on the Amazon EMR release version. For more information, see How do I restart a service in Amazon EMR?

For Amazon EMR release versions 5.30 and later or 6.0 and later:

1.    Connect to the master node using SSH.

2.    Restart the metastore:

sudo systemctl stop hive-hcatalog-server
sudo systemctl start hive-hcatalog-server
sudo systemctl status hive-hcatalog-server

3.    Restart hive-server 2:

sudo systemctl stop hive-server2
sudo systemctl start hive-server2
sudo systemctl status hive-server2

For Amazon EMR release versions 4.7.0 to 5.29:

1.    Connect to the master node using SSH.

2.    Restart the metastore:

sudo stop hive-hcatalog-server
sudo start hive-hcatalog-server
sudo status hive-hcatalog-server

Note: Don't attempt to restart the metastore using the sudo restart hive-hcatalog-server command.

3.    Restart hive-server2:

sudo stop hive-server2
sudo start hive-server2
sudo status hive-server2

For Amazon EMR release versions 4.0.0 to 4.6:

1.    Connect to the master node using SSH.

2.    Restart the metastore:

sudo stop hive-metastore
sudo start hive-metastore
sudo status hive-metastore

3.    Restart hive-server2:

sudo stop hive-server2
sudo start hive-server2
sudo status hive-server2

Related information

How can I use logs to troubleshoot issues with Hive queries in Amazon EMR?

Scaling cluster resources

Task configuration

AWS OFFICIAL
AWS OFFICIALUpdated 2 years ago