.. _spark-f:

Sparkling Water Troubleshooting
===============================

This guide lists steps to troubleshoot failing Sparkling Water clusters on Hadoop.

Impersonation
-------------

The most common issue is that impersonation is not set up, or was set up incorrectly. Follow the :ref:`impersonation` guide to set up impersonation. Remember that for the changes to take effect, the configuration must be applied to all nodes using Cloudera Manager (or a similar tool) and the HDFS service must be restarted. To verify that the changes were applied successfully, open a console on the Steam host and check the contents of ``/etc/hadoop/conf/core-site.xml``.

Hadoop Conf Dir
---------------

In some cases, especially when multiple Spark versions are installed on the same host, Spark may pick up the wrong configuration file. If you notice that Sparkling Water clusters fail after around 20 minutes, change the Steam configuration: as a Steam administrator, navigate to **Configuration-Sparkling Water** and toggle the **Override Hadoop Conf Dir** option. Restart Steam and try again.

Spark Submit
------------

A great validation step that uncovers most issues is to submit a test job using the ``spark-submit`` command on the Steam host.

First, obtain the current Steam configuration values, because they will be used in the testing command:

- ``HADOOP_CONF_DIR`` from **Configuration-Hadoop**
- ``SPARK_HOME`` from **Configuration-Sparkling Water**
- ``PRINCIPAL`` from **Configuration-Hadoop** (only in Kerberized environments)
- ``KEYTAB`` from **Configuration-Hadoop** (only in Kerberized environments)

Second, locate a JAR file that contains the Spark examples (``EXAMPLES_JAR``). It is usually located relative to your ``SPARK_HOME``, for example ``/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/jars/spark-example*.jar``.

Third, pick the ``USERNAME`` of a real Hadoop user that will use Enterprise Steam.
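If you are not sure where the examples JAR lives, a small shell helper can search for it. This is only a sketch, not part of Steam: the ``find_examples_jar`` name and the default search root (the parent of ``SPARK_HOME``) are assumptions, so adjust the root for your distribution.

.. code-block:: bash

    # Print the first Spark examples JAR found under a root directory.
    # The root defaults to the parent of SPARK_HOME when not given.
    find_examples_jar() {
        find "${1:-${SPARK_HOME}/..}" -name 'spark-example*.jar' 2>/dev/null | head -n 1
    }

    # Typical use, once SPARK_HOME is exported:
    #   EXAMPLES_JAR=$(find_examples_jar)
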
With the gathered information, open a terminal on the machine that hosts Enterprise Steam:

1. In a Kerberized environment, run ``kinit`` using the principal and keytab obtained in the previous step. For example:

   .. code-block:: bash

       kinit -kt steam.keytab -p steam@H2OAI.LOC

2. Export the configuration. For example:

   .. code-block:: bash

       export SPARK_HOME=/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/spark
       export HADOOP_CONF_DIR=/etc/hadoop/conf
       export MASTER="yarn"

3. Submit the testing job:

   .. code-block:: bash

       spark-submit \
           --master yarn \
           --deploy-mode cluster \
           --class org.apache.spark.examples.SparkPi \
           --proxy-user USERNAME \
           EXAMPLES_JAR 10

   For example:

   .. code-block:: bash

       spark-submit \
           --master yarn \
           --deploy-mode cluster \
           --class org.apache.spark.examples.SparkPi \
           --proxy-user john12 \
           /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/jars/spark-example*.jar 10

The job should be accepted by the Hadoop cluster, transition to the running state, and then reach the finished state in a short time. If you encounter any errors, share them with your Hadoop team for resolution.
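Before contacting the Hadoop team, it usually speeds up resolution to collect the YARN-side logs of the failed job. A minimal sketch using the standard YARN CLI (the ``collect_yarn_diagnostics`` helper name is an assumption; run it on the Steam host as the same user who submitted the job):

.. code-block:: bash

    # List recently failed or killed YARN applications, then print the
    # command for fetching the full logs of one of them.
    collect_yarn_diagnostics() {
        yarn application -list -appStates FAILED,KILLED
        echo "Then run: yarn logs -applicationId <application id from the list>"
    }
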