.. _h2o:

H2O on Hadoop
=============

The **H2O** page shows clusters created by the current user, the state of the cluster, and the cluster creation date.
From this page, you can launch a new H2O cluster, view the details of existing clusters, or delete a cluster.

**Note**: When Enterprise Steam is started for the first time, no clusters will appear in the UI.

.. figure:: images/h2o-blank.png
    :alt: Blank H2O clusters page

Launch Cluster
--------------

1. Select **Launch New Cluster**.

2. Select a Cluster Profile from the dropdown menu to use when setting up the new cluster.
   Cluster profiles are configured by the Steam administrator and provide the allowed minimum, maximum and default values for each options in a cluster profile.

3. Configure the new cluster.

 - **Cluster Name**: Specify a name for this cluster.
 - **H2O Version**: Specify the H2O version to use.
 - **Dataset parameters**: Optionally provide estimated dataset parameters described in section below. Cluster parameters will be preset to accommodate your dataset within selected profile limits.
 - **Maximum Idle Time [HRS]**: Specify the maximum number of hours that the cluster can be idle before shutting down.
 - **Maximum Uptime [HRS]**: Specify the maximum number of hours that the cluster can be running.
 - **YARN Queue**: (Optional) If instructed by your Steam administrator select or enter the name of YARN queue that will be used for this H2O cluster.
   Note that the YARN queue cannot contain spaces. Leave this empty to use the default YARN queue.
 - **Save cluster data**: Only available if cluster saving is enabled by administrator in the selected profile. Choose whether you want to persist cluster data on cluster reaching uptime or idle time limit.
 - **Fault tolerant Grid Search**: Only available if cluster saving is enabled by administrator in the selected profile. Allow Enterprise Steam to backup Grid Search data. If the cluster fails during a Grid Search model training, Enterprise Steam will attempt to restart the cluster and continue with training.

4. Optionally provide estimated dataset parameters

 - **Dataset file format**: Choose a type of file format which serves as a data source for your experiment. Leave **unknown** value if you are not sure.
 - **Dataset size [GB]**: If the **Raw format** is chosen in the previous step, provide an estimated size of your dataset file.
 - **Number of rows**: Provide estimated number of rows of your dataset.
 - **Number of cols**: Provide estimated number of columns of your dataset.
 - **Using XGBoost**: Choose only if you plan to use XGBoost algorithm.


.. figure:: images/cluster-sizer.png
    :scale: 30
    :align: center
    :alt: Set dataset parameters
 
5. Optionally specify the following additional advanced options.

 - **Number of Nodes**: Specify the number of nodes of the H2O cluster.
 - **Memory per Node [GB]**: Specify the amount of memory allocated to a single node of the H2O cluster.
 - **Extra Memory [%]**: Specify the extra memory allocated to a single node as a percentage of memory per node.
   Algorithms like XGBoost use this additional memory, and you may need to increase this value if you are unable to build XGBoost models.
 - **H2O Threads per Node**: Specify the number of threads (CPUs) to use per node.
 - **YARN Virtual Cores per Node**: Specify the number of YARN virtual cores per node that will be requested from the YARN resource scheduler.
 - **Leader Node ID**: Specify which node of the H2O cluster you want to connect to. Nodes are indexed starting from 0.
 - **Startup Timeout [SEC]**: Specify the timeout duration (in seconds) to wait for the cluster to form before failing.

.. figure:: images/h2o-launch.png
    :alt: Launch new cluster
    
5. Click the **Launch New Cluster** button to launch a new cluster.

Upon successful validation of parameters, the cluster will begin starting and you will be taken back to the :ref:`h2o` page. It takes up to 5 minutes for H2O cluster to launch.

Cancel Cluster
--------------

To cancel a cluster in with status **Starting**, click the **Actions > Cancel** option. Confirm the cancellation by clicking the **Yes, Stop** button.
Cluster will transition to the **Stopped** state.

Running Cluster
---------------

Once the H2O cluster is up and running you can access H2O Flow or manage the cluster through the **Actions** button.

.. figure:: images/h2o-running.png
   :alt: H2O clusters page

Accessing H2O Flow
------------------

Once the cluster has started you may click on the cluster name. This opens H2O Flow in a new tab.

.. figure:: images/flow.png
   :alt: H2O Flow UI

Use the menu items at the top to import/upload your data into Flow and to build and score models.

- The **Data** dropdown allows you to import or upload a dataset, import SQL table, split or merge frames, and impute data.

  .. figure:: images/flow-data.png
     :alt: H2O Flow data menu

- Use the **Model** dropdown to select an algorithm and begin building models or to import/export models.

  .. figure:: images/flow-model.png
     :alt: H2O Flow model menu

Refer to the `H2O Flow documentation <http://docs.h2o.ai/h2o/latest-stable/h2o-docs/flow.html>`__ for more information on how to use H2O Flow.

Cluster Details
---------------

To view the details of a cluster, click the **Actions > Detail** option.
The cluster detail displays the following information:

.. figure:: images/h2o-detail.png
    :alt: H2O cluster details

Cluster Events
--------------

To view the events of a cluster, click the **Actions > Events** option.

.. figure:: images/h2o-events.png
    :alt: H2O cluster events

Cluster Logs
------------

You can see H2O driver, H2O, YARN and ML-Autodoc logs by clicking on the **Actions > Logs** option.
On this page you may download a complete log bundle for troubleshooting.

**Note**: YARN logs are not available when the cluster is running.

.. figure:: images/h2o-logs.png
   :alt: H2O cluster logs

Documentation
-------------

If the **Actions > Documentation** option is available it will take you to up-to-date documentation of H2O.

Launch Copy of Cluster
----------------------

You can launch a copy of a cluster by clicking the **Actions > Launch Copy** option.
You must give the cluster a name before you can launch it.

Stopping Clusters
-----------------

To stop a **Running** cluster, click the **Actions > Stop** option. When the confirmation window appears, click the **Yes, Stop** button to stop the cluster.

You can also choose whether to save cluster data if your profile allows it. If chosen, such cluster can be restarted and its data recovered.

.. figure:: images/hadoop-cluster-stop.png
    :scale: 30
    :align: center
    :alt: Stop cluster dialog

Restarting saved Clusters
-------------------------

Cluster that has been saved can be restarted and its data recovered by clicking on **Actions > Start** option. Following limitations apply based on used H2O version.

Cluster saving limitations
**************************

 - saved data can be loaded only into clusters with the same version of H2O
 - **3.32.0.1 and earlier versions**: all models are saved and restored
 - **3.32.0.2 and later**: all models and grids are saved and restored
 - **3.34.0.1 and later**: all models, grids and frames of data are saved and restored

.. figure:: images/cluster-start.png
    :scale: 40
    :align: center
    :alt: Stop cluster dialog

Marking Clusters as Failed
--------------------------

On a rare occasion, a cluster might get stuck in a **Stopping** state due to infrastructure failure. To manually resolve this situation, you can mark cluster as **Failed** with **Actions > Mark as failed**.


Deleting Clusters
-----------------

To delete a cluster, click the **Actions > Delete** option beside the cluster that you want to delete, then confirm the request.