Kubernetes Examples
===================

This section provides complete examples of using the Enterprise Steam Python client on Kubernetes.

Launching and connecting to H2O cluster
---------------------------------------

This example shows how to log in to Steam and launch an H2O cluster with 4 nodes, each with 4 CPUs and 10 GB of memory.
The cluster uses H2O version 3.28.0.2 and the profile called ``default-h2o``.
All other H2O parameters are pre-filled according to the selected profile.
Once the cluster is up, we connect to it and import data.

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import H2oKubernetesClient

    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    cluster = H2oKubernetesClient().launch_cluster(name="test-cluster",
                                                   profile_name="default-h2o",
                                                   version="3.28.0.2",
                                                   node_count=4,
                                                   cpu_count=4,
                                                   gpu_count=0,
                                                   memory_gb=10)
    cluster.connect()
    airlines = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip"
    airlines_df = h2o.import_file(path=airlines)
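
The examples in this section hard-code the access token for brevity. Below is a minimal sketch of reading the login parameters from the environment instead. The variable names ``STEAM_URL``, ``STEAM_USER`` and ``STEAM_TOKEN`` and the helper ``steam_login_kwargs`` are hypothetical; only the keyword names passed to ``h2osteam.login`` are taken from the example above.

```python
import os


def steam_login_kwargs(env=None):
    """Build keyword arguments for h2osteam.login from environment variables.

    STEAM_URL, STEAM_USER and STEAM_TOKEN are hypothetical names chosen for
    this sketch; they are not defined by Enterprise Steam.
    """
    env = os.environ if env is None else env
    required = ("STEAM_URL", "STEAM_USER", "STEAM_TOKEN")
    # Fail early with a clear message if any value is absent or empty.
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "url": env["STEAM_URL"],
        "username": env["STEAM_USER"],
        "password": env["STEAM_TOKEN"],
        "verify_ssl": True,
    }
```

With the variables set, the login call would become ``h2osteam.login(**steam_login_kwargs())``.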

Providing dataset parameters to preset cluster size
---------------------------------------------------

This example shows how to launch an H2O cluster by providing dataset information.
If you are not sure how to size your cluster, you can provide either ``dataset_size_gb`` (for a raw data source) or a ``dataset_dimension`` tuple (for a compressed data source), and use the ``using_xgboost`` parameter to specify whether you are going to run the XGBoost algorithm on the cluster.
Setting these parameters sizes the cluster accordingly.
If your profile does not allow the recommended resources to be allocated, the maximum allowed resources are used.
Any user-specified values of ``nodes``, ``node_memory_gb``, or ``extra_memory_percent`` override the recommended values.

Example using ``dataset_size_gb`` when using a CSV file as a data source:

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import H2oKubernetesClient

    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    cluster = H2oKubernetesClient().launch_cluster(name="test-cluster",
                                                   profile_name="default-h2o",
                                                   version="3.28.0.2",
                                                   dataset_size_gb=20)

Example using ``dataset_dimension``, a tuple of (n_rows, n_cols), when using a compressed file (e.g. Parquet) as a data source:

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import H2oKubernetesClient

    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    cluster = H2oKubernetesClient().launch_cluster(name="test-cluster",
                                                   profile_name="default-h2o",
                                                   version="3.28.0.2",
                                                   dataset_dimension=(25000, 1250))
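
The choice between the two sizing parameters can be sketched as a small helper that builds the keyword arguments for ``launch_cluster``. The helper ``sizing_kwargs`` and the rule of detecting a compressed source by the ``.parquet`` extension are assumptions made for illustration; only the parameter names ``dataset_size_gb``, ``dataset_dimension`` and ``using_xgboost`` come from the text above.

```python
import os

# Extensions treated as compressed sources in this sketch (an assumption).
COMPRESSED_EXTENSIONS = {".parquet"}


def sizing_kwargs(path, size_gb=None, n_rows=None, n_cols=None, using_xgboost=False):
    """Return sizing keyword arguments for launch_cluster based on the file type."""
    extension = os.path.splitext(path)[1].lower()
    if extension in COMPRESSED_EXTENSIONS:
        # Compressed source: describe the data by its dimensions.
        if n_rows is None or n_cols is None:
            raise ValueError("n_rows and n_cols are required for compressed sources")
        kwargs = {"dataset_dimension": (n_rows, n_cols)}
    else:
        # Raw source such as CSV: describe the data by its size on disk.
        if size_gb is None:
            raise ValueError("size_gb is required for raw sources")
        kwargs = {"dataset_size_gb": size_gb}
    kwargs["using_xgboost"] = using_xgboost
    return kwargs
```

The result could then be unpacked into the call, for example ``launch_cluster(name="test-cluster", profile_name="default-h2o", version="3.28.0.2", **sizing_kwargs("airlines.csv", size_gb=20))``.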

Connecting to existing H2O cluster
----------------------------------

This example shows how to log in to Steam, connect to an existing H2O cluster called ``test-cluster``, and import data.

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import H2oKubernetesClient

    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    cluster = H2oKubernetesClient().get_cluster("test-cluster")
    cluster.connect()
    airlines = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip"
    airlines_df = h2o.import_file(path=airlines)

Launching and connecting to Driverless AI instance
--------------------------------------------------

This example shows how to create an instance of Driverless AI v1.8.4.1, connect to it, and upload a dataset.

.. code-block:: python

    import h2osteam
    from h2osteam.clients import DriverlessClient

    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    instance = DriverlessClient().launch_instance(name="test-instance",
                                                  version="1.8.4.1",
                                                  profile_name="default-driverless-kubernetes")
    client = instance.connect()

    # Import the iris dataset
    ds = client.datasets.create(
        data='s3://h2o-public-test-data/smalldata/iris/iris.csv',
        data_source='s3'
    )

Connecting to existing Driverless AI instance
---------------------------------------------

This example shows how to connect to an existing Driverless AI instance called ``test-instance`` and upload a dataset.

.. code-block:: python

    import h2osteam
    from h2osteam.clients import DriverlessClient

    h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="access-token-here", verify_ssl=True)
    instance = DriverlessClient().get_instance(name="test-instance")
    client = instance.connect()

    # Import the iris dataset
    ds = client.datasets.create(
        data='s3://h2o-public-test-data/smalldata/iris/iris.csv',
        data_source='s3'
    )

Managing multiple Steam connections in one session
--------------------------------------------------

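This example shows how to work with multiple Steam connections in one session.
Each ``h2osteam.login`` call returns a connection object, which is passed to the client that should act on behalf of that user.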

.. code-block:: python

    import h2osteam
    from h2osteam.clients import DriverlessClient

    adam_steam = h2osteam.login(url="https://steam.h2o.ai:9555", username="adam", password="adams-token-here", verify_ssl=True)

    # Initialize client with steam object
    adam_dai_client = DriverlessClient(adam_steam)

    # Use client to start an instance
    adam_instance = adam_dai_client.launch_instance(name="adams-instance",
                                                    version="1.9.2.0",
                                                    profile_name="default-driverless-kubernetes")

    # Start an instance as a different user within the same session
    ben_steam = h2osteam.login(url="https://steam.h2o.ai:9555", username="ben", password="bens-token-here", verify_ssl=True)
    ben_dai_client = DriverlessClient(ben_steam)
    ben_dai_client.launch_instance(name="bens-instance",
                                   version="1.9.2.0",
                                   profile_name="default-driverless-kubernetes")

    # Terminate instance of the first user
    adam_instance.terminate()
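
When an instance is only needed for the duration of a script, it can help to guarantee that it is terminated even if an error occurs along the way. Below is a minimal sketch using a context manager. The helper ``terminating`` is hypothetical and relies only on the ``terminate()`` method shown in the example above.

```python
from contextlib import contextmanager


@contextmanager
def terminating(instance):
    """Yield the instance and call its terminate() method on exit, even on error."""
    try:
        yield instance
    finally:
        # Runs whether the body finished normally or raised an exception.
        instance.terminate()
```

It would be used as ``with terminating(adam_dai_client.launch_instance(...)) as adam_instance:``, with the work on the instance placed inside the ``with`` block.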