Kubernetes Examples
===================

This section provides complete examples of using the Enterprise Steam Python client on Kubernetes.

Launching and connecting to H2O cluster
---------------------------------------

This example shows how to log in to Steam and launch an H2O cluster with 4 nodes and 10 GB of memory per node.
The cluster uses H2O version 3.28.0.2 and the profile called ``default-h2o``.
All other H2O parameters are pre-filled according to the selected profile.
Once the cluster is up, we connect to it and start importing data.

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import H2oKubernetesClient

    # Log in to Steam
    h2osteam.login(url="https://steam.h2o.ai:9555",
                   username="user01",
                   password="access-token-here",
                   verify_ssl=True)

    # Launch a 4-node cluster with 10 GB of memory per node
    cluster = H2oKubernetesClient().launch_cluster(name="test-cluster",
                                                   profile_name="default-h2o",
                                                   version="3.28.0.2",
                                                   node_count=4,
                                                   cpu_count=4,
                                                   gpu_count=0,
                                                   memory_gb=10)

    # Connect to the running cluster and import data
    cluster.connect()

    airlines = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip"
    airlines_df = h2o.import_file(path=airlines)

Providing dataset parameters to preset cluster size
---------------------------------------------------

This example shows how to launch an H2O cluster by providing dataset information.
If you are not sure how to size your cluster, you can provide either ``dataset_size_gb`` (for a raw data source)
or a ``dataset_dimension`` tuple (for a compressed data source), and use the ``using_xgboost`` parameter to specify
whether you are going to run the XGBoost algorithm on the cluster.
Setting these parameters sizes the cluster accordingly.
If your profile does not allow the recommended resources to be allocated, the maximum allowed resources are used.
Any user-specified values of ``nodes``, ``node_memory_gb``, or ``extra_memory_percent`` override the recommended values.

Example using ``dataset_size_gb`` with a CSV file as the data source:

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import H2oKubernetesClient

    h2osteam.login(url="https://steam.h2o.ai:9555",
                   username="user01",
                   password="access-token-here",
                   verify_ssl=True)

    # Size the cluster for a 20 GB raw dataset
    cluster = H2oKubernetesClient().launch_cluster(name="test-cluster",
                                                   profile_name="default-h2o",
                                                   version="3.28.0.2",
                                                   dataset_size_gb=20)

Example using ``dataset_dimension``, a tuple of (n_rows, n_cols), with a compressed file (e.g. Parquet) as the data source:

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import H2oKubernetesClient

    h2osteam.login(url="https://steam.h2o.ai:9555",
                   username="user01",
                   password="access-token-here",
                   verify_ssl=True)

    # Size the cluster for a dataset of 25000 rows and 1250 columns
    cluster = H2oKubernetesClient().launch_cluster(name="test-cluster",
                                                   profile_name="default-h2o",
                                                   version="3.28.0.2",
                                                   dataset_dimension=(25000, 1250))

Connecting to existing H2O cluster
----------------------------------

This example shows how to log in to Steam, connect to an existing H2O cluster called ``test-cluster``, and import data.

.. code-block:: python

    import h2o
    import h2osteam
    from h2osteam.clients import H2oKubernetesClient

    h2osteam.login(url="https://steam.h2o.ai:9555",
                   username="user01",
                   password="access-token-here",
                   verify_ssl=True)

    # Look up the running cluster by name and connect to it
    cluster = H2oKubernetesClient.get_cluster("test-cluster")
    cluster.connect()

    airlines = "http://s3.amazonaws.com/h2o-public-test-data/smalldata/airlines/allyears2k_headers.zip"
    airlines_df = h2o.import_file(path=airlines)

Launching and connecting to Driverless AI instance
--------------------------------------------------

This example shows how to create an instance of Driverless AI v1.8.4.1, connect to it, and upload a dataset.

.. code-block:: python

    import h2osteam
    from h2osteam.clients import DriverlessClient

    h2osteam.login(url="https://steam.h2o.ai:9555",
                   username="user01",
                   password="access-token-here",
                   verify_ssl=True)

    instance = DriverlessClient().launch_instance(name="test-instance",
                                                  version="1.8.4.1",
                                                  profile_name="default-driverless-kubernetes")

    client = instance.connect()

    # Import the iris dataset
    ds = client.datasets.create(
        data='s3://h2o-public-test-data/smalldata/iris/iris.csv',
        data_source='s3'
    )

Connecting to existing Driverless AI instance
---------------------------------------------

This example shows how to connect to an existing Driverless AI instance called ``test-instance`` and upload a dataset.

.. code-block:: python

    import h2osteam
    from h2osteam.clients import DriverlessClient

    h2osteam.login(url="https://steam.h2o.ai:9555",
                   username="user01",
                   password="access-token-here",
                   verify_ssl=True)

    instance = DriverlessClient().get_instance(name="test-instance")

    client = instance.connect()

    # Import the iris dataset
    ds = client.datasets.create(
        data='s3://h2o-public-test-data/smalldata/iris/iris.csv',
        data_source='s3'
    )

Managing multiple Steam connections in one session
--------------------------------------------------

This example shows how to manage multiple Steam connections in one session.
Each call to ``h2osteam.login`` returns a connection object, which is passed to the client so that every client acts on behalf of its own user.

.. code-block:: python

    import h2osteam
    from h2osteam.clients import DriverlessClient

    adam_steam = h2osteam.login(url="https://steam.h2o.ai:9555",
                                username="adam",
                                password="adams-token-here",
                                verify_ssl=True)

    # Initialize client with steam object
    adam_dai_client = DriverlessClient(adam_steam)

    # Use client to start an instance
    adam_instance = adam_dai_client.launch_instance(name="adams-instance",
                                                    version="1.9.2.0",
                                                    profile_name="default-driverless-kubernetes")

    # Start an instance with different user within the same session
    ben_steam = h2osteam.login(url="https://steam.h2o.ai:9555",
                               username="ben",
                               password="bens-token-here",
                               verify_ssl=True)
    ben_dai_client = DriverlessClient(ben_steam)
    ben_dai_client.launch_instance(name="bens-instance",
                                   version="1.9.2.0",
                                   profile_name="default-driverless-kubernetes")

    # Terminate instance of the first user
    adam_instance.terminate()
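
The examples in this section pass access tokens as string literals, which becomes harder to manage once several users log in within one session. The following is a minimal sketch, not part of the Steam client, of a helper that builds the keyword arguments for ``h2osteam.login`` from environment variables. The helper name ``steam_login_kwargs`` and the variable names (``<PREFIX>_URL``, ``<PREFIX>_USERNAME``, ``<PREFIX>_TOKEN``, ``<PREFIX>_VERIFY_SSL``) are assumptions made for illustration; only the ``url``, ``username``, ``password``, and ``verify_ssl`` arguments come from the examples above.

```python
import os


def steam_login_kwargs(prefix="STEAM", env=None):
    """Build keyword arguments for h2osteam.login() from environment variables.

    The variable names used here are hypothetical; Steam does not define them.
    """
    env = os.environ if env is None else env

    # Fail early and name every required variable that is missing or empty.
    missing = [key for key in ("URL", "USERNAME", "TOKEN")
               if not env.get(f"{prefix}_{key}")]
    if missing:
        names = ", ".join(f"{prefix}_{key}" for key in missing)
        raise KeyError(f"missing environment variables: {names}")

    # Verify TLS certificates unless explicitly disabled.
    flag = env.get(f"{prefix}_VERIFY_SSL", "true").strip().lower()
    verify_ssl = flag not in ("0", "false", "no")

    return {
        "url": env[f"{prefix}_URL"],
        "username": env[f"{prefix}_USERNAME"],
        "password": env[f"{prefix}_TOKEN"],
        "verify_ssl": verify_ssl,
    }
```

Under these assumptions, a per-user login could be written as ``adam_steam = h2osteam.login(**steam_login_kwargs("ADAM"))``, with ``ADAM_URL``, ``ADAM_USERNAME``, and ``ADAM_TOKEN`` set in the environment.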