h2osteam.clients¶

h2osteam.clients.admin¶

`admin_client`¶

class h2osteam.clients.admin.admin_client.AdminClient¶

static delete_user_resources(username)¶

Delete the Driverless AI instances and H2O Kubernetes clusters of the given user.

Parameters: username – Name of the user.
Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.delete_user_resources(username="adam")

static import_h2o_engine(path)¶

Import H2O engine to Steam.

Imports H2O engine from Steam server and makes it available to users.

Parameters: path – Full path to the H2O engine on disk of the Steam server.
Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.import_h2o_engine("/tmp/h2o-3.26.0.6-cdh6.3.zip")

static import_python_environment(name, path)¶

Import an existing Python environment using the Python Pyspark Path.

Imports an existing Python environment to Steam using the path to the Python executable.

Parameters

name – Name of the new Python environment.
path – Full path to the python executable of the new Python environment.

Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.import_python_environment("python3", "/tmp/virtual-env/python3/bin/python")

static import_sparkling_engine(path)¶

Import Sparkling Water engine to Steam.

Imports Sparkling Water engine from Steam server and makes it available to users.

Parameters: path – Full path to the Sparkling Water engine on disk of the Steam server.
Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.import_sparkling_engine("/tmp/sparkling-water-3.28.0.1-1-2.4.zip")

static upload_conda_environment(name, path)¶

Upload Conda Python environment.

Uploads and imports an existing Python environment using a path to a conda-packed Conda Python environment.

Parameters

name – Name of the new Python environment.
path – Full path to the conda-packed Conda Python environment.

Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.upload_conda_environment("python3-conda", "/tmp/conda-python3.tar.gz")

static upload_h2o_engine(path)¶

Upload H2O engine to Steam.

Uploads H2O engine from local machine to the Steam server where it is imported and made available to users.

Parameters: path – Full path to the H2O engine on disk of the local machine.
Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.upload_h2o_engine("/tmp/h2o-3.26.0.6-cdh6.3.zip")

static upload_sparkling_engine(path)¶

Upload Sparkling Water engine to Steam.

Uploads Sparkling Water engine from local machine to the Steam server where it is imported and made available to users.

Parameters: path – Full path to the Sparkling Water engine on disk of the local machine.
Examples

>>> import h2osteam
>>> from h2osteam.clients import AdminClient
>>> url = "https://steam.example.com:9555"
>>> username = "AzureDiamond"
>>> password = "hunter2"
>>> h2osteam.login(url=url, username=username, password=password, verify_ssl=True)
>>> AdminClient.upload_sparkling_engine("/tmp/sparkling-water-3.28.0.1-1-2.4.zip")
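The import_* methods above read a path on the Steam server, while the upload_* methods send a file from the local machine. The following is a minimal sketch of a dispatcher for local uploads. The helper name upload_engine and the file-name prefix rule are assumptions made for illustration and are not part of h2osteam; the admin argument is expected to be AdminClient after a successful h2osteam.login().

```python
import os


def upload_engine(admin, path):
    """Upload a local engine archive with the matching AdminClient method.

    Hypothetical helper: the method is picked from the file name, assuming
    Sparkling Water archives start with "sparkling-water-" and H2O archives
    start with "h2o-", as in the examples above.
    """
    filename = os.path.basename(path)
    if filename.startswith("sparkling-water-"):
        admin.upload_sparkling_engine(path)
        return "sparkling"
    if filename.startswith("h2o-"):
        admin.upload_h2o_engine(path)
        return "h2o"
    raise ValueError("unrecognized engine archive: " + filename)
```

After logging in, a call such as upload_engine(AdminClient, "/tmp/h2o-3.26.0.6-cdh6.3.zip") would forward to upload_h2o_engine.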

h2osteam.clients.admink8s¶

`admin_kubernetes_client`¶

class h2osteam.typed_clients.admink8s.admin_kubernetes_client.AdminKubernetesClient(steam=None)¶

add_dai_image(version, image, image_pull_policy='Always', image_pull_secret='', experimental=False)¶

Adds DriverlessAI Docker image to Steam.

Parameters

version (str) – version of the DAI release like 1.9.2.2
image (str) – full image name to be pulled like gcr.io/vorvan/h2oai/dai-centos7-x86_64:1.9.2.2-cuda10.0
image_pull_policy (str) – kubernetes image pull policy for a pod running DriverlessAI image
image_pull_secret (str) – reference to a secret in the same namespace to use for pulling the image. This secret will be passed to individual puller implementations for them to use. For example, in the case of docker, only DockerConfig type secrets are honored.
experimental (bool) – experimental engine is not offered in the upgrade dialog and is listed last in the “create new” dropdown. Web UI only.

Return type: None
Returns: None

add_dai_python_client(url)¶

Uploads specified DriverlessAI Python client to Steam. The client version should be at least as new as the latest DAI version used in Steam.

Parameters: url (str) – URL to download DAI Python client from pypi.org like https://files.pythonhosted.org/packages/bd/65/05e5c6b5c8d32575e655d98dba182dee26ce04899df8d72d0ca7c409d92d/driverlessai-1.9.2.1.post1-py3-none-any.whl
Return type: None
Returns: None

add_h2o_image(version, image, image_pull_policy='Always', image_pull_secret='', experimental=False)¶

Adds H2O Docker image to Steam.

Parameters

version (str) – version of the H2O release like 3.32.1.3
image (str) – full image name to be pulled like h2oai/h2o-open-source-k8s:3.32.1.3
image_pull_policy (str) – kubernetes image pull policy for a pod running H2O image
image_pull_secret (str) – reference to a secret in the same namespace to use for pulling the image. This secret will be passed to individual puller implementations for them to use. For example, in the case of docker, only DockerConfig type secrets are honored.
experimental (bool) – experimental engine is not offered in the upgrade dialog and is listed last in the “create new” dropdown. Web UI only.

Return type: None
Returns: None

build_kubernetes_dai_profile(name, user_groups, instances_per_user, cpu_count, gpu_count, memory_gb, storage_gb, max_uptime_hours, max_idle_hours, timeout_seconds, license_manager_project_name='', config_toml='', allow_instance_config_toml=False, whitelist_instance_config_toml='', node_selector='', kubernetes_volumes=None, env='', custom_pod_labels='', custom_pod_annotations='', load_balancer_source_ranges='0.0.0.0/0', tolerations='', init_containers='', disabled=False, multinode=False, main_cpu_count=0, main_memory_gb=0, min_worker_count=0, max_worker_count=0, buffer_worker_count=0, worker_processor_count=0, worker_downscale_delay_seconds=0, main_processor_count=0, main_node_selector='')¶

Helper function to create a DriverlessAI Steam profile. For parameters requesting multiple values, provide a tuple of four values (minimal, maximal, initial, profile_maximum), with profile_maximum=-1 indicating no limit. If a parameter requests three values, that parameter does not support the profile_maximum limit.

Parameters

name (str) – Name of the profile.
user_groups (str) – Comma-separated list of groups assigned to this profile. Accepts wildcard ‘*’ character.
instances_per_user (int) – Limit the number of Driverless AI instances a single user can launch with this profile.
cpu_count (Tuple[int, int, int, int]) – Specify the number of cpu units. One cpu, in Kubernetes, is equivalent to 1 vCPU/Core for cloud providers and 1 hyperthread on bare-metal Intel processors.
gpu_count (Tuple[int, int, int, int]) – Specify the number of GPUs.
memory_gb (Tuple[int, int, int, int]) – Specify the amount of memory in GB.
storage_gb (Tuple[int, int, int, int]) – Specify the amount of storage in GB.
max_uptime_hours (Tuple[int, int, int]) – Specify the duration in hours after which the instance will automatically stop.
max_idle_hours (Tuple[int, int, int]) – Specify the duration in hours after which the instance will be automatically stopped if it has been idle for that long.
timeout_seconds (Tuple[int, int, int]) – Instance will terminate if it was unable to start within this time limit.
license_manager_project_name (str) – License manager project name.
config_toml (str) – Enter additional Driverless AI configuration in TOML format that will be applied over the standard config.toml.
allow_instance_config_toml (bool) – Allow users to override allow-listed Driverless AI configuration in TOML format.
whitelist_instance_config_toml (str) – Enter additional Driverless AI configuration in TOML format that will be available to user instances for override.
node_selector (str) – Enter Kubernetes labels (using ‘key: value’ format, one per line). Instances will be scheduled only on Kubernetes nodes with these labels. The most common usage is one key-value pair. Leave empty to use any node.
kubernetes_volumes – List Kubernetes volume names that are mounted to clusters started using this profile.
env (str) – Enter extra environmental variables passed to the DriverlessAI image (using ‘NAME=value’ format, one per line).
custom_pod_labels (str) – Extra Kubernetes labels attached to pods of this profile. Use ‘key: value’ format, one per line.
custom_pod_annotations (str) – Extra Kubernetes annotations attached to pods of this profile. Use ‘key: value’ format, one per line.
load_balancer_source_ranges (str) – Restrict CIDR IP addresses for a LoadBalancer type service. Use one address range in format ‘143.231.0.0/16’ per line. Only applies if Steam is running outside of a Kubernetes cluster.
tolerations (str) – DAI pods tolerations. Provide text in Kubernetes readable YAML format. Example value:

    tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
    - key: "key2"
      operator: "Exists"
      effect: "NoExecute"

init_containers (str) – Initialization containers belonging to the DAI pod. Example value:

    initContainers:
    - name: init-myservice
      image: busybox:1.28
      command: ['sh', '-c', "until nslookup myservice.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for myservice; sleep 2; done"]
    - name: init-mydb
      image: busybox:1.28
      command: ['sh', '-c', "until nslookup mydb.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for mydb; sleep 2; done"]

disabled (bool) – A disabled profile is listed to the user but cannot be used to start instances.
multinode (bool) – Enable multinode mode.
main_cpu_count (int) – Main server CPU count.
main_memory_gb (int) – Main server memory in GB.
min_worker_count (int) – Minimal number of worker nodes.
max_worker_count (int) – Maximal number of worker nodes.
buffer_worker_count (int) – Number of worker nodes to keep running when no jobs are running.
worker_processor_count (int) – Number of processors per worker node.
worker_downscale_delay_seconds (int) – Delay in seconds before worker node is terminated after job is finished.
main_processor_count (int) – Number of processors on the main node.
main_node_selector (str) – Node selector for the main node.

Return type

Profile

Returns

Profile
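The tuple parameters above are easy to get wrong by length or ordering. The following is a minimal sketch of a client-side pre-flight check. The helper name check_profile_tuple is hypothetical, and the rule that the initial value lies between minimal and maximal is an assumption; Steam applies its own validation.

```python
def check_profile_tuple(values, with_limit=True):
    """Validate a (minimal, maximal, initial[, profile_maximum]) tuple.

    Hypothetical helper, not part of h2osteam. Use with_limit=False for
    parameters that take only three values.
    """
    expected = 4 if with_limit else 3
    if len(values) != expected:
        raise ValueError("expected %d values, got %d" % (expected, len(values)))
    minimal, maximal, initial = values[:3]
    # Assumption: the initial value must fall inside [minimal, maximal].
    if not minimal <= initial <= maximal:
        raise ValueError("initial value must lie between minimal and maximal")
    # -1 means "no limit" according to the description above.
    if with_limit and values[3] < -1:
        raise ValueError("profile_maximum must be -1 or a non-negative number")
    return tuple(values)


cpu_count = check_profile_tuple((1, 8, 2, -1))
max_idle_hours = check_profile_tuple((1, 24, 8), with_limit=False)
```

The validated tuples can then be passed as cpu_count and max_idle_hours to the profile builder.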

build_kubernetes_h2o_profile(name, user_groups, instances_per_user, node_count, cpu_count, gpu_count, memory_gb, max_uptime_hours, max_idle_hours, timeout_seconds, java_options='', h2o_options='', env='', node_selector='', kubernetes_volumes=None, custom_pod_labels='', custom_pod_annotations='', tolerations='', init_containers='', disabled=False)¶

Helper function to create an H2O Kubernetes Steam profile. For parameters requesting multiple values, provide a tuple of four values (minimal, maximal, initial, profile_maximum), with profile_maximum=-1 indicating no limit. If a parameter requests three values, that parameter does not support the profile_maximum limit.

Parameters

name (str) – Name of this profile.
user_groups (str) – Comma-separated list of groups assigned to this profile. Accepts wildcard ‘*’ character.
instances_per_user (int) – Limit the number of H2O clusters a single user can launch with this profile.
node_count (Tuple[int, int, int, int]) – Specify the number of nodes.
cpu_count (Tuple[int, int, int, int]) – Specify the number of cpu units. One cpu, in Kubernetes, is equivalent to 1 vCPU/Core for cloud providers and 1 hyperthread on bare-metal Intel processors.
gpu_count (Tuple[int, int, int, int]) – Specify the number of GPUs per node.
memory_gb (Tuple[int, int, int, int]) – Specify the amount of memory in GB per node.
max_uptime_hours (Tuple[int, int, int]) – Specify the duration in hours after which the cluster will automatically stop. Provide a tuple of three values (minimal, maximal, initial).
max_idle_hours (Tuple[int, int, int]) – Specify the duration in hours after which the cluster will be automatically stopped if it has been idle for that long.
timeout_seconds (Tuple[int, int, int]) – Cluster will terminate if it was unable to start within this time limit.
java_options (str) – Extra command line options passed to Java.
h2o_options (str) – Extra command line options passed to H2O-3.
env (str) – Enter extra environmental variables passed to the H2O image (using ‘NAME=value’ format, one per line).
node_selector (str) – Enter Kubernetes labels (using ‘key: value’ format, one per line). Instances will be scheduled only on Kubernetes nodes with these labels. The most common usage is one key-value pair. Leave empty to use any node.
kubernetes_volumes – List Kubernetes volume names that are mounted to clusters started using this profile.
custom_pod_labels (str) – Extra Kubernetes labels attached to pods of this profile. Use ‘key: value’ format, one per line.
custom_pod_annotations (str) – Extra Kubernetes annotations attached to pods of this profile. Use ‘key: value’ format, one per line.
tolerations (str) – H2O pods tolerations. Provide text in Kubernetes readable YAML format. Example value:

    tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
    - key: "key2"
      operator: "Exists"
      effect: "NoExecute"

init_containers (str) – Initialization containers belonging to the H2O pod. Example value:

    initContainers:
    - name: init-myservice
      image: busybox:1.28
      command: ['sh', '-c', "until nslookup myservice.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for myservice; sleep 2; done"]
    - name: init-mydb
      image: busybox:1.28
      command: ['sh', '-c', "until nslookup mydb.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for mydb; sleep 2; done"]

disabled (bool) – A disabled profile is listed to the user but cannot be used to start instances.
Return type: Profile
Returns: Profile
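Several profile parameters take multi-line strings: node_selector and the custom pod labels and annotations use ‘key: value’ lines, while env uses ‘NAME=value’ lines. The following is a minimal sketch that builds such strings from dictionaries. The helper names and the sample keys are hypothetical and not part of h2osteam.

```python
def to_label_lines(labels):
    """Render a dict as 'key: value' lines, one pair per line."""
    return "\n".join("%s: %s" % (key, value) for key, value in labels.items())


def to_env_lines(variables):
    """Render a dict as 'NAME=value' lines, one variable per line."""
    return "\n".join("%s=%s" % (name, value) for name, value in variables.items())


# Sample values chosen for illustration only.
node_selector = to_label_lines({"disktype": "ssd"})
env = to_env_lines({"TZ": "UTC", "LOG_LEVEL": "info"})
```

The resulting strings can be passed as the node_selector and env arguments of either profile builder.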

static create_admin_account(steam_url, name, password, verify=False)¶

Creates a unique local admin account. Any error during account creation is printed out.

Parameters

steam_url (str) – the base URL of the Steam instance without a trailing slash
name (str) – local administrator account name
password (str) – local administrator password
verify (bool) – (Optional) defaults to False, verify server’s TLS certificate

Return type

Tuple[int, str]

Returns

Tuple[int, str]: response return code, response body as a text
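The returned tuple can be inspected before continuing with further setup. The following is a minimal sketch, assuming the first element is an HTTP status code so that any 2xx value means success; this reference only calls it a “response return code”. The helper name summarize_admin_creation is hypothetical.

```python
def summarize_admin_creation(result):
    """Turn the (code, body) tuple into a one-line summary.

    Assumption: the code is an HTTP status code and 2xx means success.
    """
    code, body = result
    outcome = "created" if 200 <= code < 300 else "failed"
    return "%s (code %d): %s" % (outcome, code, body.strip())
```

The result of create_admin_account can be passed to this helper directly.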

create_or_update_profile(p)¶

Creates or updates existing profile with given profile. Existing profile is matched by the profile name.

Parameters: p (Profile) – Profile, use build_kubernetes_dai_profile or build_kubernetes_h2o_profile to create a Profile
Return type: None
Returns: None

download_dai_usage_statistics(path)¶

Downloads DriverlessAI usage statistics (in a CSV format).

Parameters: path (str) – Specify path for the downloaded file. Example: /home/user/report.csv
Return type: None

download_h2o_usage_statistics(path)¶

Downloads H2O usage statistics (in a CSV format).

Parameters: path (str) – Specify path for the downloaded file. Example: /home/user/report.csv
Return type: None

enable_dai_kubernetes(license_text, enabled=True, storage_directory='', backend_type='', oidc_auth=False, enable_triton=False)¶

Enables Driverless AI product deployment on Kubernetes backend. Kubernetes backend must be configured first using set_kubernetes_config function.

Parameters

license_text (str) – text representation of a valid Driverless AI license
enabled – Optional. Enable Driverless AI. Defaults to True.
storage_directory – Optional. Deprecated. Defaults to “” .
backend_type – Optional. Deprecated. Defaults to “” .
oidc_auth – Optional. Enable new OIDC authentication available from 1.10.3 version. Defaults to False.
enable_triton – Optional. Enable Triton inference server. Defaults to False.


enable_h2o_kubernetes()¶

Enables H2O product deployment on Kubernetes backend. Kubernetes backend must be configured first using set_kubernetes_config function.


internal_api()¶

Return type: TypedSteamConnection
Returns: Steam internal API. Defined methods are generated and subject to a possible change in any future release.

static profile_value(values)¶

Helper function to create a ProfileValue.

Parameters: values (Tuple[int, int, int]) – Tuple of 3 values: minimal, maximal and initial profile value
Return type: ProfileValue
Returns: ProfileValue

static profile_value_with_limit(values)¶

Helper function to create a ProfileValue with defined profile maximum limit.

Parameters: values (Tuple[int, int, int, int]) – Tuple of 4 values: minimal, maximal, initial and maximum profile limit of profile value
Return type: ProfileValue
Returns: ProfileValue

set_kubernetes_config(storage_class, namespace='', gpu_resource_name='nvidia.com/gpu', allow_volume_expansion=False, fallback_uid=0, fallback_gid=0, force_fallback=False, extra_load_balancer_annotations='', rwm_storage_class='', seccomp_profile_runtime_default=True)¶

Sets kubernetes configuration for Steam

Parameters

storage_class (str) – Name of the StorageClass object that manages provisioning of PersistentVolumes.
namespace (str) – Kubernetes namespace where all Steam objects live.
gpu_resource_name (str) – Resource name for GPU.
allow_volume_expansion (bool) – Allow expansion of user PersistentVolumes. Must be supported by the StorageClass.
fallback_uid (int) – Pods will be started using this fallback UID when FORCE FALLBACK UID/GID option is used.
fallback_gid (int) – Pods will be started using this fallback GID when FORCE FALLBACK UID/GID option is used.
force_fallback (bool) – Fallback UID/GID will be used overriding UID/GID received from authentication provider.
extra_load_balancer_annotations (str) – Extra LoadBalancer annotations in YAML format, one per line, key and value separated by ‘:’.
rwm_storage_class (str) – Name of the StorageClass object that manages provisioning of ReadWriteMany PersistentVolumes.
seccomp_profile_runtime_default (bool) – Use the RuntimeDefault seccomp profile.

Return type

None


set_oidc_auth_config(issuer, client_id, client_secret, steam_url, scopes='openid,offline_access,profile,email', userinfo_username_key='preferred_username', userinfo_email_key='email', userinfo_roles_key='groups', userinfo_uid_key='', userinfo_gid_key='', enable_logout_id_token_hint=True, acr_values='')¶

Sets default authentication method to OIDC with specified configuration parameters.

Parameters

issuer (str) – URL of the OpenID Provider server (ex: https://oidp.ourdomain.com)
client_id (str) – client ID registered with OpenID provider
client_secret (str) – secret associated with the client_id
steam_url (str) – the base URL of the Steam instance without a trailing slash
scopes (str) – Comma-separated list of scopes of user information Enterprise Steam will request from the OpenID provider.
userinfo_username_key (str) – Key that specifies username attribute from userinfo data (ex: preferred_username). Supports nesting (ex: realm1.key).
userinfo_email_key (str) – Key that specifies email attribute from userinfo data (ex: email). Supports nesting (ex: realm1.key).
userinfo_roles_key (str) – Key that specifies roles attribute from userinfo data (ex: roles). Supports nesting (ex: realm1.key).
userinfo_uid_key (str) – Key that specifies UNIX uid attribute from userinfo data. Supports nesting (ex: realm1.key).
userinfo_gid_key (str) – Key that specifies UNIX gid attribute from userinfo data. Supports nesting (ex: realm1.key).
enable_logout_id_token_hint (bool) – Indicates whether id_token_hint should be passed in a logout URL parameter.
acr_values (str) – Comma-separated list of allowed authentication context classes.

Return type

None
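Two of the arguments above have format requirements: steam_url must not end with a slash and scopes is a comma-separated string. The following is a minimal sketch that normalizes both before the call. The helper name oidc_arguments is hypothetical and the Steam URL is a placeholder.

```python
def oidc_arguments(issuer, steam_url, scopes):
    """Normalize values before passing them to set_oidc_auth_config.

    Hypothetical helper: strips trailing slashes from steam_url and joins
    a list of scopes into the comma-separated string the method expects.
    """
    return {
        "issuer": issuer,
        "steam_url": steam_url.rstrip("/"),
        "scopes": ",".join(scope.strip() for scope in scopes),
    }


kwargs = oidc_arguments(
    issuer="https://oidp.ourdomain.com",
    steam_url="https://steam.example.com:9555/",  # placeholder URL
    scopes=["openid", "offline_access", "profile", "email"],
)
```

The dictionary can be unpacked into set_oidc_auth_config together with client_id and client_secret.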


set_security_config(tls_cert_path='', tls_key_path='', server_strict_transport='max-age=631138519', server_x_xss_protection='0', server_content_security_policy="style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; font-src 'self' https://fonts.gstatic.com data:;", session_duration_min=4320, personal_access_token_duration_hours=8760, web_ui_timeout_min=480, disable_admin=False, global_url_prefix='', secure_cookie=True, support_email='support@h2o.ai', hide_errors=False)¶

Sets security configuration for Steam

Parameters

tls_cert_path – Path to the server TLS certificate.
tls_key_path – Path to the server TLS key.
server_strict_transport – Value of the Strict-Transport-Security header in the server responses.
server_x_xss_protection – Value of the X-XSS-Protection header in the server responses.
server_content_security_policy – Value of the Content-Security-Policy header in the server responses.
session_duration_min – The lifespan of Steam issued cookie/JWT token.
personal_access_token_duration_hours – The lifespan of Steam issued personal access token.
web_ui_timeout_min – Users will be automatically logged out when reaching the idle timeout.
disable_admin – Disable the initial administrator account.
global_url_prefix – Global URL prefix for the Enterprise Steam.
secure_cookie – Set Secure cookie flag.
support_email – Change the target of Support email address.
hide_errors – If enabled, authentication errors will default to “forbidden” to hide configuration details.

Return type

None
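The default durations in the signature use different units (minutes and hours). As a small worked conversion, the sketch below expresses the three defaults as timedelta values so they can be compared directly; only the default numbers are taken from the signature above.

```python
from datetime import timedelta

# Defaults copied from the set_security_config signature.
session = timedelta(minutes=4320)    # session_duration_min
token = timedelta(hours=8760)        # personal_access_token_duration_hours
ui_timeout = timedelta(minutes=480)  # web_ui_timeout_min

print(session.days, token.days, ui_timeout.total_seconds() / 3600)
# prints: 3 365 8.0
```

In other words, the defaults correspond to a 3-day session, a 365-day personal access token, and an 8-hour idle timeout in the web UI.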


h2osteam.clients.driverless¶

`driverless_client`¶

class h2osteam.clients.driverless.driverless_client.DriverlessClient(steam=None)¶

static connect(api, id, use_h2oai_client=False, use_own_client=False, backend_version_override=None)¶

static get_instance(name=None, created_by='')¶

Get existing Driverless AI instance.

The use of this static method is DEPRECATED in favour of DriverlessClient().get_instance() and will be removed in v1.9

Parameters

name – Name of the Driverless AI instance.
created_by – Name of the user that started the DAI instance.

Returns

Driverless AI instance as a DriverlessInstance object.

Examples

>>> import h2osteam
>>> from h2osteam.clients import DriverlessClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> instance = DriverlessClient().get_instance(name="test-instance")

get_instances()¶

Get a list of all Driverless AI instances that this user has permission to view.

Returns: List of DriverlessInstance objects.
Examples

>>> import h2osteam
>>> from h2osteam.clients import DriverlessClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here")
>>> instances = DriverlessClient().get_instances()

static launch_instance(name=None, version=None, profile_name=None, cpu_count=None, gpu_count=None, memory_gb=None, storage_gb=None, max_idle_h=None, max_uptime_h=None, timeout_s=None, config_toml_override=None, sync=True, volumes='')¶

Launch new Driverless AI instance.

The use of this static method is DEPRECATED in favour of DriverlessClient().launch_instance() and will be removed in v1.9

Launches new Driverless AI instance using the parameters described below. You do not need to specify all parameters. In that case they will be filled based on the default values of the selected profile. The process of launching an instance can take up to 10 minutes.

Parameters

name – Name of the Driverless AI instance.
version – Version of Driverless AI.
profile_name – (Optional) Specify name of an existing profile that will be used for this cluster.
cpu_count – (Optional) Number of CPUs (threads or virtual CPUs).
gpu_count – (Optional) Number of GPUs.
memory_gb – (Optional) Amount of memory in GB.
storage_gb – (Optional) Amount of storage in GB.
max_idle_h – (Optional) Maximum amount of time in hours the Driverless AI instance can be idle before shutting down.
max_uptime_h – (Optional) Maximum amount of time in hours the Driverless AI instance will be up before shutting down.
timeout_s – (Optional) Maximum amount of time in seconds to wait for the Driverless AI instance to start.
config_toml_override – (Optional) Enter additional Driverless AI configuration in TOML format that will be applied over the standard config.toml. Only available when permitted by selected profile. Override is limited to parameters allowed by profile.
volumes – (Optional) Specify unbound volumes to mount with this instance.
sync – Whether the call will block until the instance has finished launching. Otherwise use the wait() method.

Examples

>>> import h2osteam
>>> from h2osteam.clients import DriverlessClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> instance = DriverlessClient().launch_instance(name="test-instance",
...                                               version="1.8.6.1",
...                                               profile_name="default-driverless-kubernetes",
...                                               gpu_count=1, memory_gb=32)

`driverless_instance`¶

class h2osteam.clients.driverless.driverless_instance.DriverlessInstance(instance_id, api)¶

connect(use_h2oai_client=False, use_own_client=False, backend_version_override=None)¶

Connect to the running Driverless AI instance using the Python client.

Parameters

use_h2oai_client – DEPRECATED! Set to True to use the deprecated h2oai_client instead of the new driverlessai client.
use_own_client – Set to True to use your own driverlessai client instead of the one provided by Steam.
backend_version_override – (Optional) Version of the client backend to use; overrides Driverless AI server version detection. Specify “latest” to get the most recent backend supported. In most cases the user should rely on Driverless AI server version detection and leave this as the default None.

Returns: driverlessai.Client class or h2oai_client.Client class.
Examples

>>> import h2osteam
>>> from h2osteam.clients import DriverlessClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> instance = DriverlessClient().get_instance(name="test-instance")
>>> client = instance.connect()

details()¶

Get details of the Driverless AI instance.

Returns: Driverless AI instance details.
Examples

>>> import h2osteam
>>> from h2osteam.clients import DriverlessClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> instance = DriverlessClient().get_instance(name="test-instance")
>>> instance.details()

download_logs(path=None)¶

Download ZIP archive of the Driverless AI instance logs.

Parameters: path – Path where the Driverless AI logs archive will be saved.
Examples

>>> import h2osteam
>>> from h2osteam.clients import DriverlessClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> instance = DriverlessClient().get_instance(name="test-instance")
>>> instance.download_logs(path="/tmp/test-instance-logs.zip")

openid_login_url()¶

Returns a URL that is only valid with OpenID authentication and redirects to the Driverless AI instance.

start(cpu_count=None, gpu_count=None, memory_gb=None, storage_gb=None, max_idle_h=None, max_uptime_h=None, timeout_s=None, config_toml_override=None, sync=True)¶

Start stopped Driverless AI instance and optionally change the parameters of the instance. Unchanged parameters will stay the same.

Parameters

cpu_count – (Optional) Number of CPUs (threads or virtual CPUs).
gpu_count – (Optional) Number of GPUs.
memory_gb – (Optional) Amount of memory in GB.
storage_gb – (Optional) Amount of storage in GB.
max_idle_h – (Optional) Maximum amount of time in hours the Driverless AI instance can be idle before shutting down.
max_uptime_h – (Optional) Maximum amount of time in hours the Driverless AI instance will be up before shutting down.
timeout_s – (Optional) Maximum amount of time in seconds to wait for the Driverless AI instance to start.
config_toml_override – (Optional) Enter additional Driverless AI configuration in TOML format that will be applied over the standard config.toml. Only available when permitted by selected profile. Override is limited to parameters allowed by profile.
sync – Whether the call will block until the instance has finished launching. Otherwise use the wait() method.

Examples

>>> import h2osteam
>>> from h2osteam.clients import DriverlessClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> instance = DriverlessClient().get_instance(name="test-instance")
>>> instance.start(memory_gb=64)

status()¶

Get status of Driverless AI instance.

Returns: Driverless AI instance status as string.
Examples

>>> import h2osteam
>>> from h2osteam.clients import DriverlessClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> instance = DriverlessClient().get_instance(name="test-instance")
>>> instance.status()
>>> # running
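When an operation is started with sync=False, the status string can be polled until the instance reaches the desired state. The following is a minimal sketch of such a loop, offered as an alternative to the built-in wait() method. The helper name wait_for_status is hypothetical, and “running” is the only status value shown in this reference.

```python
import time


def wait_for_status(instance, target="running", timeout_s=600, poll_s=5):
    """Poll instance.status() until it equals target or the timeout passes.

    Hypothetical helper; works with any object exposing a status() method.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        current = instance.status()
        if current == target:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError("instance is %r, expected %r" % (current, target))
        time.sleep(poll_s)
```

A call such as wait_for_status(instance, "running") returns once status() reports the target value and raises TimeoutError otherwise.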

stop(sync=True)¶

Stop running Driverless AI instance.

Parameters: sync – Whether the call will block until the operation has been finished. Otherwise use the wait() method.
Examples

>>> import h2osteam
>>> from h2osteam.clients import DriverlessClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> instance = DriverlessClient().get_instance(name="test-instance")
>>> instance.stop()

terminate(sync=False)¶

Terminate stopped Driverless AI instance.

Parameters: sync – Whether the call will block until the operation has been finished. Otherwise use the wait() method.
Examples

>>> import h2osteam
>>> from h2osteam.clients import DriverlessClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> instance = DriverlessClient().get_instance(name="test-instance")
>>> instance.terminate()

upgrade_version(version)¶

Upgrades version of a non-running instance to a given version.

Parameters: version – Version to be upgraded to.
Examples

>>> import h2osteam
>>> from h2osteam.clients import DriverlessClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> instance = DriverlessClient().get_instance(name="test-instance")
>>> instance.stop() # Skip this step if instance is not running.
>>> instance.upgrade_version("1.10.3")
>>> instance.start()
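The stop, upgrade, and start steps can be wrapped in one function that skips the stop when the instance is not running. This is a minimal sketch, assuming status() returns the string “running” for a running instance, as in the status() example. The helper name upgrade_instance is hypothetical.

```python
def upgrade_instance(instance, version):
    """Stop the instance if needed, upgrade it, then start it again.

    Assumption: status() returns "running" for a running instance.
    """
    if instance.status() == "running":
        instance.stop()  # sync=True by default, so this blocks until stopped
    instance.upgrade_version(version)
    instance.start()  # sync=True by default, so this blocks until started
```

With an instance obtained from DriverlessClient().get_instance(), the call would be upgrade_instance(instance, "1.10.3").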

wait()¶

Wait for the Driverless AI instance to reach its target status.

`multinode_client`¶

class h2osteam.clients.driverless.multinode_client.MultinodeClient(steam=None)¶

static get_cluster(name)¶: DEPRECATED! Get existing Driverless AI multinode cluster.

static get_clusters()¶: DEPRECATED! Get all existing Driverless AI multinode clusters.

static launch_cluster(name=None, version=None, profile_name=None, master_cpu_count=None, master_gpu_count=None, master_memory_gb=None, master_storage_gb=None, worker_count=None, worker_cpu_count=None, worker_gpu_count=None, worker_memory_gb=None, worker_storage_gb=None, autoscaling_enabled=False, autoscaling_min_workers=None, autoscaling_max_workers=None, autoscaling_buffer=None, autoscaling_downscale_delay_seconds=None, timeout_s=600)¶: DEPRECATED! Launch new Driverless AI multinode cluster.

class h2osteam.clients.driverless.multinode_client.MultinodeCluster(name, m=None, api=None)¶

connect(use_own_client=False, backend_version_override=None)¶

Connect to the running Driverless AI multinode cluster using the Python client.

Parameters

use_own_client – Set to True to use your own driverlessai client instead of the one provided by Steam.
backend_version_override – (Optional) Version of the client backend to use; overrides Driverless AI server version detection. Specify “latest” to get the most recent backend supported. In most cases the user should rely on Driverless AI server version detection and leave this as the default None.

Returns: driverlessai.Client class.
Examples

>>> import h2osteam
>>> from h2osteam.clients import MultinodeClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = MultinodeClient().get_cluster("dai-multinode-1")
>>> client = cluster.connect()

external_address()¶: Get address of the Driverless AI master node. This address is reverse-proxied by Enterprise Steam and is accessible from outside the Kubernetes cluster.

get_events()¶: Get events of the Driverless AI multinode cluster. Can be used for debugging purposes.

internal_address()¶: Get address of the Driverless AI master node. This address is accessible only inside the Kubernetes cluster.

is_master_ready()¶: Check whether the master node of the multinode cluster is ready and can be connected to.

openid_login_url()¶: Returns a URL that is only valid with OpenID authentication and redirects to the Driverless AI multinode master.

refresh()¶

restart()¶

Restart failed Driverless AI multinode cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import MultinodeClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = MultinodeClient().get_cluster("dai-multinode-1")
>>> cluster.restart()

terminate()¶

Terminate Driverless AI multinode cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import MultinodeClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = MultinodeClient().get_cluster("dai-multinode-1")
>>> cluster.terminate()

wait()¶

Waits for Driverless AI multinode cluster to reach target status.

Examples

>>> import h2osteam
>>> from h2osteam.clients import MultinodeClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = MultinodeClient().launch_cluster(name="dai-multinode-1", ...)
>>> cluster.wait()
>>> cluster.connect()

h2osteam.clients.h2o¶

`h2o_client`¶

class h2osteam.clients.h2o.h2o_client.H2oClient¶

static get_cluster(name)¶

Get an existing H2O cluster.

Parameters: name – Name of the cluster.
Returns: H2O cluster as an H2oCluster object.
Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> H2oClient.get_cluster("test-cluster")

static get_clusters()¶

Get all H2O clusters available to this user.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> H2oClient.get_clusters()

static launch_cluster(name=None, version=None, dataset_size_gb=None, dataset_dimension=None, using_xgboost=False, profile_name=None, nodes=None, node_cpus=None, yarn_vcores=None, node_memory_gb=None, extra_memory_percent=None, max_idle_h=None, max_uptime_h=None, timeout_s=None, yarn_queue='', leader_node_id=0, save_cluster_data=False)¶

Launch a new H2O cluster.

Launches a new H2O cluster using the parameters described below. You do not need to specify all parameters; any parameter you omit is filled in from the default values of the selected profile. The process of launching a cluster can take up to 5 minutes.

Parameters

name – Name of the new cluster.
version – Version of H2O that will be used in the cluster.
dataset_size_gb – (Optional) Specify size of your uncompressed dataset. For compressed data source, use dataset_dimension parameter. Cluster parameters will be preset to accommodate your dataset within selected profile limits. Does not override user-specified values.
dataset_dimension – (Optional) Tuple of (n_rows, n_cols) representing an estimation of dataset dimensions. Use this parameter when you intend to use compressed data source like Parquet format. Cluster parameters will be preset to accommodate your dataset within selected profile limits. Does not override user-specified values.
using_xgboost – (Optional) Set boolean value to indicate whether you want to use XGBoost on your cluster. extra_memory_percent parameter will be set accordingly. Does not override user-specified value.
profile_name – (Optional) Specify name of an existing profile that will be used for this cluster.
nodes – (Optional) Number of nodes of the H2O cluster.
node_cpus – (Optional) Number of CPUs/threads used by H2O on a single node. Specify ‘0’ to use all available CPUs/threads.
yarn_vcores – (Optional) Number of YARN virtual cores per cluster node. Should match node_cpus.
node_memory_gb – (Optional) Amount of memory in GB allocated for a single H2O node.
extra_memory_percent – (Optional) Percentage of extra memory that will be allocated outside of H2O JVM for algos like XGBoost.
max_idle_h – (Optional) Maximum amount of time in hours the cluster can be idle before shutting down.
max_uptime_h – (Optional) Maximum amount of time in hours the cluster will be up before shutting down.
timeout_s – (Optional) Maximum amount of time in seconds to wait for the H2O cluster to start.
yarn_queue – (Optional) Name of the YARN queue where the cluster will be placed.
leader_node_id – (Optional) ID of the H2O leader node.
save_cluster_data – (Optional) Set boolean value to indicate whether you want to save cluster data (default False). Cluster data will be saved when the cluster reaches its uptime or idle time limit. Such cluster can be restarted with saved data automatically loaded.

Returns: H2O cluster as an H2oCluster object.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> H2oClient.launch_cluster(name="test-cluster", version="3.28.0.2", dataset_size_gb=2)
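
Because omitted parameters fall back to the profile defaults, it can be convenient to collect only the values you actually want to set. The helper below is a hypothetical sketch (`build_launch_kwargs` is not part of h2osteam); it drops unset values and checks that dataset_dimension is a (n_rows, n_cols) pair of positive numbers.

```python
def build_launch_kwargs(name, version, dataset_size_gb=None,
                        dataset_dimension=None, **overrides):
    """Collect launch_cluster keyword arguments, omitting unset values."""
    if dataset_dimension is not None:
        n_rows, n_cols = dataset_dimension  # must unpack into two values
        if n_rows <= 0 or n_cols <= 0:
            raise ValueError("dataset_dimension values must be positive")
    kwargs = {
        "name": name,
        "version": version,
        "dataset_size_gb": dataset_size_gb,
        "dataset_dimension": dataset_dimension,
    }
    kwargs.update(overrides)
    # Drop unset values so that the profile defaults apply.
    return {key: value for key, value in kwargs.items() if value is not None}


kwargs = build_launch_kwargs("test-cluster", "3.28.0.2",
                             dataset_dimension=(10000, 500), nodes=4)
print(kwargs)
# With a live Steam session: H2oClient.launch_cluster(**kwargs)
```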

`h2o_cluster`¶

class h2osteam.clients.h2o.h2o_cluster.H2oCluster(cluster_id=None)¶

connect()¶

Connect to the H2O cluster using the H2O Python client.

Examples

>>> import h2o
>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.connect()

delete()¶

Delete stopped H2O cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.delete()

download_logs(path=None)¶

Download logs of the H2O cluster.

Parameters: path – Path where the H2O logs will be saved.
Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.download_logs("/tmp/test-cluster-logs")

get_config()¶

Get connection config of the H2O cluster.

Get connection config of the H2O cluster that can be used as a parameter to h2o.connect. Use only if H2oCluster.connect() does not work for you.

Examples

>>> import h2o
>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> h2o.connect(config=cluster.get_config())

start(dataset_size_gb=None, dataset_dimension=None, using_xgboost=False, profile_name=None, nodes=None, node_cpus=None, yarn_vcores=None, node_memory_gb=None, extra_memory_percent=None, max_idle_h=None, max_uptime_h=None, timeout_s=None, yarn_queue='', leader_node_id=0, save_cluster_data=False, fault_tolerant_grid_search=False)¶

Start saved H2O cluster.

Starts a saved H2O cluster using the parameters described below. You do not need to provide any parameters: unless overridden, all launch parameters are copied from the stopped cluster, except save_cluster_data, which defaults to False. The process of starting a cluster can take up to 5 minutes.

Parameters

dataset_size_gb – (Optional) Specify size of your uncompressed dataset. For compressed data source, use dataset_dimension parameter. Cluster parameters will be preset to accommodate your dataset within selected profile limits. Does not override user-specified values.
dataset_dimension – (Optional) Tuple of (n_rows, n_cols) representing an estimation of dataset dimensions. Use this parameter when you intend to use compressed data source like Parquet format. Cluster parameters will be preset to accommodate your dataset within selected profile limits. Does not override user-specified values.
using_xgboost – (Optional) Set boolean value to indicate whether you want to use XGBoost on your cluster. extra_memory_percent parameter will be set accordingly. Does not override user-specified value.
profile_name – (Optional) Specify name of an existing profile that will be used for this cluster.
nodes – (Optional) Number of nodes of the H2O cluster.
node_cpus – (Optional) Number of CPUs/threads used by H2O on a single node. Specify ‘0’ to use all available CPUs/threads.
yarn_vcores – (Optional) Number of YARN virtual cores per cluster node. Should match node_cpus.
node_memory_gb – (Optional) Amount of memory in GB allocated for a single H2O node.
extra_memory_percent – (Optional) Percentage of extra memory that will be allocated outside of H2O JVM for algos like XGBoost.
max_idle_h – (Optional) Maximum amount of time in hours the cluster can be idle before shutting down.
max_uptime_h – (Optional) Maximum amount of time in hours the cluster will be up before shutting down.
timeout_s – (Optional) Maximum amount of time in seconds to wait for the H2O cluster to start.
yarn_queue – (Optional) Name of the YARN queue where the cluster will be placed.
leader_node_id – (Optional) ID of the H2O leader node.
save_cluster_data – (Optional) Set boolean value to indicate whether you want to save cluster data (default False). Cluster data will be saved when the cluster reaches its uptime or idle time limit. Such cluster can be restarted with saved data automatically loaded.
fault_tolerant_grid_search – (Optional) Set boolean value to indicate whether you want to use Grid Search in a fault tolerant mode (default False). In this mode, when the cluster fails while training a Grid Search model, it will attempt to restart itself and continue training. Reaching idle or uptime limit is not considered a failure.

Returns: H2O cluster as an H2oCluster object.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster(name="test-cluster")
>>> cluster.stop(save_cluster_data=True)
>>> cluster.start(dataset_dimension=(10000,500), using_xgboost=True)

status()¶

Get status of the H2O cluster.

Returns: H2O cluster status as a string.
Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.status()
>>> # running

stop(save_cluster_data=False)¶

Stop a running H2O cluster.

Parameters: save_cluster_data – (Optional) Set boolean value to indicate whether you want to save cluster data. Such cluster can be restarted with saved data automatically loaded.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oClient.get_cluster("test-cluster")
>>> cluster.stop(save_cluster_data=True)

wait()¶: Wait for H2O cluster to finish launching.

h2osteam.clients.h2ok8s¶

class h2osteam.clients.h2ok8s.h2ok8s.H2oKubernetesClient(steam=None)¶

static get_cluster(name, created_by='')¶

Get existing H2O cluster.

The use of this static method is DEPRECATED in favour of H2oKubernetesClient().get_cluster() and will be removed in v1.9

Parameters

name – Name of the H2O cluster.
created_by – Name of the user that started the H2O cluster.

Returns

H2O cluster as an H2oKubernetesCluster object.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oKubernetesClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oKubernetesClient().get_cluster("h2o-1")

static get_clusters()¶

Get all existing H2O clusters.

The use of this static method is DEPRECATED in favour of H2oKubernetesClient().get_clusters() and will be removed in v1.9

Returns: List of H2oKubernetesCluster objects.
Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oKubernetesClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> clusters = H2oKubernetesClient().get_clusters()

static launch_cluster(name=None, profile_name=None, version=None, dataset_size_gb=None, dataset_dimension=None, node_count=None, cpu_count=None, gpu_count=None, memory_gb=None, max_uptime_h=None, max_idle_h=None, timeout_s=None, volumes='')¶

Launch new H2O cluster on Kubernetes.

The use of this static method is DEPRECATED in favour of H2oKubernetesClient().launch_cluster() and will be removed in v1.9

Parameters

name – Name of the H2O cluster.
profile_name – Specify name of an existing profile that will be used for this cluster.
version – Version of H2O.
dataset_size_gb – (Optional) Specify size of your uncompressed dataset. For compressed data source, use dataset_dimension parameter. Cluster parameters will be preset to accommodate your dataset within selected profile limits. Does not override user-specified values.
dataset_dimension – (Optional) Tuple of (n_rows, n_cols) representing an estimation of dataset dimensions. Use this parameter when you intend to use compressed data source like Parquet format. Cluster parameters will be preset to accommodate your dataset within selected profile limits. Does not override user-specified values.
node_count – Number of nodes.
cpu_count – Number of CPUs (threads or virtual CPUs) per node.
gpu_count – Number of GPUs per node.
memory_gb – Amount of memory in GB per node.
max_uptime_h – (Optional) Maximum amount of time in hours the cluster will be up before shutting down.
max_idle_h – (Optional) Maximum amount of time in hours the cluster can be idle before shutting down.
timeout_s – (Optional) Maximum amount of time in seconds to wait for the H2O cluster to start.
volumes – (Optional) Specify unbound volumes to mount with this instance.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oKubernetesClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oKubernetesClient().launch_cluster(name="h2o-1",
>>>                                                version="3.32.0.1",
>>>                                                node_count=4,
>>>                                                cpu_count=1,
>>>                                                gpu_count=0,
>>>                                                memory_gb=16,
>>>                                                max_idle_h=8,
>>>                                                max_uptime_h=240,
>>>                                                timeout_s=600)

class h2osteam.clients.h2ok8s.h2ok8s.H2oKubernetesCluster(name, c=None, api=None, created_by='')¶

connect()¶

Connects to the H2O cluster using the H2O Python client.

Examples

>>> import h2o
>>> import h2osteam
>>> from h2osteam.clients import H2oKubernetesClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oKubernetesClient().get_cluster("test-cluster")
>>> cluster.connect()

download_logs(path=None)¶

Download logs of the H2O cluster.

Parameters: path – Path where the H2O cluster logs will be saved.
Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oKubernetesClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oKubernetesClient().get_cluster(name="test-cluster")
>>> cluster.download_logs(path="/tmp/test-cluster-logs")

fail()¶

Marks the H2O cluster as failed. Use only when cluster is stuck and cannot be terminated using the terminate function.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oKubernetesClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oKubernetesClient().get_cluster("h2o-1")
>>> cluster.fail()

get_connection_config()¶

Get connection config of the H2O cluster.

It is used as a parameter to h2o.connect(). Consider using H2oKubernetesCluster.connect().

Examples

>>> import h2o
>>> import h2osteam
>>> from h2osteam.clients import H2oKubernetesClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oKubernetesClient().get_cluster("test-cluster")
>>> h2o.connect(config=cluster.get_connection_config())

get_events()¶: Get events of the H2O cluster. Can be used for debugging purposes.

is_failed()¶: Check whether the H2O cluster has failed.

is_running()¶: Check whether the H2O cluster is running and can be connected to.

refresh()¶: Refreshes the cluster information.

stop()¶

Stops H2O cluster on Kubernetes.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oKubernetesClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oKubernetesClient().get_cluster("h2o-1")
>>> cluster.stop()

terminate()¶

Stops and deletes H2O cluster on Kubernetes.

Examples

>>> import h2osteam
>>> from h2osteam.clients import H2oKubernetesClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = H2oKubernetesClient().get_cluster("h2o-1")
>>> cluster.terminate()

wait()¶: Blocks until the current status has reached the target status.

h2osteam.clients.sparkling¶

`sparkling_client`¶

class h2osteam.clients.sparkling.sparkling_client.SparklingClient¶

static get_cluster(name=None)¶

Get an existing Sparkling Water cluster.

Parameters: name – Name of the cluster.
Returns: Sparkling Water cluster as a SparklingSession object.
Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> SparklingClient.get_cluster("test-cluster")

static get_clusters()¶

Get all Sparkling Water clusters available to this user.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> SparklingClient.get_clusters()

static launch_sparkling_cluster(name=None, version=None, profile_name=None, dataset_size_gb=None, dataset_dimension=None, using_xgboost=False, python_environment_name=None, driver_cores=None, driver_memory_gb=None, executors=None, executor_cores=None, executor_memory_gb=None, h2o_nodes=None, h2o_node_memory_gb=None, h2o_node_cpus=None, h2o_extra_memory_percent=None, max_idle_h=None, max_uptime_h=None, timeout_s=None, yarn_queue='', spark_properties=None, save_cluster_data=False)¶

Launch a new Sparkling Water cluster.

Launches a new Sparkling Water cluster using the parameters described below. You do not need to specify all parameters; any parameter you omit is filled in from the default values of the selected profile. The process of launching a cluster can take up to 5 minutes.

Parameters

name – Name of the cluster.
version – Version of Sparkling Water that will be used in the cluster.
dataset_size_gb – (Optional) Specify size of your uncompressed dataset. For compressed data source, use dataset_dimension parameter. Cluster parameters will be preset to accommodate your dataset within selected profile limits. Does not override user-specified values.
dataset_dimension – (Optional) Tuple of (n_rows, n_cols) representing an estimation of dataset dimensions. Use this parameter when you intend to use compressed data source like Parquet format. Cluster parameters will be preset to accommodate your dataset within selected profile limits. Does not override user-specified values.
using_xgboost – (Optional) Set boolean value to indicate whether you want to use XGBoost on your cluster. extra_memory_percent parameter will be set accordingly. Does not override user-specified value.
profile_name – (Optional) Name of the profile for the cluster.
python_environment_name – (Optional) Specify the Python environment name you want to use.
driver_cores – (Optional) Number of Spark driver cores.
driver_memory_gb – (Optional) Amount of Spark driver memory in GB.
executors – (Optional) Number of Spark executors.
executor_cores – (Optional) Number of Spark executor cores.
executor_memory_gb – (Optional) Amount of Spark executor memory in GB.
h2o_nodes – (Optional) Specify the number of H2O nodes for the cluster.
h2o_node_memory_gb – (Optional) Specify the amount of memory that should be available on each H2O node.
h2o_node_cpus – (Optional) Number of CPUs/threads used by H2O on a single node. Specify ‘0’ to use all available CPUs/threads.
h2o_extra_memory_percent – (Optional) Specify the amount of extra memory for internal JVM use outside of the Java heap.
max_idle_h – (Optional) Maximum amount of time in hours the cluster can be idle before shutting down.
max_uptime_h – (Optional) Maximum amount of time in hours the cluster will be up before shutting down.
timeout_s – (Optional) Maximum amount of time in seconds to wait for the H2O cluster to start.
spark_properties – (Optional) Specify additional spark properties as a Python dictionary.
yarn_queue – (Optional) Name of the YARN queue where the cluster will be placed.
save_cluster_data – (Optional) Set boolean value to indicate whether you want to save cluster data (default False). Cluster data will be saved when the cluster reaches its uptime or idle time limit. Such cluster can be restarted with saved data automatically loaded.

Returns: Sparkling cluster as a SparklingSession object.
Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> SparklingClient.launch_sparkling_cluster(name="test-cluster", version="3.28.0.2", dataset_size_gb=2)
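
The spark_properties parameter takes a Python dictionary. The sketch below shows one way to prepare such a dictionary. It is hypothetical throughout: `normalize_spark_properties` is not part of h2osteam, and it assumes that Steam accepts property values as strings, which is how Spark itself stores configuration values.

```python
def normalize_spark_properties(properties):
    """Validate a spark_properties dict and convert its values to strings."""
    normalized = {}
    for key, value in properties.items():
        if not key.startswith("spark."):
            raise ValueError(f"unexpected Spark property name: {key!r}")
        # Booleans are lower-cased to match Spark's "true"/"false" spelling.
        if isinstance(value, bool):
            value = str(value).lower()
        normalized[key] = str(value)
    return normalized


props = normalize_spark_properties({
    "spark.sql.shuffle.partitions": 200,
    "spark.ui.enabled": False,
})
print(props)
# With a live Steam session:
# SparklingClient.launch_sparkling_cluster(name="test-cluster",
#                                          version="3.28.0.2",
#                                          spark_properties=props)
```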

`sparkling_cluster`¶

class h2osteam.clients.sparkling.sparkling_cluster.SparklingSession(cluster)¶

connect_h2o()¶

Connects to the underlying H2O cluster. Useful for generating an H2O Autodoc report.

Returns: H2o session object connected to the cluster.
Examples

>>> # Example code for generating an H2O Autodoc report using the connect_h2o() function
>>> import h2osteam
>>> import h2o
>>> from h2o_autodoc import Config
>>> from h2o_autodoc import render_autodoc
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> session = SparklingClient.get_cluster("test-cluster")
>>> h2o_cluster = session.connect_h2o()
>>> model = h2o_cluster.get_model("my_model")
>>> config = Config(output_path="report.docx")
>>> render_autodoc(h2o_cluster, config, model)

delete()¶

Delete stopped Sparkling Water cluster.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.delete()

download_logs(path=None)¶

Download logs of the Sparkling cluster.

Parameters: path – Path where the Sparkling logs will be saved.
Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.download_logs("/tmp/test-cluster-logs")

get_h2o_config()¶

Get connection config of the H2O cluster.

Get connection config of the H2O cluster that can be used as a parameter to h2o.connect.

Examples

>>> import h2o
>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> h2o.connect(config=cluster.get_h2o_config())

send_statement(statement=None)¶

Send a single statement to the remote spark session.

Parameters: statement – A string representation of statement for the Spark session.
Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.send_statement('f_crimes = h2o.import_file(path="../data/chicagoCrimes10k.csv", col_types=column_type)')
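
Because the statement is itself a string of code, quotes inside it must not clash with the quotes around it. One way to avoid hand-written nesting is to build the statement with repr(). The helper below is a hypothetical sketch; it assumes the remote session runs Python with h2o already imported, as the example above implies.

```python
def build_import_statement(variable, path):
    """Return Python source that imports a file into an H2O frame."""
    if not variable.isidentifier():
        raise ValueError(f"not a valid variable name: {variable!r}")
    # repr() adds the quotes and escapes, so the path is embedded safely.
    return f"{variable} = h2o.import_file(path={path!r})"


statement = build_import_statement("f_crimes", "../data/chicagoCrimes10k.csv")
print(statement)
# → f_crimes = h2o.import_file(path='../data/chicagoCrimes10k.csv')
# With a live Steam session: cluster.send_statement(statement)
```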

session()¶

Connect to the remote Spark session and issue commands.

Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.session()

start(profile_name=None, dataset_size_gb=None, dataset_dimension=None, using_xgboost=False, python_environment_name=None, driver_cores=None, driver_memory_gb=None, executors=None, executor_cores=None, executor_memory_gb=None, h2o_nodes=None, h2o_node_memory_gb=None, h2o_node_cpus=None, h2o_extra_memory_percent=None, max_idle_h=None, max_uptime_h=None, timeout_s=None, yarn_queue='', spark_properties=None, save_cluster_data=False)¶

Start saved Sparkling Water cluster.

Starts a saved Sparkling Water cluster using the parameters described below. You do not need to provide any parameters: unless overridden, all launch parameters are copied from the stopped cluster, except save_cluster_data, which defaults to False. The process of starting a cluster can take up to 5 minutes.

Parameters

dataset_size_gb – (Optional) Specify size of your uncompressed dataset. For compressed data source, use dataset_dimension parameter. Cluster parameters will be preset to accommodate your dataset within selected profile limits. Does not override user-specified values.
dataset_dimension – (Optional) Tuple of (n_rows, n_cols) representing an estimation of dataset dimensions. Use this parameter when you intend to use compressed data source like Parquet format. Cluster parameters will be preset to accommodate your dataset within selected profile limits. Does not override user-specified values.
using_xgboost – (Optional) Set boolean value to indicate whether you want to use XGBoost on your cluster. extra_memory_percent parameter will be set accordingly. Does not override user-specified value.
profile_name – (Optional) Specify name of an existing profile that will be used for this cluster.
python_environment_name – (Optional) Specify the Python environment name you want to use.
driver_cores – (Optional) Number of Spark driver cores.
driver_memory_gb – (Optional) Amount of Spark driver memory in GB.
executors – (Optional) Number of Spark executors.
executor_cores – (Optional) Number of Spark executor cores.
executor_memory_gb – (Optional) Amount of Spark executor memory in GB.
h2o_nodes – (Optional) Specify the number of H2O nodes for the cluster.
h2o_node_memory_gb – (Optional) Specify the amount of memory that should be available on each H2O node.
h2o_node_cpus – (Optional) Number of CPUs/threads used by H2O on a single node. Specify ‘0’ to use all available CPUs/threads.
h2o_extra_memory_percent – (Optional) Specify the amount of extra memory for internal JVM use outside of the Java heap.
max_idle_h – (Optional) Maximum amount of time in hours the cluster can be idle before shutting down.
max_uptime_h – (Optional) Maximum amount of time in hours the cluster will be up before shutting down.
timeout_s – (Optional) Maximum amount of time in seconds to wait for the H2O cluster to start.
yarn_queue – (Optional) Name of the YARN queue where the cluster will be placed.
spark_properties – (Optional) Specify additional spark properties as a Python dictionary.
save_cluster_data – (Optional) Set boolean value to indicate whether you want to save cluster data (default False). Cluster data will be saved when the cluster reaches its uptime or idle time limit. Such cluster can be restarted with saved data automatically loaded.

Returns: Sparkling cluster as a SparklingSession object.
Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster(name="test-cluster")
>>> cluster.stop(save_cluster_data=True)
>>> cluster.start(dataset_size_gb=500, using_xgboost=False, save_cluster_data=True)

status()¶

Get status of the Sparkling Water cluster.

Returns: Sparkling Water cluster status as a string.
Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.status()
>>> # running

stop(save_cluster_data=False)¶

Stop a running Sparkling Water cluster.

Parameters: save_cluster_data – (Optional) Set boolean value to indicate whether you want to save cluster data. Such cluster can be restarted with saved data automatically loaded.
Examples

>>> import h2osteam
>>> from h2osteam.clients import SparklingClient
>>> h2osteam.login(url="https://steam.h2o.ai:9555", username="user01", password="token-here", verify_ssl=True)
>>> cluster = SparklingClient.get_cluster("test-cluster")
>>> cluster.stop(save_cluster_data=True)

wait()¶: Wait for the Sparkling Water cluster to finish launching.

class h2osteam.clients.sparkling.sparkling_cluster.SparklingShell(session)¶

Instantiate a line-oriented interpreter framework.

The optional argument ‘completekey’ is the readline name of a completion key; it defaults to the Tab key. If completekey is not None and the readline module is available, command completion is done automatically. The optional arguments stdin and stdout specify alternate input and output file objects; if not specified, sys.stdin and sys.stdout are used.

onecmd(s)¶

Interpret the argument as though it had been typed in response to the prompt.

This may be overridden, but should not normally need to be; see the precmd() and postcmd() methods for useful execution hooks. The return value is a flag indicating whether interpretation of commands by the interpreter should stop.

postloop()¶: Hook method executed once when the cmdloop() method is about to return.

preloop()¶: Hook method executed once when the cmdloop() method is called.
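
The descriptions of onecmd(), preloop() and postloop() match the docstrings of Python's standard cmd.Cmd class, which suggests that SparklingShell is built on it; that is an inference, not something stated here. The sketch below uses only the standard library to show when each hook fires. `EchoShell` is a hypothetical class and does not reproduce SparklingShell's behaviour.

```python
import cmd
import io


class EchoShell(cmd.Cmd):
    """Minimal cmd.Cmd subclass that records what each hook receives."""

    prompt = "> "

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.log = []

    def preloop(self):
        self.log.append("preloop")

    def postloop(self):
        self.log.append("postloop")

    def default(self, line):
        # Called by onecmd() for input that matches no do_* method.
        self.log.append(("statement", line))

    def do_exit(self, arg):
        """Leave the shell."""
        return True  # a true return value tells cmdloop() to stop


shell = EchoShell(stdin=io.StringIO("x = 1\nexit\n"), stdout=io.StringIO())
shell.use_rawinput = False  # read from the stdin object instead of input()
shell.cmdloop()
print(shell.log)
```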
