Version: v0.69.1

Install MLOps

H2O MLOps is installed and deployed in Kubernetes (K8s) environments and can run in all major clouds as well as on-premises. For information on how to install MLOps, contact support@h2o.ai.

Supported AWS regions

You can check whether a specific AWS region is supported in the AWS Regional Services List. Use the drop-down to select a region, then check whether Amazon Elastic Kubernetes Service (EKS) is offered as a service in that region.

JWT-based access control for H2O MLOps UI

To protect machine learning assets, access to the H2O MLOps UI should be restricted. Access control protects sensitive data, prevents unauthorized access, and supports regulatory compliance throughout the machine learning lifecycle. To restrict access, the H2O MLOps UI expects to find the user's groups as a claim in the Wave access token (JWT). This means that your identity provider (IdP) must be configured to include group information in the access token.
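When checking whether an identity provider actually puts group information in the access token, it can help to look at the token's payload. The following is a minimal sketch that uses only the Python standard library. It decodes the payload without verifying the signature, so it is suitable for inspection only. The demo token, the `sub` value, and the helper names are hypothetical; the `groups` claim name follows the configuration described on this page.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature.

    Inspection only: never base access decisions on unverified claims.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with padding stripped; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _segment(obj: dict) -> str:
    """Encode a dict as an unpadded base64url segment (demo helper)."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical, unsigned token built locally for demonstration purposes.
demo_token = ".".join([
    _segment({"alg": "none", "typ": "JWT"}),
    _segment({"sub": "jane", "groups": ["admin", "data-science"]}),
    "signature-placeholder",
])

claims = jwt_claims(demo_token)
print("groups" in claims, claims.get("groups"))
```

If the `groups` key is missing from the decoded payload of a real token, the IdP is probably not configured to include group information in the access token.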

Configuration for access control

# Flag to set whether authorization is enabled
enabled: true

# JWT claim key that contains the role/group information
userJwtClaim: "groups"

# Comma-separated list of roles/groups that have access to the MLOps UI
allowedUserRoles: "admin"

# Separator character for role/group values in the JWT claim
userRoleValueSeparator: ","

Example Helm configuration for RBAC

The following is an example Helm configuration for Role-Based Access Control (RBAC). In this example, a user is granted access only if the "groups" claim in their JWT includes the "admin" role. Customize this configuration based on your specific deployment requirements.

waveApp:
  authorization:
    enabled: true
    userJwtClaim: "groups"
    allowedUserRoles: "admin"
    userRoleValueSeparator: ","
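To illustrate how these four settings might combine into an access decision, here is a rough Python sketch. It is not the MLOps implementation. The function name `is_authorized` is hypothetical, and two behaviors are assumptions: that disabling authorization admits every user, and that the claim may arrive either as a list or as a single string split on `userRoleValueSeparator`.

```python
from typing import Any, Mapping


def is_authorized(
    claims: Mapping[str, Any],
    enabled: bool = True,
    user_jwt_claim: str = "groups",
    allowed_user_roles: str = "admin",
    user_role_value_separator: str = ",",
) -> bool:
    """Return True if the (already verified) claims grant UI access."""
    if not enabled:
        # Assumption: with authorization disabled, nobody is filtered out.
        return True

    # allowedUserRoles is documented as a comma-separated list.
    allowed = {r.strip() for r in allowed_user_roles.split(",") if r.strip()}

    raw = claims.get(user_jwt_claim)
    if raw is None:
        return False  # the token carries no role/group information

    # Assumption: the claim is either a list or a separator-delimited string.
    if isinstance(raw, str):
        values = raw.split(user_role_value_separator)
    elif isinstance(raw, (list, tuple, set)):
        values = list(raw)
    else:
        values = [raw]

    user_roles = {str(v).strip() for v in values if str(v).strip()}
    return bool(allowed & user_roles)


print(is_authorized({"groups": ["viewer", "admin"]}))
print(is_authorized({"groups": "viewer"}))
```

The first call prints `True` because the claim contains "admin"; the second prints `False` because none of the user's groups appear in the allowed list.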

Configure node affinity and toleration

When installing MLOps, admin users can choose to set up node affinity and toleration. This section describes how to configure node affinity and toleration for MLOps scorers (pods) during the install process.

note

For more information on node affinity and on taints and tolerations, refer to the official Kubernetes documentation.

Understanding node affinity and toleration

As stated in the official Kubernetes documentation, "node affinity is a property of Pods that attracts them to a set of nodes, either as a preference or a hard requirement. Taints are the opposite—they allow a node to repel a set of pods. Tolerations are applied to pods, and allow (but do not require) the pods to schedule onto nodes with matching taints." In the case of MLOps, these options let you ensure that scorers (pods) are scheduled onto specific machines (nodes) in a cluster that have been set up for machine learning tasks.

Setup

To make node affinity and toleration options available when deploying a model, an admin must configure them when installing MLOps.

note

MLOps supports all resources of the Kubernetes API. For more information, refer to the official Kubernetes API Reference page.

Node affinity

The following is an example of how node affinity can be set up when installing MLOps.

kubernetes_node_affinity_shortcuts = [
  {
    name         = "required-gpu-preferred-v100"
    display_name = "GPU (Tesla V100)"
    description  = "Deploys on GPU-enabled nodes only, preferably one with Tesla V100 GPU."

    affinity = {
      required_during_scheduling_ignored_during_execution = {
        node_selector_terms = [
          {
            match_expressions = [
              {
                key      = "gpu-type"
                operator = "Exists"
              }
            ]
          }
        ]
      }

      preferred_during_scheduling_ignored_during_execution = [
        {
          weight = 1
          preference = {
            match_expressions = [
              {
                key      = "gpu-type"
                operator = "In"
                values   = ["tesla-v100"]
              }
            ]
          }
        }
      ]
    }
  }
]

In the preceding example, the first block contains the name, display_name, and description fields that identify the shortcut. The second block (required_during_scheduling...) specifies the required node affinity matches: a node must have a label named gpu-type for the deployed model to be scheduled on it. The third block (preferred_during_scheduling...) specifies the preferred node affinity matches: any node whose gpu-type label is set to tesla-v100 is preferred, but not required.
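The difference between the required and preferred blocks can be sketched in Python. This is a simplified model of the Kubernetes rules, not the scheduler itself: required terms are ORed together, the expressions inside one term are ANDed, and preferred entries add their weight to a node's score. Only the Exists, DoesNotExist, In, and NotIn operators are modeled. The function names and the node labels in the demo are hypothetical.

```python
from typing import Any, Dict, List


def expr_matches(labels: Dict[str, str], expr: Dict[str, Any]) -> bool:
    """Evaluate one match expression against a node's labels."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    if op == "In":
        return labels.get(key) in values
    if op == "NotIn":
        return labels.get(key) not in values
    raise ValueError(f"operator not modeled in this sketch: {op}")


def is_eligible(labels: Dict[str, str], terms: List[Dict[str, Any]]) -> bool:
    """Required affinity: at least one term must match in full."""
    return any(
        all(expr_matches(labels, e) for e in term["match_expressions"])
        for term in terms
    )


def preference_score(labels: Dict[str, str], preferred: List[Dict[str, Any]]) -> int:
    """Preferred affinity: sum the weights of all matching preferences."""
    return sum(
        p["weight"]
        for p in preferred
        if all(expr_matches(labels, e) for e in p["preference"]["match_expressions"])
    )


# Mirrors the "required-gpu-preferred-v100" shortcut.
required = [{"match_expressions": [{"key": "gpu-type", "operator": "Exists"}]}]
preferred = [{
    "weight": 1,
    "preference": {"match_expressions": [
        {"key": "gpu-type", "operator": "In", "values": ["tesla-v100"]},
    ]},
}]

# Hypothetical nodes and labels.
nodes = {
    "cpu-node": {},
    "a100-node": {"gpu-type": "tesla-a100"},
    "v100-node": {"gpu-type": "tesla-v100"},
}
for name, labels in nodes.items():
    print(name, is_eligible(labels, required), preference_score(labels, preferred))
```

In this model, cpu-node is rejected outright, a100-node is eligible with a score of 0, and v100-node is eligible with a score of 1, so it would be favored when both GPU nodes are available.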

Toleration

The following is an example of how toleration can be set up when installing MLOps:

kubernetes_toleration_shortcuts = [
  {
    name         = "gpu-jobs-only"
    display_name = "Specialized GPU nodes OK"
    description  = "Tolerates nodes that are meant only for jobs requiring GPUs."
    tolerations = [
      {
        effect   = "NoSchedule"
        key      = "gpu-jobs-only"
        operator = "Exists"
      }
    ]
  },
  {
    name         = "disk-pressure-tolerant"
    display_name = "Disk-pressure tolerant"
    description  = "Tolerates nodes under disk pressure. Useful for short term models of negligible size."
    tolerations = [
      {
        effect   = "NoSchedule"
        key      = "node.kubernetes.io/disk-pressure"
        operator = "Exists"
      }
    ]
  }
]

In the preceding example, the first toleration (gpu-jobs-only) allows the model to be deployed on any node that has a taint called gpu-jobs-only. Nodes with this taint normally prevent new pods from being scheduled on them, but applying this toleration allows a model to be scheduled.

The second toleration (disk-pressure-tolerant) allows the model to be deployed on a node that is under disk pressure. By default, Kubernetes applies the node.kubernetes.io/disk-pressure taint to any node that is running low on disk space, which prevents new pods from being scheduled on that node. Applying this toleration, however, allows a model to be scheduled on nodes with this taint.
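The way a toleration cancels out a taint can be sketched in Python as well. This is a simplified model of the Kubernetes matching rules, not the scheduler itself: a toleration matches a taint when the effects agree (an empty toleration effect matches any effect) and either the operator is Exists with a matching or empty key, or the operator is Equal with matching key and value. The function names and the demo node are hypothetical.

```python
from typing import Any, Dict, List


def tolerates(toleration: Dict[str, Any], taint: Dict[str, Any]) -> bool:
    """Return True if a single toleration matches a single taint."""
    effect = toleration.get("effect", "")
    # An empty effect on the toleration matches every taint effect.
    if effect and effect != taint["effect"]:
        return False

    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")  # Equal is the default
    if operator == "Exists":
        # An empty key combined with Exists tolerates every taint.
        return key == "" or key == taint["key"]
    if operator == "Equal":
        return (
            key == taint["key"]
            and toleration.get("value", "") == taint.get("value", "")
        )
    raise ValueError(f"unknown operator: {operator}")


def can_schedule(tolerations: List[Dict[str, Any]], taints: List[Dict[str, Any]]) -> bool:
    """A pod fits if every hard taint on the node is tolerated.

    PreferNoSchedule taints are soft and are ignored in this sketch.
    """
    hard = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in hard)


# Tolerations taken from the two shortcuts above.
gpu_jobs_only = [{"effect": "NoSchedule", "key": "gpu-jobs-only", "operator": "Exists"}]
disk_pressure = [{
    "effect": "NoSchedule",
    "key": "node.kubernetes.io/disk-pressure",
    "operator": "Exists",
}]

# Hypothetical node that is reserved for GPU jobs.
gpu_node_taints = [{"key": "gpu-jobs-only", "effect": "NoSchedule"}]

print(can_schedule(gpu_jobs_only, gpu_node_taints))
print(can_schedule(disk_pressure, gpu_node_taints))
print(can_schedule([], gpu_node_taints))
```

Only the first call prints `True`: the disk-pressure toleration does not match the gpu-jobs-only taint, and a pod with no tolerations is blocked by any NoSchedule taint. Note that a toleration only permits scheduling on a tainted node; it does not attract the pod to that node, which is the role of node affinity.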

note

For more information about Pod Disruption Budget (PDB), see Pod Disruption Budget (PDB).

