Version: v0.61.1

Install MLOps

H2O MLOps is installed and deployed in Kubernetes (K8s) environments, and can be run in all major clouds as well as on-premise. For information on how to install MLOps, contact support@h2o.ai.

Supported AWS regions

You can check whether a specific AWS region is supported in the AWS Regional Services List. Use the drop-down to select a region, then check whether Amazon Elastic Kubernetes Service (EKS) is offered as a service in that region.

Configure node affinity and toleration

When installing MLOps, admin users can choose to set up node affinity and toleration. This section describes how to configure node affinity and toleration for MLOps scorers (pods) during the install process.

note

For more information on node affinity and toleration, refer to the pages on assigning Pods to nodes and on taints and tolerations in the official Kubernetes documentation.

Understanding node affinity and toleration

As stated in the official Kubernetes documentation, "node affinity is a property of Pods that attracts them to a set of nodes, either as a preference or a hard requirement. Taints are the opposite—they allow a node to repel a set of pods. Tolerations are applied to pods, and allow (but do not require) the pods to schedule onto nodes with matching taints." In the case of MLOps, these options let you ensure that scorers (pods) are scheduled onto specific machines (nodes) in a cluster that have been set up for machine learning tasks.

Setup

To make node affinity and toleration options available when deploying a model, an admin must set up node affinity and toleration when installing MLOps.

note

MLOps supports all resources of the Kubernetes API. For more information, refer to the official Kubernetes API Reference page.

Node affinity

The following is an example of how node affinity can be set up when installing MLOps.

  kubernetes_node_affinity_shortcuts = [
    {
      name         = "required-gpu-preferred-v100"
      display_name = "GPU (Tesla V100)"
      description  = "Deploys on GPU-enabled nodes only, preferably one with Tesla V100 GPU."

      affinity = {
        required_during_scheduling_ignored_during_execution = {
          node_selector_terms = [
            {
              match_expressions = [
                {
                  key      = "gpu-type"
                  operator = "Exists"
                }
              ]
            }
          ]
        }

        preferred_during_scheduling_ignored_during_execution = [
          {
            weight = 1
            preference = {
              match_expressions = [
                {
                  key      = "gpu-type"
                  operator = "In"
                  values   = ["tesla-v100"]
                }
              ]
            }
          }
        ]
      }
    }
  ]

In the preceding example, the first block contains the standard name, display_name, and description fields. The second block (required_during_scheduling...) specifies the required node affinity matches: a node must have a label with the key gpu-type, whatever its value, for the deployed model to be scheduled on it. The third block (preferred_during_scheduling...) contains the preferred node affinity matches: any node with a gpu-type label set to tesla-v100 is preferred, but not required.
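The same structure can also express the opposite constraint. The following is a minimal sketch rather than a shipped configuration: the shortcut name, display name, and description are hypothetical, and it assumes that nodes without GPUs carry no gpu-type label. It uses the Kubernetes DoesNotExist operator and reuses only the fields shown in the example above.

```hcl
kubernetes_node_affinity_shortcuts = [
  {
    # Hypothetical shortcut: the name and text fields are illustrative only.
    name         = "required-no-gpu"
    display_name = "CPU only"
    description  = "Deploys only on nodes that have no gpu-type label."

    affinity = {
      required_during_scheduling_ignored_during_execution = {
        node_selector_terms = [
          {
            match_expressions = [
              {
                # DoesNotExist matches nodes that lack the label entirely,
                # so no values list is given.
                key      = "gpu-type"
                operator = "DoesNotExist"
              }
            ]
          }
        ]
      }
    }
  }
]
```

This sketch leaves out the preferred_during_scheduling... block, on the assumption that it is optional here as it is in the Kubernetes API. With only a required match, every node without a gpu-type label is treated as equally suitable.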

Toleration

The following is an example of how toleration can be set up when installing MLOps:

  kubernetes_toleration_shortcuts = [
    {
      name         = "gpu-jobs-only"
      display_name = "Specialized GPU nodes OK"
      description  = "Tolerates nodes that are meant only for jobs requiring GPUs."
      tolerations  = [
        {
          effect             = "NoSchedule"
          key                = "gpu-jobs-only"
          operator           = "Exists"
        }
      ]
    },
    {
      name         = "disk-pressure-tolerant"
      display_name = "Disk-pressure tolerant"
      description  = "Tolerates nodes under disk pressure. Useful for short term models of negligible size."
      tolerations  = [
        {
          effect             = "NoSchedule"
          key                = "node.kubernetes.io/disk-pressure"
          operator           = "Exists"
        }
      ]
    }
  ]

In the preceding example, the first toleration (gpu-jobs-only) allows the model to be deployed on any node that has a taint with the key gpu-jobs-only. Nodes with this taint normally prevent new pods from being scheduled on them, but applying this toleration allows a model to be scheduled there.

The second toleration (disk-pressure-tolerant) allows the model to be deployed on a node that is under disk pressure. By default, Kubernetes applies the node.kubernetes.io/disk-pressure taint to any node that is running low on disk space, which prevents new pods from being scheduled on that node. Applying this toleration, however, allows a model to be scheduled on nodes with this taint.
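Because tolerations is a list, a single shortcut can presumably carry more than one toleration. The following sketch is hypothetical (the shortcut name, display name, and description are invented for illustration) and combines the two tolerations from the example above, so that a model could be scheduled on a dedicated GPU node even while that node is under disk pressure.

```hcl
kubernetes_toleration_shortcuts = [
  {
    # Hypothetical shortcut: the name and text fields are illustrative only.
    name         = "gpu-jobs-disk-pressure-tolerant"
    display_name = "Specialized GPU nodes OK, disk-pressure tolerant"
    description  = "Tolerates GPU-only nodes, including those under disk pressure."
    tolerations  = [
      {
        # Tolerates the custom taint that reserves nodes for GPU jobs.
        effect   = "NoSchedule"
        key      = "gpu-jobs-only"
        operator = "Exists"
      },
      {
        # Tolerates the built-in taint applied to nodes low on disk space.
        effect   = "NoSchedule"
        key      = "node.kubernetes.io/disk-pressure"
        operator = "Exists"
      }
    ]
  }
]
```

A pod must tolerate every NoSchedule taint on a node to be scheduled there, which is why both entries are needed for a node that carries both taints.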

