Version: v1.0.0

Configure deployments

This page describes the available options for configuring deployments using the H2O MLOps Python client.

Prerequisites

Before you begin, complete the following steps:

  1. Import the necessary Python packages. For instructions, see Step 1: Import the required packages.
  2. Connect to H2O MLOps. For instructions, see Connect to H2O MLOps.
  3. Create a workspace. For instructions, see Create a workspace.
  4. Create one or two experiments. For instructions, see Create an experiment.
  5. Create models and register the experiments with them. For instructions, see Register an experiment with a model.

To follow the examples in the next sections, assume the following:

  • You have created two models and assigned them to the variables model_1 and model_2. Each model has a registered experiment.
  • You have two scoring runtimes assigned to the variables scoring_runtime_1 and scoring_runtime_2, one for the experiment registered with each model.

For details on how to retrieve a scoring runtime for an experiment, see Scoring runtimes.

Composition options

Composition options define how models are composed for deployment.

  • model: _models.MLOpsModel - The model to deploy.
  • scoring_runtime: _runtimes.MLOpsScoringRuntime - The runtime environment used for scoring.
  • model_version: Union[int, str] = "latest" - The version of the model to deploy.
  • traffic_weight: Optional[int] = None - The relative share of traffic to route to this model in an A/B test.
  • primary: Optional[bool] = None - Indicates whether this model is the primary (champion) or the secondary (challenger) in a champion/challenger setup.

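There are three types of deployments: single model, A/B test, and champion/challenger. Each type requires different composition options. A single-model deployment needs only model and scoring_runtime, an A/B test also sets traffic_weight for each model, and a champion/challenger deployment sets primary. The following example creates composition options for a single-model deployment: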

composition_options = options.CompositionOptions(
    model=model_1,
    scoring_runtime=scoring_runtime_1,
)

For more information about deployment types, see Deployment type.
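For an A/B test, traffic_weight is described as a ratio. One plausible reading is that each model receives its weight divided by the sum of all weights. The following sketch illustrates that arithmetic in plain Python. The helper traffic_shares is hypothetical and not part of the H2O MLOps client, and the proportional interpretation is an assumption rather than documented behavior.

```python
from typing import Dict


def traffic_shares(weights: Dict[str, int]) -> Dict[str, float]:
    """Convert relative traffic weights into fractional shares (hypothetical helper)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("traffic weights must sum to a positive number")
    return {name: weight / total for name, weight in weights.items()}


# Assumed weights for the two models in an A/B test.
# With the client, each weight would be set through the documented field, e.g.
# options.CompositionOptions(model=model_1, scoring_runtime=scoring_runtime_1, traffic_weight=3)
shares = traffic_shares({"model_1": 3, "model_2": 1})
print(shares)
```

Under this assumption, weights of 3 and 1 would send three quarters of requests to model_1 and one quarter to model_2.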

Security options

Security options let you configure security settings for your deployment.

  • security_type: types.SecurityType - The type of security to use.
    Available values:
    • DISABLED
    • PLAIN_PASSPHRASE
    • HASHED_PASSPHRASE
    • OIDC_AUTH: Requires additional configuration in the values.yaml file.
  • passphrase: Optional[str] = None - The passphrase to use, if required by the selected security type.

For more information, see Endpoint security.

Note: Not all security types are supported in every environment. Which types are available depends on how the environment is configured.

To check the allowed types using the H2O MLOps Python client, inspect mlops.configs.allowed_security_types.

Use the following code to create security options:

security_options = options.SecurityOptions(
    security_type=types.SecurityType.HASHED_PASSPHRASE,
    passphrase="123abcABC",
)
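The passphrase in the example above is a placeholder. The following is a minimal sketch of generating a random passphrase with Python's standard secrets module instead of hard-coding one. The 32-byte length and the variable name generated_passphrase are illustrative choices, not requirements of the client.

```python
import secrets

# Generate a URL-safe random passphrase from 32 bytes of entropy.
generated_passphrase = secrets.token_urlsafe(32)

# The value could then be supplied through the documented field, e.g.
# options.SecurityOptions(
#     security_type=types.SecurityType.HASHED_PASSPHRASE,
#     passphrase=generated_passphrase,
# )
print(len(generated_passphrase))
```

Store the generated value somewhere safe, since callers of the endpoint will presumably need it to authenticate.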

Kubernetes options

The following options let you customize Kubernetes deployment settings:

  • replicas: int = 1
  • requests: Optional[Dict[str, str]] = None
  • limits: Optional[Dict[str, str]] = None
  • affinity: Optional[str] = None
  • toleration: Optional[str] = None

For more information about replicas, requests, and limits, see Kubernetes options. For more information about affinity and toleration, see Node affinity and toleration.

Note: Not all options are supported in every environment. Which options are available depends on how the environment is configured.

To check the default and allowed values for each option using the H2O MLOps Python client, inspect the following attributes:

  • mlops.configs.default_k8s_requests
  • mlops.configs.default_k8s_limits
  • mlops.configs.allowed_k8s_affinities
  • mlops.configs.allowed_k8s_tolerations

Use the following code to create Kubernetes options:

kubernetes_options = options.KubernetesOptions()
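The requests and limits options take dictionaries of strings. The sketch below assumes they use the standard Kubernetes resource names (cpu, memory) and quantity strings such as "500m" or "512Mi"; the source does not confirm this. It shows a simple pre-check that no request exceeds its limit. The helpers parse_quantity and requests_within_limits are hypothetical, handle only a subset of quantity suffixes, and are not part of the H2O MLOps client.

```python
from typing import Dict

# Multipliers for a subset of Kubernetes quantity suffixes.
_SUFFIXES = {"m": 1e-3, "Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_quantity(quantity: str) -> float:
    """Parse a Kubernetes quantity such as "500m" or "512Mi" (subset only)."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor
    return float(quantity)


def requests_within_limits(requests: Dict[str, str], limits: Dict[str, str]) -> bool:
    """Return True if every request that has a matching limit does not exceed it."""
    return all(
        parse_quantity(value) <= parse_quantity(limits[name])
        for name, value in requests.items()
        if name in limits
    )


# Assumed example values; adjust to what your environment allows.
requests = {"cpu": "500m", "memory": "512Mi"}
limits = {"cpu": "1", "memory": "1Gi"}
print(requests_within_limits(requests, limits))

# The dictionaries could then be passed through the documented fields, e.g.
# options.KubernetesOptions(replicas=2, requests=requests, limits=limits)
```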

Vertical Pod Autoscaler (VPA) options

Configure the following Vertical Pod Autoscaler (VPA) settings to automatically adjust resource requests:

  • resource_type: types.KubernetesResourceType - The type of resource to adjust.
    Supported values are:
    • CPU
    • MEMORY
  • unit: types.KubernetesResourceUnitType - The unit used for the resource.
    Supported values are:
    • MILLI_CORES
    • CORES
    • MIB
    • GIB
  • min_bound: float - The minimum resource request value allowed.
  • max_bound: float - The maximum resource request value allowed.

Use the following code to create VPA options:

vpa_options = [
    options.VPAOptions(
        resource_type=types.KubernetesResourceType.CPU,
        unit=types.KubernetesResourceUnitType.MILLI_CORES,
        min_bound=100,
        max_bound=200,
    ),
    options.VPAOptions(
        resource_type=types.KubernetesResourceType.MEMORY,
        unit=types.KubernetesResourceUnitType.MIB,
        min_bound=200,
        max_bound=400,
    ),
]

Pod Disruption Budget (PDB) options

Use the following options to control pod availability during voluntary disruptions:

  • pods: int - The number of pods that the disruption policy applies to, or a percentage of pods if is_percentage is True.
  • disruption_policy: types.DisruptionPolicyType - The disruption policy to apply.
    Valid values:
    • MIN_AVAILABLE
    • MAX_UNAVAILABLE
  • is_percentage: bool = False - Indicates whether the pods value is a percentage.

For more information, see Pod Disruption Budget (PDB).

Use the following code to create PDB options:

pdb_options = options.PDBOptions(
    pods=1,
    disruption_policy=types.DisruptionPolicyType.MIN_AVAILABLE,
)
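To see how pods, disruption_policy, and is_percentage interact with the replica count, the following sketch estimates how many pods could be voluntarily evicted at once. It assumes that all replicas are healthy and that percentages are rounded up to the next whole pod, as Kubernetes does for Pod Disruption Budgets. The helper allowed_disruptions is hypothetical, and the policy is passed as a plain string that mirrors the enum names above.

```python
def allowed_disruptions(
    replicas: int, pods: int, policy: str, is_percentage: bool = False
) -> int:
    """Estimate how many pods may be voluntarily evicted at once (hypothetical helper)."""
    if is_percentage:
        # Ceiling division using integers, to avoid floating-point error.
        count = -(-replicas * pods // 100)
    else:
        count = pods
    if policy == "MIN_AVAILABLE":
        # At least `count` pods must stay up; the rest may be disrupted.
        return max(replicas - count, 0)
    if policy == "MAX_UNAVAILABLE":
        # At most `count` pods may be down at the same time.
        return min(count, replicas)
    raise ValueError(f"unknown disruption policy: {policy}")


print(allowed_disruptions(replicas=1, pods=1, policy="MIN_AVAILABLE"))
print(allowed_disruptions(replicas=3, pods=1, policy="MIN_AVAILABLE"))
print(allowed_disruptions(replicas=7, pods=50, policy="MAX_UNAVAILABLE", is_percentage=True))
```

Under these assumptions, combining pods=1 and MIN_AVAILABLE with a single replica leaves no room for voluntary evictions, so it may be worth raising replicas when using that policy.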

Environment variables

Specify environment variables to add to the scoring runtime:

environment_variables = {
    "KEY_1": "VALUE_1",
    "KEY_2": "VALUE_2",
    "KEY_3": "VALUE_3",
}

CORS origins

Define allowed CORS origins:

cors_origins = [
    "http://localhost:8080",
    "http://customcors.com",
]
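A CORS origin consists of a scheme, a host, and an optional port, with no path. The source does not state whether the client validates origins, so the following sketch shows one way to check a list before using it. The helper is_valid_origin is hypothetical and relies only on Python's standard urllib.parse module.

```python
from urllib.parse import urlsplit


def is_valid_origin(origin: str) -> bool:
    """Return True if the string is a bare scheme://host[:port] origin."""
    parts = urlsplit(origin)
    return (
        parts.scheme in ("http", "https")
        and bool(parts.hostname)
        and parts.path == ""
        and parts.query == ""
        and parts.fragment == ""
    )


candidate_origins = [
    "http://localhost:8080",
    "http://customcors.com",
    "http://customcors.com/app",  # rejected: contains a path
]
checked_origins = [origin for origin in candidate_origins if is_valid_origin(origin)]
print(checked_origins)
```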

Monitoring options

Configure monitoring settings for your deployment:

monitoring_options = options.MonitoringOptions()
