Version: Next 🚧

Configure deployments

This page describes the available options for configuring deployments using the H2O MLOps Python client.

Prerequisites

Before you begin, complete the following steps:

  1. Import the necessary Python packages. For instructions, see Step 1: Import the required packages.
  2. Connect to H2O MLOps. For instructions, see Connect to H2O MLOps.
  3. Create a project. For instructions, see Create a project.
  4. Create one or two experiments. For instructions, see Create an experiment.
  5. Create models and register the experiments with them. For instructions, see Register an experiment with a model.

To follow the examples in the next sections, assume the following:

  • You have created two models and assigned them to the variables model_1 and model_2. Each model has a registered experiment.
  • You have two scoring runtimes assigned to the variables scoring_runtime_1 and scoring_runtime_2, one for the experiment registered with each model.

For details on how to retrieve a scoring runtime for an experiment, see Scoring runtimes.

Composition options

Composition options define how models are composed for deployment.

  • model: _models.MLOpsModel - The model to deploy.
  • scoring_runtime: _runtimes.MLOpsScoringRuntime - The runtime environment used for scoring.
  • model_version: Union[int, str] = "latest" - The version of the model to deploy.
  • traffic_weight: Optional[int] = None - The relative share of traffic to route to this model in an A/B test.
  • primary: Optional[bool] = None - Indicates whether this deployment is the primary (champion) or secondary (challenger) in a champion/challenger setup.

There are three types of deployments: single model, A/B test, and champion/challenger. Each type requires different composition options. For example, a single-model deployment needs only a model and a scoring runtime:

composition_options = options.CompositionOptions(
    model=model_1,
    scoring_runtime=scoring_runtime_1,
)
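As a rough sketch of how the traffic_weight and primary fields might be combined for the other two deployment types, the snippet below uses a stand-in dataclass (not the real options.CompositionOptions) so that it runs without an MLOps connection. The string placeholders for models and runtimes, the 80/20 weights, the assumption that weights are normalized against their sum, and the two helper functions are all illustrative.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Union


@dataclass
class CompositionOptions:
    """Stand-in mirroring the fields listed above; NOT the real client class."""
    model: Any
    scoring_runtime: Any
    model_version: Union[int, str] = "latest"
    traffic_weight: Optional[int] = None
    primary: Optional[bool] = None


# Placeholders for the objects from the prerequisites (hypothetical values).
model_1, model_2 = "model_1", "model_2"
scoring_runtime_1, scoring_runtime_2 = "scoring_runtime_1", "scoring_runtime_2"

# A/B test: every composition carries a traffic weight (assumed 80/20 split).
ab_test = [
    CompositionOptions(model=model_1, scoring_runtime=scoring_runtime_1, traffic_weight=80),
    CompositionOptions(model=model_2, scoring_runtime=scoring_runtime_2, traffic_weight=20),
]

# Champion/challenger: exactly one composition is marked as primary.
champion_challenger = [
    CompositionOptions(model=model_1, scoring_runtime=scoring_runtime_1, primary=True),
    CompositionOptions(model=model_2, scoring_runtime=scoring_runtime_2, primary=False),
]


def traffic_shares(compositions: List[CompositionOptions]) -> List[float]:
    """Return each composition's fraction of the total traffic weight."""
    total = sum(c.traffic_weight for c in compositions)
    return [c.traffic_weight / total for c in compositions]


def champion(compositions: List[CompositionOptions]) -> CompositionOptions:
    """Return the single primary composition, or raise if there is not exactly one."""
    primaries = [c for c in compositions if c.primary]
    if len(primaries) != 1:
        raise ValueError("expected exactly one primary composition")
    return primaries[0]
```

With the real client, the same keyword arguments would be passed to options.CompositionOptions, using model and scoring runtime objects instead of strings.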

Security options

Security options let you configure security settings for your deployment.

  • security_type: types.SecurityType - The type of security to use.
    Available values:
    • DISABLED
    • PLAIN_PASSPHRASE
    • HASHED_PASSPHRASE
    • OIDC_AUTH
  • passphrase: Optional[str] = None - The passphrase to use, if required by the selected security type.

For more information on security options, see Endpoint security.

note

Not all security types are supported in every environment. Which types are available depends on how the environment is configured.

Use the following code to create security options:

security_options = options.SecurityOptions(
    security_type=types.SecurityType.HASHED_PASSPHRASE,
    passphrase="123abcABC",
)
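The literal passphrase above is for illustration only. One possible way to avoid a guessable value is to generate one with Python's standard secrets module. The 32-byte length and the variable name are assumptions, and the generated value needs to be stored somewhere safe, since it cannot be regenerated.

```python
import secrets

# 32 random bytes encoded as URL-safe text (the length is an assumed choice).
passphrase = secrets.token_urlsafe(32)
```

The generated value could then be passed as the passphrase argument of options.SecurityOptions in place of the literal.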

Kubernetes options

The following options let you customize Kubernetes deployment settings:

  • replicas: int = 1
  • requests: Optional[Dict[str, str]] = None
  • limits: Optional[Dict[str, str]] = None
  • affinity: Optional[str] = None
  • toleration: Optional[str] = None

For more information about replicas, requests, and limits, see Kubernetes options. For more information about affinity and toleration, see Understanding node affinity and toleration.

Use the following code to create Kubernetes options:

kubernetes_options = options.KubernetesOptions()
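The requests and limits parameters are typed as Dict[str, str]. As a hedged sketch, assuming they take standard Kubernetes resource names ("cpu", "memory") and quantity strings ("500m", "1Gi"), the helpers below check that no request exceeds its limit before the dictionaries are handed to options.KubernetesOptions. The parse_quantity and check_resources functions and the example values are illustrative and not part of the client; only a subset of Kubernetes suffixes is handled.

```python
from fractions import Fraction
from typing import Dict

# Multipliers for a subset of Kubernetes quantity suffixes.
_SUFFIXES = {
    "Ki": Fraction(2**10), "Mi": Fraction(2**20), "Gi": Fraction(2**30),
    "k": Fraction(10**3), "M": Fraction(10**6), "G": Fraction(10**9),
    "m": Fraction(1, 1000),
}


def parse_quantity(quantity: str) -> Fraction:
    """Convert a quantity such as "500m" or "1Gi" to base units (cores or bytes)."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return Fraction(quantity[: -len(suffix)]) * factor
    return Fraction(quantity)  # no suffix: the value is already in base units


def check_resources(requests: Dict[str, str], limits: Dict[str, str]) -> None:
    """Raise ValueError if a request is larger than the limit for the same resource."""
    for name, requested in requests.items():
        if name in limits and parse_quantity(requested) > parse_quantity(limits[name]):
            raise ValueError(f"{name}: request {requested} exceeds limit {limits[name]}")


requests = {"cpu": "500m", "memory": "512Mi"}
limits = {"cpu": "1", "memory": "1Gi"}
check_resources(requests, limits)  # passes silently for these values
```

Under these assumptions, the dictionaries could then be supplied through the requests and limits parameters listed above.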

Vertical Pod Autoscaler (VPA) options

Configure the following Vertical Pod Autoscaler (VPA) settings to automatically adjust resource requests:

  • resource_type: types.KubernetesResourceType - The type of resource to adjust.
    Supported values are:
    • CPU
    • MEMORY
  • unit: types.KubernetesResourceUnitType - The unit used for the resource.
    Supported values are:
    • MILLI_CORES
    • CORES
    • MIB
    • GIB
  • min_bound: float - The minimum resource request value allowed.
  • max_bound: float - The maximum resource request value allowed.

Use the following code to create VPA options:

vpa_options = [
    options.VPAOptions(
        resource_type=types.KubernetesResourceType.CPU,
        unit=types.KubernetesResourceUnitType.MILLI_CORES,
        min_bound=100,
        max_bound=200,
    ),
    options.VPAOptions(
        resource_type=types.KubernetesResourceType.MEMORY,
        unit=types.KubernetesResourceUnitType.MIB,
        min_bound=200,
        max_bound=400,
    ),
]
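To make the units concrete, the following sketch renders a bound and its unit as a Kubernetes quantity string and checks that the bounds are ordered. It assumes the unit names map onto the usual Kubernetes suffixes (millicores to "m", MiB to "Mi", GiB to "Gi"). The ResourceUnit enum is a local stand-in for types.KubernetesResourceUnitType, and to_quantity and check_bounds are hypothetical helpers rather than part of the client.

```python
from enum import Enum


class ResourceUnit(Enum):
    """Local stand-in for the client's unit type; values are the assumed suffixes."""
    MILLI_CORES = "m"
    CORES = ""
    MIB = "Mi"
    GIB = "Gi"


def to_quantity(value: float, unit: ResourceUnit) -> str:
    """Format a bound as a Kubernetes quantity string, e.g. 100 millicores -> "100m"."""
    # Drop a trailing ".0" so whole numbers render as integers.
    number = int(value) if float(value).is_integer() else value
    return f"{number}{unit.value}"


def check_bounds(min_bound: float, max_bound: float) -> None:
    """Raise ValueError unless 0 <= min_bound <= max_bound."""
    if not 0 <= min_bound <= max_bound:
        raise ValueError(f"invalid bounds: min={min_bound}, max={max_bound}")


check_bounds(100, 200)
print(to_quantity(100, ResourceUnit.MILLI_CORES))  # 100m
print(to_quantity(400, ResourceUnit.MIB))          # 400Mi
```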

Pod Disruption Budget (PDB) options

Use the following options to control pod availability during voluntary disruptions:

  • pods: int - The number of pods (or the percentage of pods, if is_percentage is True) that the disruption policy applies to.
  • disruption_policy: types.DisruptionPolicyType - The disruption policy to apply.
    Valid values:
    • MIN_AVAILABLE
    • MAX_UNAVAILABLE
  • is_percentage: bool = False - Indicates whether the pods value is a percentage.

For more information, see Pod Disruption Budget (PDB).

Use the following code to create PDB options:

pdb_options = options.PDBOptions(
    pods=1,
    disruption_policy=types.DisruptionPolicyType.MIN_AVAILABLE,
)
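To illustrate how pods, disruption_policy, and is_percentage interact, the sketch below estimates how many pods a voluntary disruption may take down at once. It assumes the options map onto the Kubernetes minAvailable and maxUnavailable fields, that percentages are rounded up to a whole pod as Kubernetes does, and that all replicas are healthy. The allowed_disruptions function is a hypothetical helper, and plain strings stand in for types.DisruptionPolicyType.

```python
def allowed_disruptions(replicas: int, pods: int, disruption_policy: str,
                        is_percentage: bool = False) -> int:
    """Estimate how many pods may be voluntarily evicted at the same time."""
    # Convert a percentage to a pod count, rounding up (integer ceiling division).
    count = (replicas * pods + 99) // 100 if is_percentage else pods
    if disruption_policy == "MIN_AVAILABLE":
        return max(replicas - count, 0)
    if disruption_policy == "MAX_UNAVAILABLE":
        return min(count, replicas)
    raise ValueError(f"unknown disruption policy: {disruption_policy}")


print(allowed_disruptions(1, 1, "MIN_AVAILABLE"))           # 0
print(allowed_disruptions(3, 1, "MIN_AVAILABLE"))           # 2
print(allowed_disruptions(7, 50, "MIN_AVAILABLE", True))    # 3
print(allowed_disruptions(4, 25, "MAX_UNAVAILABLE", True))  # 1
```

Under these assumptions, pods=1 with MIN_AVAILABLE combined with a single replica leaves no room for voluntary evictions, for example during a node drain.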

Environment variables

Specify environment variables to add to the scoring runtime:

environment_variables = {
    "KEY_1": "VALUE_1",
    "KEY_2": "VALUE_2",
    "KEY_3": "VALUE_3",
}

CORS origins

Define allowed CORS origins:

cors_origins = [
    "http://localhost:8080",
    "http://customcors.com",
]
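Each entry is an origin, meaning a scheme, a host, and an optional port, with no path, query, or fragment. A small pre-flight check along the following lines can catch malformed entries before deployment. The is_valid_origin function is a hypothetical helper built on the standard library, not part of the client, and it assumes that only http and https origins are wanted.

```python
from urllib.parse import urlsplit


def is_valid_origin(origin: str) -> bool:
    """Return True if origin looks like scheme://host[:port] with nothing after it."""
    try:
        parts = urlsplit(origin)
        parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return False
    return (
        parts.scheme in ("http", "https")
        and bool(parts.hostname)
        and parts.path == ""
        and parts.query == ""
        and parts.fragment == ""
    )


cors_origins = [
    "http://localhost:8080",
    "http://customcors.com",
]
invalid = [origin for origin in cors_origins if not is_valid_origin(origin)]
if invalid:
    raise ValueError(f"malformed CORS origins: {invalid}")
```

Note that a trailing slash, as in "http://customcors.com/", counts as a path and is rejected by this check.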

Monitoring options

Configure monitoring settings for your deployment:

monitoring_options = options.MonitoringOptions()
