Version: v0.66.1

Scoring runtimes

Overview

The selection of available runtimes is determined by the artifact type that you specify. The following list provides information on the available options when selecting an artifact type and runtime.

Note: Selecting an incorrect runtime causes the deployment to fail.

Artifact type	Version	Runtime option	Notes
Driverless AI MOJO pipeline	DAI 1.9.3 and later	`DAI MOJO Scorer (Shapley none)`
Driverless AI MOJO pipeline	DAI 1.10.0 and later	`DAI MOJO Scorer (Shapley original only)`	Requires twice the memory of the Shapley none option.
Driverless AI MOJO pipeline	DAI 1.9.3 and later	`DAI MOJO Scorer (Shapley transformed only)`	Requires twice the memory of the Shapley none option.
Driverless AI MOJO pipeline	DAI 1.10.0 and later	`DAI MOJO Scorer (Shapley all)`	Requires three times the memory of the Shapley none option.
Driverless AI MOJO pipeline	DAI 1.10.0 and later	`DAI MOJO Scorer (C++ Runtime)`	The experiment must be linked through a project. Original Shapley requires DAI 1.10.3 or later. Transformed Shapley requires DAI 1.10.2 or later.
Driverless AI Python scoring pipeline	DAI 1.9.3	`Python Pipeline Scorer [DAI 1.9.3]`	No longer supported.
Driverless AI Python scoring pipeline	DAI 1.10.0 and later	`Python Pipeline Scorer [DAI 1.10.0]`, `Python Pipeline Scorer [DAI 1.10.4.3]`, `Python Pipeline Scorer [DAI 1.10.6.2]`, and `Python Pipeline Scorer [DAI 1.10.7]`	The Python Pipeline Scorer version must correspond to the DAI version used to build the model (for example, a model built with DAI 1.10.4.2 must use Python Pipeline Scorer [DAI 1.10.4.2]).
H2O-3 MOJO	All versions	`H2O-3 MOJO Scorer`
MLflow / `.pkl` file		`MLflow Model Scorer [Python 3.8]` and `MLflow Model Scorer [Python 3.9]`	MLflow Model Scorer’s version must correspond to the Python version used to build the model.
MLflow		`[PY-3.8] MLflow Dynamic Model Scorer`, `[PY-3.9] MLflow Dynamic Model Scorer`, and `[PY-3.10] MLflow Dynamic Model Scorer`	For information on how to use the dynamic runtime, see MLflow Dynamic Runtime.
H2O Hydrogen Torch CPU scoring	HT 1.3.x	`[PY-3.8][CPU] H2O.ai Hydrogen-Torch 1.3.0 Runtime`	Runs a Hydrogen Torch 1.3.x model with CPU-only support. For more information, see the Hydrogen Torch documentation.
H2O Hydrogen Torch GPU scoring	HT 1.3.x	`[PY-3.8][GPU] H2O.ai Hydrogen-Torch 1.3.0 Runtime`	Runs a Hydrogen Torch 1.3.x model with GPU support. For more information, see the Hydrogen Torch documentation.
vLLM Configuration		`vLLM`	See vLLM Configuration.

Note: The C++ MOJO2 runtime (`DAI MOJO Scorer (C++ Runtime)`) supports a wider range of DAI algorithms than the Java runtime, including BERT, GrowNet, and TensorFlow models. To use one of these models, link it from DAI instead of uploading it manually.
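
The memory notes in the table above can be turned into a rough sizing helper. The following is a minimal sketch, not part of H2O MLOps: the function name `estimate_memory_mib` and the idea of scaling a measured "Shapley none" footprint are assumptions for illustration, and only the multipliers come from the Notes column.

```python
# Memory multipliers relative to the "Shapley none" option,
# copied from the Notes column of the runtime table.
SHAPLEY_MEMORY_FACTOR = {
    "DAI MOJO Scorer (Shapley none)": 1,
    "DAI MOJO Scorer (Shapley original only)": 2,
    "DAI MOJO Scorer (Shapley transformed only)": 2,
    "DAI MOJO Scorer (Shapley all)": 3,
}


def estimate_memory_mib(runtime, baseline_mib):
    """Scale a measured 'Shapley none' footprint by the documented factor.

    Hypothetical helper: baseline_mib is whatever you measured for the same
    model with the Shapley none runtime.
    """
    if runtime not in SHAPLEY_MEMORY_FACTOR:
        raise ValueError(f"No documented memory factor for runtime: {runtime!r}")
    return baseline_mib * SHAPLEY_MEMORY_FACTOR[runtime]


print(estimate_memory_mib("DAI MOJO Scorer (Shapley all)", 2048))
```

Treat the result as a starting point for a memory request, since the table gives relative factors only.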

MLflow Dynamic Runtime

The MLflow Dynamic Runtime lets you deploy MLflow models with diverse dependencies in H2O MLOps. The following steps describe how to deploy an MLflow model with the dynamic runtime in H2O MLOps.

Note: For an example of how to train a model for the dynamic runtime, see Example: Train a dynamic runtime model.

Save your model using the mlflow.pyfunc.save_model function call. Use the pip_requirements parameter to specify the Python package dependencies required by the model.

mlflow.pyfunc.save_model(
   path=...,
   python_model=...,
   artifacts=...,
   signature=...,
   pip_requirements=..., # <- Use this parameter to override libs for dynamic runtime
)

After saving the model, create a zip archive of the saved model directory. Ensure that a requirements file (requirements.txt) that lists all dependencies is included in the zip archive. The following is an example of the expected structure for the zip file from a TensorFlow model:

tf-model-py310
├── MLmodel
├── artifacts
│   └── tf.h5
├── conda.yaml
├── python_env.yaml
├── python_model.pkl
└── requirements.txt
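
Before uploading, it can help to confirm that the archive contains the files described above. The following is a minimal sketch using only the standard library. The helper name `missing_model_files` is hypothetical, and the choice to accept the files either at the archive root or one directory below it is an assumption, since both layouts can result depending on how the zip is created.

```python
import zipfile
from pathlib import PurePosixPath
from typing import List

# Files the steps above say must be present in the zip archive.
REQUIRED_FILES = {"MLmodel", "requirements.txt"}


def missing_model_files(zip_path: str) -> List[str]:
    """Return the required file names that are absent from the archive.

    Only entries at the archive root or one directory below it are
    considered, so a requirements.txt buried under artifacts/ does not count.
    """
    present = set()
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            parts = PurePosixPath(name).parts
            if 1 <= len(parts) <= 2:
                present.add(parts[-1])
    return sorted(REQUIRED_FILES - present)
```

For example, `missing_model_files("tf-model-py310.zip")` returns an empty list when both files are found.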

Depending on whether you are using Python 3.8, Python 3.9, or Python 3.10, select one of the following runtimes:

[PY-3.8] MLflow Dynamic Model Scorer
[PY-3.9] MLflow Dynamic Model Scorer
[PY-3.10] MLflow Dynamic Model Scorer
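
The mapping from Python version to runtime name can be sketched as a small lookup. This is an illustrative helper only: `dynamic_runtime_for` is a hypothetical name, and the runtime strings are copied from the list above.

```python
import sys

# Runtime names as listed above, keyed by (major, minor) Python version.
DYNAMIC_RUNTIMES = {
    (3, 8): "[PY-3.8] MLflow Dynamic Model Scorer",
    (3, 9): "[PY-3.9] MLflow Dynamic Model Scorer",
    (3, 10): "[PY-3.10] MLflow Dynamic Model Scorer",
}


def dynamic_runtime_for(version=None):
    """Return the runtime name for a (major, minor) version tuple.

    Defaults to the interpreter running this code, which is only meaningful
    if that is the interpreter the model was built with.
    """
    major, minor = tuple(version or sys.version_info)[:2]
    runtime = DYNAMIC_RUNTIMES.get((major, minor))
    if runtime is None:
        raise ValueError(f"No dynamic runtime listed for Python {major}.{minor}")
    return runtime


print(dynamic_runtime_for((3, 10)))
```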

Note: The MLflow Dynamic Runtime has a fixed dependency on MLflow 1.26.1. It is therefore not guaranteed to work with models saved using a different version of MLflow.
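
One way to catch a version mismatch early is to compare the MLflow version in your training environment with the pinned version before saving the model. The sketch below uses only the standard library; the helper names are hypothetical, and a strict equality check is an assumption about how cautious you want to be.

```python
from importlib import metadata

# MLflow version that the dynamic runtime depends on, per the note above.
PINNED_MLFLOW_VERSION = "1.26.1"


def matches_runtime_mlflow(version):
    """Return True if the given version string equals the pinned version."""
    return version.strip() == PINNED_MLFLOW_VERSION


def installed_mlflow_version():
    """Return the installed MLflow version, or None if MLflow is absent."""
    try:
        return metadata.version("mlflow")
    except metadata.PackageNotFoundError:
        return None


found = installed_mlflow_version()
if found is None:
    print("mlflow is not installed in this environment")
elif not matches_runtime_mlflow(found):
    print(f"Warning: mlflow {found} differs from the runtime's {PINNED_MLFLOW_VERSION}")
```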

Example: Train a dynamic runtime model

The following example demonstrates how to train a TensorFlow model and package it for the dynamic runtime:

# Import libraries
import mlflow
import pandas as pd
import shutil
import tensorflow as tf
from sklearn import datasets

# Load and prepare data
diabetes = datasets.load_diabetes()
X = diabetes.data[:, 2:3]  # Use only one feature for simplicity
y = diabetes.target

# Build and train TensorFlow model
tf_model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(1, input_dim=1)
])
tf_model.compile(optimizer='adam', loss='mean_squared_error')
tf_model.fit(X, y, epochs=10)

tf_model_path = "tf.h5"

tf_model.save(tf_model_path, save_format="h5")


# Enable the TensorFlow model to be used in the Pyfunc format
class PythonTFmodel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        import tensorflow as tf
        self.model = tf.keras.models.load_model(context.artifacts["model"])

    def predict(self, context, model_input):
        tf_out = self.model.predict(model_input)
        return pd.DataFrame(tf_out, columns=["db_progress"])


# Generate signature from your model definition
model = PythonTFmodel()
context = mlflow.pyfunc.PythonModelContext(model_config=dict(), artifacts={"model": tf_model_path})
model.load_context(context)
x = pd.DataFrame(X, columns=["dense_input"])
y = model.predict(context, x)
signature = mlflow.models.signature.infer_signature(x, y)

# Specify a file path where the model will be saved
mlflow_model_path = "./tf-model-py310"

# Save model using MLflow
mlflow.pyfunc.save_model(
    path=mlflow_model_path,
    python_model=PythonTFmodel(),
    signature=signature,
    artifacts={"model": tf_model_path},
    pip_requirements=["tensorflow"]
)

# Package model as a zip archive
shutil.make_archive(
    mlflow_model_path, "zip", mlflow_model_path
)

The following is the structure of the zip file that is generated in the preceding example:

tf-model-py310
├── MLmodel
├── artifacts
│   └── tf.h5
├── conda.yaml
├── python_env.yaml
├── python_model.pkl
└── requirements.txt

H2O Hydrogen Torch Runtime

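Send a request to an HT text-based model

import requests

# Assumes deployment_sample_request (the deployment's sample request)
# and score_url are already defined.
payload = deployment_sample_request
payload["rows"] = [[f"this is a test for row {i}"] for i in range(10)]

r = requests.post(score_url, json=payload)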

Send a request to an HT text span-based model

import requests

# Assumes deployment_sample_request and score_url are already defined.
payload = deployment_sample_request
payload["fields"] = ["question", "context"]
payload["rows"] = [[f"this is a test for question {i}", f"this is a test for context {i}"] for i in range(10)]

r = requests.post(score_url, json=payload)
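
The two text examples above modify the sample request in place. If you want to reuse the sample request for several calls, a copy-based builder is one option. The following is a minimal sketch: `build_score_payload` is a hypothetical helper, and the row-width check is an added safeguard rather than documented behavior of the scoring endpoint.

```python
import copy


def build_score_payload(sample_request, rows, fields=None):
    """Return a new payload based on the sample request.

    The sample request is deep-copied, so the original stays unchanged.
    If fields is given, it replaces the "fields" entry of the copy.
    """
    payload = copy.deepcopy(sample_request)
    if fields is not None:
        payload["fields"] = list(fields)
    payload["rows"] = [list(row) for row in rows]

    # Each row should supply one value per field.
    width = len(payload.get("fields", []))
    for row in payload["rows"]:
        if width and len(row) != width:
            raise ValueError(f"Row {row!r} does not have {width} value(s)")
    return payload


# Illustrative sample request; a real one comes from your deployment.
sample = {"fields": ["text"], "rows": [["placeholder"]]}
span_payload = build_score_payload(
    sample,
    rows=[[f"question {i}", f"context {i}"] for i in range(3)],
    fields=["question", "context"],
)
print(span_payload["fields"], len(span_payload["rows"]))
```

The resulting dictionary can be passed as `json=` to `requests.post` in the same way as in the examples above.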

Send a request to an HT audio-based model

import json
import requests

def read_binary(file_path):
    return open(file_path, 'rb')

# Assumes hydrogen_torch_test_audio_file and score_url are already defined.
files = [
    ('files', (f'test_audio_{i}.ogg', read_binary(hydrogen_torch_test_audio_file), 'application/octet-stream'))
    for i in range(10)
]
metadata = (
    'scoreMediaRequest',
    (
        None,
        json.dumps({
            "fields": ["input"],
            "media_fields": ["input"],
            "rows": [[f"test_audio_{i}.ogg"] for i in range(10)]
        }),
        "application/json"
    )
)
files.append(metadata)

# Use a separate variable so that score_url itself is left unchanged.
media_score_url = score_url.replace('score', 'media-score')
r = requests.post(media_score_url, files=files)

Send a request to an HT image-based model

import json
import requests

def read_binary(file_path):
    return open(file_path, 'rb')

# Assumes hydrogen_torch_test_image_file and score_url are already defined.
files = [
    ('files', (f'test_image_{i}.jpg', read_binary(hydrogen_torch_test_image_file), 'image/jpg'))
    for i in range(10)
]
metadata = (
    'scoreMediaRequest',
    (
        None,
        json.dumps({
            "fields": ["input"],
            "media_fields": ["input"],
            "rows": [[f"test_image_{i}.jpg"] for i in range(10)]
        }),
        "application/json"
    )
)
files.append(metadata)

# Use a separate variable so that score_url itself is left unchanged.
media_score_url = score_url.replace('score', 'media-score')
r = requests.post(media_score_url, files=files)
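
The audio and image examples share the same multipart layout: a list of `files` parts plus one `scoreMediaRequest` part containing JSON. The following sketch factors that layout into helpers and closes the file handles after the request. All function names here are hypothetical, and using each file's base name as the row value is an assumption; the part names and JSON keys are taken from the examples above.

```python
import json
import os
from contextlib import ExitStack


def build_media_metadata(file_names, field="input"):
    """Build the 'scoreMediaRequest' part that refers to the uploaded files."""
    body = json.dumps({
        "fields": [field],
        "media_fields": [field],
        "rows": [[name] for name in file_names],
    })
    return ("scoreMediaRequest", (None, body, "application/json"))


def open_media_parts(stack, paths, content_type):
    """Open each file in binary mode and register it for closing on the stack."""
    return [
        ("files", (os.path.basename(path), stack.enter_context(open(path, "rb")), content_type))
        for path in paths
    ]


def score_media(media_score_url, paths, content_type):
    """Post the files and metadata, then close every file handle."""
    import requests  # third-party; imported here so the helpers above need only the stdlib

    with ExitStack() as stack:
        files = open_media_parts(stack, paths, content_type)
        files.append(build_media_metadata([os.path.basename(p) for p in paths]))
        return requests.post(media_score_url, files=files)
```

A call might look like `score_media(media_score_url, ["a.ogg", "b.ogg"], "application/octet-stream")`, where `media_score_url` is derived from the scoring URL as in the examples above.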

vLLM Configuration

Create a new directory called artifacts.
Create a vllm.json file within the artifacts directory. Don't keep any other files in the directory.

Sample vllm.json file content:

{
	"model": "mistralai/Mistral-7B-Instruct-v0.2"
}

Create a zip file of the artifacts directory.
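
The steps above can be sketched in Python as follows. This is an illustration under stated assumptions: `package_vllm_config` is a hypothetical helper, and placing `vllm.json` under an `artifacts/` folder inside the zip (rather than at the archive root) is an assumption about the expected layout that you should verify against your deployment.

```python
import json
import shutil
from pathlib import Path


def package_vllm_config(work_dir, model):
    """Create artifacts/vllm.json under work_dir and zip the artifacts directory.

    Returns the path of the created zip file.
    """
    work_dir = Path(work_dir)
    artifacts = work_dir / "artifacts"
    # Fail if the directory already exists, so that it holds only vllm.json.
    artifacts.mkdir(parents=True, exist_ok=False)
    (artifacts / "vllm.json").write_text(json.dumps({"model": model}, indent=2))

    # Writes <work_dir>/artifacts.zip containing artifacts/vllm.json.
    return shutil.make_archive(
        str(work_dir / "artifacts"), "zip", root_dir=str(work_dir), base_dir="artifacts"
    )
```

For example, `package_vllm_config(".", "mistralai/Mistral-7B-Instruct-v0.2")` writes `artifacts.zip` in the current directory.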
