Version: v0.66.1

Requesting Contributions

This page shows how to request feature contributions alongside predictions when scoring with the h2o_mlops_scoring_client.

Prerequisites

Setup

Install the h2o_mlops_scoring_client with pip.

Notes

  • If running locally, the number of cores used (and thus parallel processes) can be overridden with:
num_cores = 10
h2o_mlops_scoring_client.spark_master = f"local[{num_cores}]"
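
The override above can be wrapped in a small helper. This is a minimal sketch, not part of the client: `local_spark_master` is a hypothetical name, and falling back to `os.cpu_count()` when no value is given is an assumption about what a sensible default might be.

```python
import os


def local_spark_master(num_cores=None):
    """Build a Spark master string of the form "local[N]".

    Hypothetical helper: uses the machine's CPU count when num_cores is omitted.
    """
    if num_cores is None:
        # os.cpu_count() can return None on some platforms, so fall back to 1.
        num_cores = os.cpu_count() or 1
    if num_cores < 1:
        raise ValueError("num_cores must be a positive integer")
    return f"local[{num_cores}]"


print(local_spark_master(10))

# With the client imported, the result would be assigned as in the note above:
# h2o_mlops_scoring_client.spark_master = local_spark_master(10)
```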

Example Usage for Data Frames

import h2o_mlops_scoring_client
import pandas

Choose a feature type for contributions - either original or transformed.

CONTRIB_FEATURE_TYPE = h2o_mlops_scoring_client.FeatureType.ORIGINAL

Pass the feature type when scoring.

ID_COLUMN = "ID"
MLOPS_ENDPOINT_URL = "https://model.internal.dedicated.h2o.ai/65427177-dd10-44dd-abf8-76ab29f60799/model/score"
DATA_FRAME = pandas.read_csv("/Users/jgranados/datasets/creditcard.csv")

h2o_mlops_scoring_client.score_data_frame(
    mlops_endpoint_url=MLOPS_ENDPOINT_URL,
    data_frame=DATA_FRAME,
    id_column=ID_COLUMN,
    request_contributions=CONTRIB_FEATURE_TYPE
)
23/08/24 16:08:28 INFO h2o_mlops_scoring_client: Connecting to H2O.ai MLOps scorer at 'https://model.internal.dedicated.h2o.ai/65427177-dd10-44dd-abf8-76ab29f60799/model/score'
23/08/24 16:08:28 INFO h2o_mlops_scoring_client: Starting scoring data frame
23/08/24 16:09:04 INFO h2o_mlops_scoring_client: Scoring complete
23/08/24 16:09:04 INFO h2o_mlops_scoring_client: Total run time: 0:00:36
23/08/24 16:09:04 INFO h2o_mlops_scoring_client: Scoring run time: 0:00:36
| | ID | default payment next month.0 | default payment next month.1 | contrib_LIMIT_BAL | contrib_MARRIAGE | contrib_AGE | contrib_PAY_0 | contrib_PAY_2 | contrib_PAY_3 | contrib_PAY_4 | ... | contrib_BILL_AMT4 | contrib_BILL_AMT5 | contrib_BILL_AMT6 | contrib_PAY_AMT1 | contrib_PAY_AMT2 | contrib_PAY_AMT3 | contrib_PAY_AMT4 | contrib_PAY_AMT5 | contrib_PAY_AMT6 | contrib_bias |
| 0 | 1.0 | 0.548391 | 0.451609 | 0.318705 | 0.054260 | 0.024333 | -0.143527 | 0.199437 | 0.123991 | 0.001567 | ... | -0.042873 | 0.004312 | -0.044986 | 0.190008 | 0.116479 | 0.196877 | 0.107893 | 0.060028 | 0.063311 | -1.516447 |
| 1 | 2.0 | 0.559799 | 0.440201 | 0.011380 | -0.044559 | -0.011563 | 0.017800 | 0.353959 | 0.258357 | -0.009022 | ... | 0.002973 | -0.003932 | -0.034921 | 0.184942 | 0.127812 | -0.081676 | -0.010390 | 0.062603 | -0.015993 | -1.516447 |
| 2 | 3.0 | 0.913488 | 0.086512 | 0.091553 | -0.073675 | -0.014241 | -0.231872 | -0.036224 | -0.033809 | -0.019079 | ... | -0.013316 | -0.017721 | -0.021681 | 0.091626 | 0.125422 | -0.108036 | -0.012150 | -0.012263 | -0.018186 | -1.516447 |
| 3 | 4.0 | 0.796132 | 0.203868 | 0.267449 | 0.065499 | 0.019306 | 0.061837 | -0.055391 | -0.033248 | -0.022298 | ... | -0.013313 | -0.033297 | 0.000984 | 0.068004 | -0.016718 | -0.073336 | -0.014324 | -0.039674 | -0.041868 | -1.516447 |
| 4 | 5.0 | 0.601541 | 0.398459 | 0.170616 | 0.058417 | -0.019828 | 0.940433 | -0.060075 | -0.303138 | -0.018102 | ... | -0.004610 | 0.009174 | -0.005628 | 0.064400 | -0.391913 | -0.065873 | -0.021920 | 0.036659 | 0.023616 | -1.516447 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 994 | 23995.0 | 0.866306 | 0.133694 | 0.478230 | -0.091026 | 0.014865 | -0.215355 | -0.036565 | -0.026022 | -0.020773 | ... | -0.000289 | -0.021694 | 0.008951 | 0.083203 | -0.071253 | -0.044852 | -0.022899 | -0.028537 | 0.018177 | -1.516447 |
| 995 | 23996.0 | 0.608550 | 0.391450 | 0.087377 | 0.044217 | -0.015629 | 0.014538 | 0.205578 | 0.199663 | -0.011091 | ... | 0.021188 | -0.020175 | 0.056376 | -0.027547 | 0.197609 | -0.037477 | -0.051646 | -0.051575 | 0.029902 | -1.516447 |
| 996 | 23997.0 | 0.814939 | 0.185061 | 0.536682 | 0.099411 | 0.017427 | -0.208787 | -0.027522 | -0.022646 | -0.023438 | ... | 0.002923 | -0.005027 | -0.000314 | 0.081016 | -0.031194 | -0.071644 | -0.015263 | 0.059949 | -0.030882 | -1.516447 |
| 997 | 23998.0 | 0.871589 | 0.128411 | 0.481494 | -0.084286 | -0.029002 | -0.224106 | -0.039429 | -0.028018 | -0.022154 | ... | 0.005962 | -0.013858 | -0.035625 | 0.069608 | -0.038320 | -0.066714 | 0.018526 | -0.049960 | -0.024566 | -1.516447 |
| 998 | 23999.0 | 0.868578 | 0.131422 | 0.461986 | -0.083266 | -0.035072 | -0.229247 | -0.036450 | -0.013597 | -0.024662 | ... | 0.012886 | 0.010888 | -0.066466 | 0.029243 | 0.109742 | -0.081972 | -0.028465 | 0.052664 | 0.036372 | -1.516447 |

23999 rows × 25 columns
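
Once contributions are returned, a common next step is to see which features moved a given prediction the most. The sketch below is one way to do that, assuming each row is available as a dictionary (for example via pandas `DataFrame.to_dict("records")`). `top_contributions` is a hypothetical helper, not part of the client, and the sample rows reuse a few columns and values from the output above.

```python
def top_contributions(row, n=2):
    """Return the n (column, value) pairs with the largest absolute contribution.

    Hypothetical helper: considers only columns prefixed with "contrib_"
    and leaves out the "contrib_bias" column.
    """
    contribs = {
        name: value
        for name, value in row.items()
        if name.startswith("contrib_") and name != "contrib_bias"
    }
    # Rank by magnitude so strongly negative contributions are not overlooked.
    return sorted(contribs.items(), key=lambda item: abs(item[1]), reverse=True)[:n]


# A subset of the columns for the rows with ID 1.0 and ID 5.0 shown above.
rows = [
    {"ID": 1.0, "contrib_LIMIT_BAL": 0.318705, "contrib_MARRIAGE": 0.054260,
     "contrib_AGE": 0.024333, "contrib_PAY_0": -0.143527, "contrib_bias": -1.516447},
    {"ID": 5.0, "contrib_LIMIT_BAL": 0.170616, "contrib_MARRIAGE": 0.058417,
     "contrib_AGE": -0.019828, "contrib_PAY_0": 0.940433, "contrib_bias": -1.516447},
]

for row in rows:
    print(row["ID"], top_contributions(row))
```

Ranking by absolute value is a design choice made here; ranking by signed value would instead surface only the features pushing the score in one direction.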

Example Usage for Source/Sink

import h2o_mlops_scoring_client

Choose a feature type for contributions - either original or transformed.

CONTRIB_FEATURE_TYPE = h2o_mlops_scoring_client.FeatureType.TRANSFORMED

Pass the feature type when scoring.

ID_COLUMN = "ID"
MLOPS_ENDPOINT_URL = "https://model.internal.dedicated.h2o.ai/65427177-dd10-44dd-abf8-76ab29f60799/model/score"
SOURCE_DATA = "file:///Users/jgranados/datasets/creditcard.csv"
SINK_LOCATION = "file:///Users/jgranados/datasets/output/"
SOURCE_FORMAT = h2o_mlops_scoring_client.Format.CSV
SINK_FORMAT = h2o_mlops_scoring_client.Format.CSV
SINK_WRITE_MODE = h2o_mlops_scoring_client.WriteMode.OVERWRITE

def preprocess(spark_df):
    # Split the input into 30 partitions before scoring.
    return spark_df.repartition(30)

h2o_mlops_scoring_client.score_source_sink(
    mlops_endpoint_url=MLOPS_ENDPOINT_URL,
    id_column=ID_COLUMN,
    source_data=SOURCE_DATA,
    source_format=SOURCE_FORMAT,
    sink_location=SINK_LOCATION,
    sink_format=SINK_FORMAT,
    sink_write_mode=SINK_WRITE_MODE,
    preprocess_method=preprocess,
    request_contributions=CONTRIB_FEATURE_TYPE
)
23/08/24 16:09:42 INFO h2o_mlops_scoring_client: Connecting to H2O.ai MLOps scorer at 'https://model.internal.dedicated.h2o.ai/65427177-dd10-44dd-abf8-76ab29f60799/model/score'
23/08/24 16:09:45 INFO h2o_mlops_scoring_client: Applying preprocess method
23/08/24 16:09:45 INFO h2o_mlops_scoring_client: Starting scoring from 'file:///Users/jgranados/datasets/creditcard.csv' to 'file:///Users/jgranados/datasets/output/'
23/08/24 16:10:16 INFO h2o_mlops_scoring_client: Scoring complete
23/08/24 16:10:16 INFO h2o_mlops_scoring_client: Total run time: 0:00:36
23/08/24 16:10:16 INFO h2o_mlops_scoring_client: Scoring run time: 0:00:31
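
The preprocess method in the example hard-codes 30 partitions. If the partition count needs to vary between runs, one option is to build the callable from a parameter. This is a minimal sketch: `make_preprocess` is a hypothetical helper, and it assumes only that the object passed in exposes a `repartition` method, as a Spark DataFrame does.

```python
def make_preprocess(num_partitions):
    """Build a preprocess callable that repartitions its input.

    Hypothetical helper: the returned function has the same shape as the
    preprocess method in the example above.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be a positive integer")

    def preprocess(spark_df):
        # repartition returns a new data frame with the requested partition count.
        return spark_df.repartition(num_partitions)

    return preprocess


# Equivalent to the preprocess method defined in the example.
preprocess = make_preprocess(30)
```

The resulting `preprocess` would then be passed as `preprocess_method`, as in the call above.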
