Scoring Pandas Data Frames
Score a data frame in parallel, in mini-batches, against an MLOps deployment.
Prerequisites
h2o_mlops_scoring_client-*-py3-none-any.whl file or access to the Python Package Index (PyPI)
Setup
Install the h2o_mlops_scoring_client with pip.
Example Usage
- Import libraries:
import h2o_mlops_scoring_client
import pandas
- Choose the MLOps scoring endpoint:
MLOPS_ENDPOINT_URL = "https://model.internal.dedicated.h2o.ai/d4d36117-c94a-4182-8b75-5f5abbd1c28b/model/score"
- Load a data frame to score and name the unique ID column used to identify each score:
DATA_FRAME = pandas.read_csv("/Users/jgranados/datasets/BNPParibas.csv")
ID_COLUMN = "ID"
- Score the data frame.
Description of arguments for scoring:
mlops_endpoint_url: MLOps deployment scoring endpoint URL.
id_column: Name of the ID column in the data to be scored. The column must contain unique row identifiers.
data_frame: Pandas or Spark data frame.
cpus: Number of CPU cores to use for scoring Pandas data frames. For best performance with Java MOJO deployments, set cpus to four times the number of deployment replicas. For any type of deployment, setting cpus to more than four times the number of replicas will likely not bring additional benefit unless model monitoring is disabled. For slow-scoring deployments like the DAI Python scoring pipeline, setting cpus to less than four times the number of replicas may increase throughput.
pandas_df = h2o_mlops_scoring_client.score_data_frame(
    mlops_endpoint_url=MLOPS_ENDPOINT_URL,
    id_column=ID_COLUMN,
    data_frame=DATA_FRAME,
)
23/08/21 14:23:58 INFO h2o_mlops_scoring_client: Connecting to H2O.ai MLOps scorer at 'https://model.internal.dedicated.h2o.ai/d4d36117-c94a-4182-8b75-5f5abbd1c28b/model/score'
23/08/21 14:23:59 INFO h2o_mlops_scoring_client: Starting scoring data frame
23/08/21 14:25:11 INFO h2o_mlops_scoring_client: Scoring complete
23/08/21 14:25:11 INFO h2o_mlops_scoring_client: Total run time: 0:01:13
23/08/21 14:25:11 INFO h2o_mlops_scoring_client: Scoring run time: 0:01:12
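The guidance for the cpus argument (about four times the number of deployment replicas) can be captured in a small helper. This is only a sketch: suggested_cpus, per_replica, and max_cores are hypothetical names that are not part of h2o_mlops_scoring_client, and the optional cap is an assumption for machines with fewer cores than the rule of thumb suggests.

```python
from typing import Optional


def suggested_cpus(
    replicas: int,
    per_replica: int = 4,
    max_cores: Optional[int] = None,
) -> int:
    """Return a starting value for the `cpus` argument.

    Hypothetical helper, not part of h2o_mlops_scoring_client.
    Applies the rule of thumb of `per_replica` cores per deployment
    replica, optionally capped at `max_cores`.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if per_replica < 1:
        raise ValueError("per_replica must be at least 1")

    cpus = replicas * per_replica
    if max_cores is not None:
        # Never suggest more cores than the caller says are available.
        cpus = min(cpus, max_cores)
    return max(1, cpus)


# Java MOJO deployment with 2 replicas: start at 4 x 2 cores.
mojo_cpus = suggested_cpus(replicas=2)

# Slow-scoring deployment: try fewer cores per replica.
slow_cpus = suggested_cpus(replicas=2, per_replica=2)
```

The result would then be passed as cpus=mojo_cpus alongside the other arguments to score_data_frame. Treat the value as a starting point and measure throughput, since the best setting depends on the deployment type and on whether model monitoring is enabled.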
- Optionally merge the scores into the original data frame.
DATA_FRAME.merge(pandas_df, on=ID_COLUMN)
| | ID | target | v1 | v2 | v3 | v4 | v5 | v6 | v7 | v8 | ... | v124 | v125 | v126 | v127 | v128 | v129 | v130 | v131 | target.0 | target.1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3 | 1 | 1.335739 | 8.727474 | C | 3.921026 | 7.915266 | 2.599278 | 3.176895 | 0.012941 | ... | 0.035754 | AU | 1.804126 | 3.113719 | 2.024285 | 0 | 0.636365 | 2.857144 | 0.116770 | 0.883230 |
1 | 4 | 1 | NaN | NaN | C | NaN | 9.191265 | NaN | NaN | 2.301630 | ... | 0.598896 | AF | NaN | NaN | 1.957825 | 0 | NaN | NaN | 0.298435 | 0.701565 |
2 | 5 | 1 | 0.943877 | 5.310079 | C | 4.410969 | 5.326159 | 3.979592 | 3.928571 | 0.019645 | ... | 0.013452 | AE | 1.773709 | 3.922193 | 1.120468 | 2 | 0.883118 | 1.176472 | 0.154390 | 0.845610 |
3 | 6 | 1 | 0.797415 | 8.304757 | C | 4.225930 | 11.627438 | 2.097700 | 1.987549 | 0.171947 | ... | 0.002267 | CJ | 1.415230 | 2.954381 | 1.990847 | 1 | 1.677108 | 1.034483 | 0.042505 | 0.957495 |
4 | 8 | 1 | NaN | NaN | C | NaN | NaN | NaN | NaN | NaN | ... | NaN | Z | NaN | NaN | NaN | 0 | NaN | NaN | 0.057625 | 0.942375 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
114316 | 228708 | 1 | NaN | NaN | C | NaN | NaN | NaN | NaN | NaN | ... | NaN | AL | NaN | NaN | NaN | 0 | NaN | NaN | 0.108165 | 0.891835 |
114317 | 228710 | 1 | NaN | NaN | C | NaN | NaN | NaN | NaN | NaN | ... | NaN | E | NaN | NaN | NaN | 1 | NaN | NaN | 0.038374 | 0.961626 |
114318 | 228711 | 1 | NaN | NaN | C | NaN | 10.069277 | NaN | NaN | 0.323324 | ... | 0.156764 | Q | NaN | NaN | 2.417606 | 2 | NaN | NaN | 0.053958 | 0.946042 |
114319 | 228712 | 1 | NaN | NaN | C | NaN | 10.106144 | NaN | NaN | 0.309226 | ... | 0.490658 | BW | NaN | NaN | 3.526650 | 0 | NaN | NaN | 0.220766 | 0.779234 |
114320 | 228713 | 1 | 1.619763 | 7.932978 | C | 4.640085 | 8.473141 | 2.351470 | 2.826766 | 3.479754 | ... | 3.135205 | V | 1.943149 | 4.385553 | 1.604493 | 0 | 1.787610 | 1.386138 | 0.129088 | 0.870912 |
114321 rows × 135 columns
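Because the merge relies on the ID column being unique in both frames, it can help to have pandas enforce that. The following is a minimal sketch on made-up data: the features and scores frames and their values are illustrative stand-ins, not output from a real deployment. It uses the validate option of DataFrame.merge to fail fast on duplicate IDs and a left join to keep every input row.

```python
import pandas

# Illustrative stand-ins for the input data and the returned scores.
features = pandas.DataFrame(
    {"ID": [3, 4, 5], "v1": [1.335739, None, 0.943877]}
)
scores = pandas.DataFrame(
    {
        "ID": [5, 3, 4],  # scores may come back in a different order
        "target.0": [0.15, 0.12, 0.30],
        "target.1": [0.85, 0.88, 0.70],
    }
)

# validate="one_to_one" raises pandas.errors.MergeError if either
# frame has duplicate IDs; how="left" keeps every input row.
merged = features.merge(scores, on="ID", how="left", validate="one_to_one")

# Rows without a score show up as NaN in the score columns.
unscored_rows = int(merged["target.1"].isna().sum())
```

If unscored_rows is greater than zero, some input rows did not receive a score and may be worth inspecting before the merged frame is used downstream.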