Tutorial 1B: Batch scoring with the Python client
This tutorial shows how to use the H2O eScorer Python client to run the Batch Scorer. Batch scoring is built to read, score, and write large datasets from storage such as AWS S3. In this tutorial, we use a properties file with the eScorer Python client to score a .csv dataset from an AWS S3 bucket and write the results back to the same bucket.
Install the Python client
- You can download the H2O eScorer Python client wheel from the Python client tab in the H2O eScorer downloads page.
- In your Python environment, run the following command to install the package and its dependencies:
pip install <python-client-wheel-name>
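After installing, it can be useful to confirm that the package is visible to the interpreter you plan to use. The following is a minimal sketch, assuming the wheel installs an importable module named h2o_escorer (the name used in the import later in this tutorial). The helper is_installed is a hypothetical name and is not part of the client.

```python
import importlib.util


def is_installed(module_name="h2o_escorer"):
    """Return True if the module can be located in the current environment."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if not is_installed():
    print("h2o_escorer was not found; check that the wheel was installed "
          "into this Python environment.")
```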
Authentication
The H2O eScorer environment variables for authentication only need to be set once. The client then uses them automatically for as many runs as you want.
For more information about authenticating the Python client, see Python client overview: Authentication.
In the shell where you run Python, run the following to set the environment variables for the BE service and Keycloak service account:
export HAIC_ESCORER_URL="https://rest..."
export HAIC_AUTH_URL="https://auth..."
export HAIC_ESCORER_URL="..."
export HAIC_ESCORER_URL="..."
export HAIC_ESCORER_URL="..."
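Before creating the client, you may want to verify that the variables are actually present in the Python process. The sketch below checks only the two variable names that this tutorial spells out (HAIC_ESCORER_URL and HAIC_AUTH_URL). The helper missing_env_vars is a hypothetical name and is not part of the client, and any further variables your deployment requires would need to be added to the tuple.

```python
import os

# Only the variable names spelled out in this tutorial are checked here.
REQUIRED_VARS = ("HAIC_ESCORER_URL", "HAIC_AUTH_URL")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Set these variables before creating the client:", ", ".join(missing))
```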
Score
In H2O eScorer, batch scoring can be performed easily with the properties file.
For information on how to autogenerate and populate a properties file to configure batch scoring, see Batch scoring configuration and usage.
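import h2o_escorer

# The client uses the authentication environment variables set earlier.
client = h2o_escorer.Client()

# Top-level await only works in an async context, such as a notebook cell.
response = await client.batch_scorer(
    model_name="riskmodel.mojo",
    properties_filepath="s3_scorer.properties",
)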
The batch_scorer method takes the model name and the properties file path as arguments. The properties file contains the configuration for the batch scoring job. A Python dictionary is returned as a response.
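Because batch_scorer is awaited, calling it from a plain Python script needs an event loop. The following is a minimal sketch of one way to do that with asyncio.run, assuming batch_scorer is a coroutine that accepts the two keyword arguments shown above. The helpers score_batch and score_batch_sync are hypothetical names and are not part of the client.

```python
import asyncio


async def score_batch(client, model_name, properties_filepath):
    """Await the client's batch_scorer call and return its response dictionary."""
    return await client.batch_scorer(
        model_name=model_name,
        properties_filepath=properties_filepath,
    )


def score_batch_sync(client, model_name, properties_filepath):
    """Blocking wrapper for scripts where no event loop is already running."""
    return asyncio.run(score_batch(client, model_name, properties_filepath))


# Usage in a script, assuming the h2o_escorer package is installed:
#   import h2o_escorer
#   response = score_batch_sync(
#       h2o_escorer.Client(), "riskmodel.mojo", "s3_scorer.properties"
#   )
```

In a notebook, where an event loop is already running, await score_batch(...) directly instead of using the blocking wrapper.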
The logs can be obtained by accessing the result key of the response object.
response["result"]
To get the time elapsed for the batch scoring job, access the time_elapsed key of the response object.
response["time_elapsed"]
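The two keys can be combined into a short report after a run. This is a minimal sketch that assumes only what is stated above: the response is a dictionary with result and time_elapsed keys. The value types are not specified here, so the sketch converts them to text without interpreting them. The helper summarize_response is a hypothetical name.

```python
def summarize_response(response):
    """Return a short text summary of a batch scoring response dictionary."""
    logs = response.get("result")
    elapsed = response.get("time_elapsed")
    lines = [f"time elapsed: {elapsed}"]
    if logs is None:
        lines.append("no logs returned")
    else:
        lines.append("logs:")
        # The log format is not assumed; it is shown as plain text.
        lines.append(str(logs))
    return "\n".join(lines)
```

For example, print(summarize_response(response)) prints the elapsed time followed by the logs.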
ModelStats
The H2O eScorer Wave app provides a live dashboard that shows ModelStats in real time as models are scored in batch.