
Batch scoring

This example demonstrates how to use the eScorer Python client for batch scoring. You can download the complete example here.

Import the H2O eScorer client

import h2o_escorer 
import asyncio

Async function main to authenticate and score

The async main function creates and authenticates a client, submits a batch scoring job to H2O eScorer, and prints the job ID. It then polls the job every two seconds until it completes, retrieves the job logs, closes the client connection, and returns the logs.

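async def main():
    # Create the eScorer client (Client is referenced through the imported module).
    client = h2o_escorer.Client()

    # Submit the batch scoring job.
    batch_job = await client.batch_scorer(
        model_name='<model_name>',
        properties_filepath='/path/to/file.properties',
        verbose=True,
    )

    print(f'Batch job ID: {batch_job.id}')

    # Poll every two seconds until the job completes.
    while not await batch_job.is_complete():
        await asyncio.sleep(2)

    job_logs = await batch_job.get_logs()

    await client.close()

    return job_logs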
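
The polling loop in main waits indefinitely if the job never reports completion. One possible way to bound the wait is sketched below using asyncio.wait_for from the standard library. The helper name wait_for_job, its parameters, and the FakeBatchJob stand-in are hypothetical and are not part of the eScorer client; the only assumption about the real job object is the awaitable is_complete() method shown above.

```python
import asyncio


class FakeBatchJob:
    """Hypothetical stand-in for a batch job; completes after a set number of polls."""

    def __init__(self, polls_until_done: int) -> None:
        self._remaining = polls_until_done

    async def is_complete(self) -> bool:
        self._remaining -= 1
        return self._remaining <= 0


async def wait_for_job(job, poll_interval: float = 2.0, timeout: float = 600.0) -> None:
    """Poll job.is_complete() until it returns True or the timeout elapses."""

    async def _poll() -> None:
        while not await job.is_complete():
            await asyncio.sleep(poll_interval)

    # On expiry, wait_for cancels the polling coroutine and raises asyncio.TimeoutError.
    await asyncio.wait_for(_poll(), timeout=timeout)
```

Inside main, the while loop could then be replaced with a call such as await wait_for_job(batch_job, timeout=600), with the caller deciding how to handle asyncio.TimeoutError.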

Run the async main function

logs = asyncio.run(main())
Batch job ID: 09e5042c-096a-4ee7-9bd6-182ef7457298

Get the logs

for line in logs:
    print(line)
****** AWS *******
Thread-3 BAD DATA Too Many Features ROW: = 31487,3500, 36 months,7.74,109.27,9,31200,"Not,Verified",MN,5.73,,1092,9.9,12 Len:15 Features:13 Offset: 1
Thread-3 Model has 13 features but after parsing feature count is 15 check field seperator or maybe the data contains a default seperator.
Total selected rows 39029 Total Read time (ms) 16056
Thread-1 Rows Read 19644 Scored 19644 Error 0 Queue Empty true
Thread-3 Rows Read 19385 Scored 19384 Error 1 Queue Empty true
Upload of file escorer/predictions-2024-05-28-07-51-58.csv to S3 completed
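
The log output above can also be checked programmatically. The sketch below is based only on the line formats visible in this sample output, which are not a documented contract and may differ between eScorer versions. The helper names summarize and field_counts are hypothetical. The first helper totals the per-thread summary lines; the second compares a naive comma split of a row with a quote-aware parse, which is one plausible reading of why the rejected row reports Len:15 (the quoted value "Not,Verified" contains a comma).

```python
import csv
import re

# Matches per-thread summary lines such as
# "Thread-1 Rows Read 19644 Scored 19644 Error 0 Queue Empty true".
SUMMARY_RE = re.compile(
    r"^Thread-\d+ Rows Read (?P<read>\d+) "
    r"Scored (?P<scored>\d+) Error (?P<errors>\d+)"
)


def summarize(log_lines):
    """Sum rows read, scored, and failed across all per-thread summary lines."""
    totals = {"read": 0, "scored": 0, "errors": 0}
    for line in log_lines:
        match = SUMMARY_RE.match(line.strip())
        if match:
            for key in totals:
                totals[key] += int(match.group(key))
    return totals


def field_counts(row):
    """Return (naive, quote_aware) field counts for one comma-separated row."""
    naive = len(row.split(","))
    quote_aware = len(next(csv.reader([row])))
    return naive, quote_aware
```

Applied to the sample output, summarize reports 39029 rows read, 39028 scored, and 1 error, which agrees with the "Total selected rows 39029" line. For the rejected row, field_counts returns 15 for the naive split and 14 for the quote-aware parse.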
