
Deploying MOJOs to SageMaker with H2O eScorer Standalone

Overview

This guide shows how to deploy an H2O MOJO model to Amazon SageMaker using H2O eScorer Standalone. You will build a Docker image, deploy it as a SageMaker endpoint, and send prediction requests from a notebook or from Amazon S3.

At a high level, you complete the following tasks:

  1. Package H2O eScorer Standalone into a container image.
  2. Upload your MOJO model artifact to Amazon S3.
  3. Create a SageMaker model and endpoint that use the image and model.
  4. Invoke the endpoint to get predictions.

Before you begin

Before you start, make sure you have:

  • An H2O MOJO model file (for example, pipeline.mojo) that is packaged as a .tar.gz archive and uploaded to Amazon S3. A packaging sketch follows this list.
  • Access to an AWS account with permission to use Amazon SageMaker, Amazon S3, AWS Secrets Manager, and Amazon ECR.
  • A SageMaker notebook instance in the same AWS Region as your S3 bucket and ECR repository.
  • Docker and the AWS CLI installed and configured on your local machine to build and push the container image.
  • A valid Driverless AI license key stored in AWS Secrets Manager.
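
If you still need to package the MOJO and store the license key, the following sketch shows one way to do it with the AWS CLI. The bucket, prefix, and secret name match the examples used later in this guide; adjust them for your environment.

# Package the MOJO the way SageMaker expects: a gzipped tar archive.
tar -czf riskmodel.tar.gz pipeline.mojo

# Upload the archive to S3 (bucket and prefix are examples; use your own).
aws s3 cp riskmodel.tar.gz s3://h2o-escorer/models/v1/riskmodel.tar.gz

# Store the Driverless AI license key in AWS Secrets Manager.
aws secretsmanager create-secret \
  --name h2o-model-dai-license \
  --secret-string "$(cat license.sig)"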

If you use AWS managed policies, you can start with the following:

  • AmazonSageMakerFullAccess: Grants permissions for SageMaker operations.
  • AmazonS3FullAccess: Grants permissions for reading and writing data in S3.
  • AWSSecretsManagerReadOnlyAccess: Grants permissions for reading the license key from Secrets Manager.
info

Make sure that your IAM role has sufficient permissions to access SageMaker, S3, Secrets Manager, and ECR before you start creating and invoking endpoints.
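
As a hedged example, assuming your SageMaker execution role is named MySageMakerRole (a hypothetical name; substitute the role your notebook actually uses), you can attach the managed policies with the AWS CLI. The Secrets Manager read policy listed above can be attached the same way.

# Hypothetical role name; substitute the execution role your notebook uses.
ROLE_NAME=MySageMakerRole

aws iam attach-role-policy --role-name "$ROLE_NAME" \
  --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
aws iam attach-role-policy --role-name "$ROLE_NAME" \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess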

Step 1: Build and push the H2O eScorer image

In this step, you package H2O eScorer Standalone into a Docker image and push it to Amazon ECR so SageMaker can use it.

You can download the H2O eScorer Standalone JAR file from the downloads page.

Create the entrypoint script

The container entrypoint script checks that the license key is available, writes it to a file, validates the mounted model directory, and then starts H2O eScorer.

The entrypoint.sh script looks like this:

#!/bin/bash
set -e

SAGEMAKER_MOUNT_PATH="/opt/ml/model"
LICENSE_FILE="/app/license.sig"

echo "Starting container initialization..."

if [ -z "$DRIVERLESS_AI_LICENSE_KEY" ]; then
    echo "Error: DRIVERLESS_AI_LICENSE_KEY environment variable is missing."
    exit 1
fi

echo "Writing license key to $LICENSE_FILE..."
echo "$DRIVERLESS_AI_LICENSE_KEY" > "$LICENSE_FILE"

if [ ! -s "$LICENSE_FILE" ]; then
    echo "Error: Failed to write license file."
    exit 1
fi

if [ -z "$(ls -A "$SAGEMAKER_MOUNT_PATH")" ]; then
    echo "Warning: $SAGEMAKER_MOUNT_PATH appears empty."
else
    echo "Model files detected in $SAGEMAKER_MOUNT_PATH:"
    ls -l "$SAGEMAKER_MOUNT_PATH"
fi

echo "Starting H2O eScorer..."

exec java -Dpropertiesfilename=/app/H2OaiRestServer.properties \
    -XX:+UseContainerSupport \
    -Dai.h2o.mojos.runtime.license.filename="$LICENSE_FILE" \
    -DModelDirectory="$SAGEMAKER_MOUNT_PATH/" \
    -Dscoring.batch.size=${SCORING_BATCH_SIZE:-"1000"} \
    --add-opens java.base/java.lang=ALL-UNNAMED \
    -jar /app/escorer-standalone-*.jar \
    --spring.servlet.multipart.max-file-size=${MULTIPART_MAX_FILE_SIZE:-"2048MB"} \
    --spring.servlet.multipart.max-request-size=${MULTIPART_MAX_REQUEST_SIZE:-"2048MB"} \
    --server.tomcat.max-threads=4

Create the Dockerfile

The Dockerfile starts from an OpenJDK base image, copies the H2O eScorer JAR and configuration into the image, and sets the container entrypoint.

The Dockerfile looks like this:

FROM openjdk:26-ea-17-slim-trixie

WORKDIR /app

# Refresh the apt metadata, then remove the package lists to keep the image small.
RUN apt-get update && \
    rm -rf /var/lib/apt/lists/*

ENV HOME=/app

# Copy the eScorer JAR, the entrypoint script, and the server configuration.
COPY escorer-standalone-*.jar \
    entrypoint.sh \
    H2OaiRestServer.properties \
    /app/

RUN chmod +x /app/entrypoint.sh

# Writable directory for the MOJO runtime library cache.
ENV MOJO_LIBS_HOME=/app/.mojo-cache
RUN mkdir -p $MOJO_LIBS_HOME \
    && chmod 777 $MOJO_LIBS_HOME

EXPOSE 8080

ENTRYPOINT ["/app/entrypoint.sh"]

Build and push the image to AWS ECR

Build and push the image from the directory that contains the Dockerfile. Replace <aws-account-id> and <region> with your AWS account ID and Region. First, make sure the target repository exists and authenticate Docker with Amazon ECR:

aws ecr create-repository --repository-name h2o-escorer-sagemaker/scorer

aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <aws-account-id>.dkr.ecr.<region>.amazonaws.com

Then build, tag, and push the image:

docker build -f Dockerfile -t escorer-sagemaker .

docker tag escorer-sagemaker:latest <aws-account-id>.dkr.ecr.<region>.amazonaws.com/h2o-escorer-sagemaker/scorer:latest

docker push <aws-account-id>.dkr.ecr.<region>.amazonaws.com/h2o-escorer-sagemaker/scorer:latest
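
Optionally, you can smoke-test the container locally before deploying. This is a sketch under assumptions: it presumes a local ./model directory containing pipeline.mojo and a license.sig file in the working directory. The service listens on port 8080, as noted at the end of this guide.

# Run the container locally, mounting a model directory the way SageMaker does.
docker run --rm -p 8080:8080 \
  -e DRIVERLESS_AI_LICENSE_KEY="$(cat license.sig)" \
  -v "$(pwd)/model:/opt/ml/model" \
  escorer-sagemaker:latest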

Step 2: Set up SageMaker from a notebook

In this step, you use a SageMaker notebook to create the model, endpoint configuration, and endpoint. Ensure that the IAM role used by the notebook has permission to access SageMaker by attaching the policy AmazonSageMakerFullAccess.

Open a SageMaker notebook in the same Region as your S3 bucket and ECR repository. Then run the following cell to import libraries and create AWS clients:

import boto3
import sagemaker

role = sagemaker.get_execution_role()
region = boto3.Session().region_name

bucket = sagemaker.Session().default_bucket()

sm = boto3.client("sagemaker")
runtime = boto3.client("runtime.sagemaker")
s3 = boto3.client("s3")
secretsm = boto3.client("secretsmanager")
note

Ensure that the IAM role used by the notebook has read access to AWS Secrets Manager, for example by attaching the AWSSecretsManagerReadOnlyAccess policy listed in the prerequisites.

Next, define names and paths for the SageMaker resources and the model file in S3:

model_name = "h2o-model-riskmodel-mojo"
endpoint_config = "h2o-endpoint-config-riskmodel-mojo"
endpoint_name = "h2o-model-endpoint-riskmodel-mojo"
license_secret_id = "h2o-model-dai-license"
model_s3_url = "s3://h2o-escorer/models/v1/riskmodel.tar.gz"

The following variables are used in the example:

  • model_name: The SageMaker model name.
  • endpoint_config: The name of the endpoint configuration.
  • endpoint_name: The name of the SageMaker endpoint you call for predictions.
  • license_secret_id: The name of the secret in AWS Secrets Manager that stores your Driverless AI license key.
  • model_s3_url: The S3 URI where the packaged MOJO model is stored.
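
As an optional sanity check (a sketch, not part of the original flow), you can confirm that the notebook role can read the license secret before creating any SageMaker resources:

# Verify that the license secret is readable; this fails fast if permissions are missing.
secret = secretsm.get_secret_value(SecretId=license_secret_id)
assert secret["SecretString"], "License secret is empty"
print("License secret retrieved successfully")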

Step 3: Create the SageMaker model and endpoint

When you deploy the model, you can control how H2O eScorer formats the output by setting environment variables:

  • DRIVERLESS_AI_LICENSE_KEY (required): The license key used by H2O eScorer. In this guide, it is read from AWS Secrets Manager.
  • H2OAI_BATCH_SHAPLEY (optional): When set to true, includes Shapley values (feature attributions) in CSV output. Defaults to false.
  • H2OAI_BATCH_HEADER (optional): When set to true, includes a header row with column names in CSV output. Defaults to false.
  • H2OAI_BATCH_OUTPUTALL (optional): When set to true, returns all model outputs instead of only the primary prediction. Defaults to false.
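
For example, to get Shapley values and a header row in CSV output, you could set the optional variables in the container definition like this (a sketch only; the examples in this guide keep all three at their defaults):

# Example overrides for the "Environment" block shown below (not used later).
environment = {
    "H2OAI_BATCH_SHAPLEY": "true",    # include Shapley feature attributions
    "H2OAI_BATCH_HEADER": "true",     # include a CSV header row
    "H2OAI_BATCH_OUTPUTALL": "false"  # return only the primary prediction
}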

The following container definition tells SageMaker which image to run, where to find the model in S3, and which environment variables to set. Replace <aws-account-id> and <region> with your actual AWS account ID and region:

hosting_container = {
    "Image": "<aws-account-id>.dkr.ecr.<region>.amazonaws.com/h2o-escorer-sagemaker/scorer:latest",
    "ModelDataUrl": model_s3_url,
    "Environment": {
        "DRIVERLESS_AI_LICENSE_KEY": secretsm.get_secret_value(SecretId=license_secret_id)["SecretString"],
        "H2OAI_BATCH_HEADER": "false",
        "H2OAI_BATCH_OUTPUTALL": "false",
        "H2OAI_BATCH_SHAPLEY": "false",
    },
}

create_model_response = sm.create_model(
    ModelName=model_name,
    ExecutionRoleArn=role,
    PrimaryContainer=hosting_container,
)

create_model_response["ModelArn"]

This call creates the SageMaker model and returns the model's Amazon Resource Name (ARN).

Next, define the endpoint configuration. Here you choose the instance type and the number of instances. For instance types and pricing, see the Amazon SageMaker pricing page.

create_endpoint_config_response = sm.create_endpoint_config(
    EndpointConfigName=endpoint_config,
    ProductionVariants=[
        {
            "InstanceType": "ml.m4.xlarge",
            "InitialInstanceCount": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }
    ],
)
print(create_endpoint_config_response["EndpointConfigArn"])

This configuration is then used to create the endpoint.

create_endpoint_response = sm.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config
)
print(f"Creating endpoint: {create_endpoint_response['EndpointArn']}")

sm.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)

resp = sm.describe_endpoint(EndpointName=endpoint_name)
print(f"Endpoint ARN: {resp['EndpointArn']}")
print(f"Status: {resp['EndpointStatus']}")

if resp["EndpointStatus"] != "InService":
    raise Exception("Endpoint creation did not succeed")
tip

If the status is not InService, check the SageMaker console for error messages and verify that the role used by the notebook can access SageMaker, S3, ECR, and Secrets Manager.
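
If the endpoint ends up in the Failed state, describe_endpoint also returns a FailureReason field you can inspect directly from the notebook (a small sketch):

# Print the failure reason reported by SageMaker, if any.
resp = sm.describe_endpoint(EndpointName=endpoint_name)
if resp["EndpointStatus"] == "Failed":
    print(f"Failure reason: {resp.get('FailureReason')}")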

Step 4: Send prediction requests to the endpoint

You can score the model in two ways: send requests directly from a SageMaker notebook, or read input data from Amazon S3 and write predictions back to S3.

Score data from a SageMaker notebook

To quickly test the endpoint, send a single row as a JSON request:

import json

data = {
    "name": "pipeline.mojo",
    "explainability": "false",
    "data": "5000,36 months,10.65,162.87,10,35000,VERIFIED - income,FL,13.92,,3418,30.5,23"
}

response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType='application/json;charset=UTF-8',
    Body=json.dumps(data)
)

print(json.dumps(json.loads(response['Body'].read().decode()), indent=4))

To score multiple rows at once, send them as a list of rows in the JSON body:

data = {
    "name": "pipeline.mojo",
    "explainability": "false",
    "data": {
        "rows": [
            "5000,36 months,10.65,162.87,10,24000,VERIFIED - income,AZ,27.65,2,13648,83.7,9",
            "2500,60 months,15.27,59.83,0,30000,verified,GA,1,,1687,9.4,4",
            "2400,36 months,15.96,84.33,10,12252,not verified,IL,8.72,,2956,98.5,10",
            "10000,36 months,13.49,339.31,10,49200,verified,CA,20,,5598,21,3",
        ]
    }
}

response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType='application/json;charset=UTF-8',
    Body=json.dumps(data)
)

print(json.dumps(json.loads(response['Body'].read().decode()), indent=4))
info

If your data is stored in a CSV file, you can read it in the notebook and send it as CSV.

import csv

csv_file_path = "LC_DAI_Small.csv"

csv_data = []
with open(csv_file_path, newline='') as csvfile:
    reader = csv.reader(csvfile)
    next(reader, None)  # Skip the header row
    for row_values in reader:
        csv_data.append(",".join(row_values))

csv_body = "\n".join(csv_data)

response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType='text/csv',  # Send the rows as CSV
    Body=csv_body
)

print(response['Body'].read().decode('utf-8'))
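
Note that real-time SageMaker endpoints cap the request payload (about 6 MB per invocation), so large CSV files need to be sent in batches. A sketch of one way to chunk the rows from the previous example (the batch size is an arbitrary illustration; tune it to stay under the limit):

BATCH_SIZE = 5000  # arbitrary example; tune to stay under the payload limit

results = []
for i in range(0, len(csv_data), BATCH_SIZE):
    batch_body = "\n".join(csv_data[i:i + BATCH_SIZE])
    resp = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='text/csv',
        Body=batch_body,
    )
    results.append(resp['Body'].read().decode('utf-8'))

predictions = "\n".join(results)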

Score data from Amazon S3

Use this option if your input CSV file is already stored in S3 and you want to save the predictions back to S3. Ensure that your role can read from the input bucket and write to the output bucket, for example by attaching the AmazonS3FullAccess policy.

Add the following to the notebook:

s3_input_path = "s3://<YOUR_INPUT_BUCKET>/path/to/file.csv"
s3_output_path = "s3://<YOUR_OUTPUT_BUCKET>/path/to/file.csv"

bucket_name = s3_input_path.split("/")[2]
key = "/".join(s3_input_path.split("/")[3:])
response = s3.get_object(Bucket=bucket_name, Key=key)
csv_body = response["Body"].read().decode("utf-8")

csv_lines = csv_body.strip().split("\n")
header = csv_lines[0]
data_lines = "\n".join(csv_lines[1:])

response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="text/csv",
    Accept="text/csv",
    Body=data_lines,
)

predictions_csv = response["Body"].read().decode("utf-8")

output_bucket = s3_output_path.split("/")[2]
output_key = "/".join(s3_output_path.split("/")[3:])
s3.put_object(Bucket=output_bucket, Key=output_key, Body=predictions_csv)
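
To confirm the upload succeeded, you can check the output object from the notebook (a small sketch):

# head_object raises an error if the object does not exist.
meta = s3.head_object(Bucket=output_bucket, Key=output_key)
print(f"Wrote {meta['ContentLength']} bytes to {s3_output_path}")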

Step 5: Clean up resources

When you finish testing, delete the endpoint and related resources to avoid ongoing charges:

sm.delete_endpoint(EndpointName=endpoint_name)
sm.delete_endpoint_config(EndpointConfigName=endpoint_config)
sm.delete_model(ModelName=model_name)
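
If you no longer need the container image or the license secret, you can remove them as well. This sketch uses the names from earlier in this guide; note that --force-delete-without-recovery deletes the secret immediately, skipping the recovery window.

# Delete the ECR repository and all images in it.
aws ecr delete-repository \
  --repository-name h2o-escorer-sagemaker/scorer --force

# Delete the license secret immediately, skipping the recovery window.
aws secretsmanager delete-secret \
  --secret-id h2o-model-dai-license --force-delete-without-recovery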
info

The H2O eScorer service runs on port 8080 inside the container, and SageMaker exposes the endpoint over HTTPS. Start with a smaller instance type in the endpoint configuration, and adjust instance type and count based on latency and throughput needs.

