Deploying MOJOs to SageMaker with H2O eScorer Standalone
Overview
This guide shows how to deploy an H2O MOJO model to Amazon SageMaker using H2O eScorer Standalone. You will build a Docker image, deploy it as a SageMaker endpoint, and send prediction requests from a notebook or from Amazon S3.
At a high level, you complete the following tasks:
- Package H2O eScorer Standalone into a container image.
- Upload your MOJO model artifact to Amazon S3.
- Create a SageMaker model and endpoint that use the image and model.
- Invoke the endpoint to get predictions.
Before you begin
Before you start, make sure you have:
- An H2O MOJO model file (for example, pipeline.mojo) that is already packaged and uploaded to Amazon S3.
- Access to an AWS account with permission to use Amazon SageMaker, Amazon S3, AWS Secrets Manager, and Amazon ECR.
- A SageMaker notebook instance in the same AWS Region as your S3 bucket and ECR repository.
- Docker and the AWS CLI installed and configured on your local machine to build and push the container image.
- A valid Driverless AI license key stored in AWS Secrets Manager.
If you use AWS managed policies, you can start with the following:
- AmazonSageMakerFullAccess: Grants permissions for SageMaker operations.
- AmazonS3FullAccess: Grants permissions for reading and writing data in S3.
- AWSSecretsManagerReadOnlyAccess: Grants permissions for reading the license key from Secrets Manager.
Make sure that your IAM role has sufficient permissions to access SageMaker, S3, Secrets Manager, and ECR before you start creating and invoking endpoints.
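If your MOJO is not yet packaged and uploaded, the following sketch shows one way to do it with the AWS CLI. The bucket, paths, and secret name are examples that match the values used later in this guide, and it assumes your license key is in a local file named license.sig:
# Package the MOJO as a gzipped tar archive, the format SageMaker expects.
tar -czf riskmodel.tar.gz pipeline.mojo
# Upload the archive to S3.
aws s3 cp riskmodel.tar.gz s3://h2o-escorer/models/v1/riskmodel.tar.gz
# Store the Driverless AI license key in AWS Secrets Manager.
aws secretsmanager create-secret \
    --name h2o-model-dai-license \
    --secret-string "$(cat license.sig)"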
Step 1: Build and push the H2O eScorer image
In this step, you package H2O eScorer Standalone into a Docker image and push it to Amazon ECR so SageMaker can use it.
You can download the H2O eScorer Standalone JAR file from the downloads page.
Create the entrypoint script
The container entrypoint script checks that the license key is available, writes it to a file, validates the mounted model directory, and then starts H2O eScorer.
The entrypoint.sh script looks like this:
#!/bin/bash
set -e
SAGEMAKER_MOUNT_PATH="/opt/ml/model"
LICENSE_FILE="/app/license.sig"
echo "Starting container initialization..."
if [ -z "$DRIVERLESS_AI_LICENSE_KEY" ]; then
echo "Error: DRIVERLESS_AI_LICENSE_KEY environment variable is missing."
exit 1
fi
echo "Writing license key to $LICENSE_FILE..."
echo "$DRIVERLESS_AI_LICENSE_KEY" > "$LICENSE_FILE"
if [ ! -s "$LICENSE_FILE" ]; then
echo "Error: Failed to write license file."
exit 1
fi
if [ -z "$(ls -A $SAGEMAKER_MOUNT_PATH)" ]; then
echo "Warning: $SAGEMAKER_MOUNT_PATH appears empty."
else
echo "Model files detected in $SAGEMAKER_MOUNT_PATH:"
ls -l "$SAGEMAKER_MOUNT_PATH"
fi
echo "Starting H2O eScorer..."
exec java -Dpropertiesfilename=/app/H2OaiRestServer.properties \
-XX:+UseContainerSupport \
-Dai.h2o.mojos.runtime.license.filename="$LICENSE_FILE" \
-DModelDirectory="$SAGEMAKER_MOUNT_PATH/" \
-Dscoring.batch.size=${SCORING_BATCH_SIZE:-"1000"} \
--add-opens java.base/java.lang=ALL-UNNAMED \
-jar /app/escorer-standalone-*.jar \
--spring.servlet.multipart.max-file-size=${MULTIPART_MAX_FILE_SIZE:-"2048MB"} \
--spring.servlet.multipart.max-request-size=${MULTIPART_MAX_REQUEST_SIZE:-"2048MB"} \
--server.tomcat.max-threads=4
Create the Dockerfile
The Dockerfile starts from an OpenJDK base image that provides Java, copies the H2O eScorer JAR, entrypoint script, and configuration into the image, and sets the container entrypoint.
The Dockerfile looks like this:
FROM openjdk:26-ea-17-slim-trixie
WORKDIR /app
RUN apt-get update && \
rm -rf /var/lib/apt/lists/*
ENV HOME=/app
COPY escorer-standalone-*.jar \
entrypoint.sh \
H2OaiRestServer.properties \
/app/
RUN chmod +x /app/entrypoint.sh
ENV MOJO_LIBS_HOME=/app/.mojo-cache
RUN mkdir -p $MOJO_LIBS_HOME \
&& chmod 777 $MOJO_LIBS_HOME
EXPOSE 8080
ENTRYPOINT ["/app/entrypoint.sh"]
Build and push the image to AWS ECR
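Before you build and push, authenticate Docker with Amazon ECR and make sure the target repository exists. A sketch, assuming the AWS CLI is configured and using the repository name from the commands below:
aws ecr get-login-password --region <region> | \
    docker login --username AWS --password-stdin <aws-account-id>.dkr.ecr.<region>.amazonaws.com
# Create the repository if it does not already exist.
aws ecr create-repository \
    --repository-name h2o-escorer-sagemaker/scorer \
    --region <region>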
Build and push the image from the directory that contains the Dockerfile. Replace <aws-account-id> and <region> with your AWS account ID and Region.
docker build -f Dockerfile -t escorer-sagemaker .
docker tag escorer-sagemaker:latest <aws-account-id>.dkr.ecr.<region>.amazonaws.com/h2o-escorer-sagemaker/scorer:latest
docker push <aws-account-id>.dkr.ecr.<region>.amazonaws.com/h2o-escorer-sagemaker/scorer:latest
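Optionally, you can smoke-test the image locally before creating any SageMaker resources. This sketch assumes your MOJO files are in a local ./model directory and your license key is exported in the DRIVERLESS_AI_LICENSE_KEY shell variable:
docker run --rm -p 8080:8080 \
    -e DRIVERLESS_AI_LICENSE_KEY="$DRIVERLESS_AI_LICENSE_KEY" \
    -v "$(pwd)/model:/opt/ml/model" \
    escorer-sagemaker:latest
The container should log that model files were detected and that H2O eScorer is starting on port 8080.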
Step 2: Set up SageMaker from a notebook
In this step, you use a SageMaker notebook to create the model, endpoint configuration, and endpoint.
Ensure that the IAM role used by the notebook has permission to access SageMaker by attaching the policy AmazonSageMakerFullAccess.
Open a SageMaker notebook in the same Region as your S3 bucket and ECR repository. Then run the following cell to import libraries and create AWS clients:
import boto3
import sagemaker
role = sagemaker.get_execution_role()
region = boto3.Session().region_name
bucket = sagemaker.Session().default_bucket()
sm = boto3.client("sagemaker")
runtime = boto3.client("runtime.sagemaker")
s3 = boto3.client("s3")
secretsm = boto3.client("secretsmanager")
Ensure that the notebook's IAM role has read permissions for AWS Secrets Manager, for example by attaching the AWSSecretsManagerReadOnlyAccess policy mentioned earlier.
Next, define names and paths for the SageMaker resources and the model file in S3:
model_name = "h2o-model-riskmodel-mojo"
endpoint_config = "h2o-endpoint-config-riskmodel-mojo"
endpoint_name = "h2o-model-endpoint-riskmodel-mojo"
license_secret_id = "h2o-model-dai-license"
model_s3_url = "s3://h2o-escorer/models/v1/riskmodel.tar.gz"
The following variables are used in the example:
- model_name: The SageMaker model name.
- endpoint_name: The name of the SageMaker endpoint you call for predictions.
- endpoint_config: The name of the endpoint configuration.
- license_secret_id: The name of the secret in AWS Secrets Manager that stores your Driverless AI license key.
- model_s3_url: The S3 URI where the packaged MOJO model is stored.
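Before creating the model, you can optionally confirm that the artifact actually exists at that S3 URI. A quick sanity check (sketch), using the same URI-parsing approach as the S3 scoring example later in this guide:
# Parse the bucket and key from the S3 URI and verify the object exists.
artifact_bucket = model_s3_url.split("/")[2]
artifact_key = "/".join(model_s3_url.split("/")[3:])
s3.head_object(Bucket=artifact_bucket, Key=artifact_key)  # Raises ClientError if missing
print(f"Found model artifact: {model_s3_url}")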
Step 3: Create the SageMaker model and endpoint
When you deploy the model, you can control how H2O eScorer formats the output by setting environment variables:
- DRIVERLESS_AI_LICENSE_KEY (required): The license key used by H2O eScorer. In this guide, it is read from AWS Secrets Manager.
- H2OAI_BATCH_SHAPLEY (optional): When set to true, includes Shapley values (feature attributions) in CSV output. Defaults to false.
- H2OAI_BATCH_HEADER (optional): When set to true, includes a header row with column names in CSV output. Defaults to false.
- H2OAI_BATCH_OUTPUTALL (optional): When set to true, returns all model outputs instead of only the primary prediction. Defaults to false.
The following container definition tells SageMaker which image to run, where to find the model in S3, and which environment variables to set. Replace <aws-account-id> and <region> with your AWS account ID and Region:
hosting_container = {
"Image": "<aws-account-id>.dkr.ecr.<region>.amazonaws.com/h2o-escorer-sagemaker/scorer:latest",
"ModelDataUrl": model_s3_url,
"Environment": {
"DRIVERLESS_AI_LICENSE_KEY":
secretsm.get_secret_value(SecretId=license_secret_id)['SecretString'],
"H2OAI_BATCH_HEADER": "false",
"H2OAI_BATCH_OUTPUTALL": "false",
"H2OAI_BATCH_SHAPLEY": "false"
}
}
create_model_response = sm.create_model(
ModelName=model_name,
ExecutionRoleArn=role,
PrimaryContainer=hosting_container,
)
create_model_response["ModelArn"]
This call creates the SageMaker model and returns the model's Amazon Resource Name (ARN).
Next, define the endpoint configuration. Here you choose the instance type and the number of instances. For instance types and pricing, see the Amazon SageMaker pricing page.
create_endpoint_config_response = sm.create_endpoint_config(
EndpointConfigName=endpoint_config,
ProductionVariants=[
{
"InstanceType": "ml.m4.xlarge",
"InitialInstanceCount": 1,
"ModelName": model_name,
"VariantName": "AllTraffic",
}
],
)
print(create_endpoint_config_response["EndpointConfigArn"])
This configuration is then used to create the endpoint.
create_endpoint_response = sm.create_endpoint(
EndpointName=endpoint_name, EndpointConfigName=endpoint_config
)
print(f"Creating endpoint: {create_endpoint_response['EndpointArn']}")
sm.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
resp = sm.describe_endpoint(EndpointName=endpoint_name)
print(f"Endpoint ARN: {resp['EndpointArn']}")
print(f"Status: {resp['EndpointStatus']}")
if resp["EndpointStatus"] != "InService":
raise Exception("Endpoint creation did not succeed")
If the status is not InService, check the SageMaker console for error messages and verify that the role used by the notebook can access SageMaker, S3, ECR, and Secrets Manager.
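The container's own logs are often the fastest way to diagnose a failed endpoint. SageMaker writes them to CloudWatch Logs under a log group named after the endpoint. A sketch for pulling recent log events from the notebook, assuming the notebook role can read CloudWatch Logs:
logs = boto3.client("logs")
log_group = f"/aws/sagemaker/Endpoints/{endpoint_name}"
# Find the most recent log stream for the endpoint's container.
streams = logs.describe_log_streams(
    logGroupName=log_group, orderBy="LastEventTime", descending=True, limit=1
)
if streams["logStreams"]:
    events = logs.get_log_events(
        logGroupName=log_group,
        logStreamName=streams["logStreams"][0]["logStreamName"],
        limit=50,
    )
    for event in events["events"]:
        print(event["message"])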
Step 4: Send prediction requests to the endpoint
You can score the model using two different methods: sending requests directly from a SageMaker notebook, or reading input data from Amazon S3 and writing predictions back to S3.
Score data from a SageMaker notebook
To quickly test the endpoint, send a single row as a JSON request:
import json
data = {
"name": "pipeline.mojo",
"explainability": "false",
"data": "5000,36 months,10.65,162.87,10,35000,VERIFIED - income,FL,13.92,,3418,30.5,23"
}
response = runtime.invoke_endpoint(
EndpointName=endpoint_name,
ContentType='application/json;charset=UTF-8',
Body=json.dumps(data)
)
print(json.dumps(json.loads(response['Body'].read().decode()), indent=4))
To score multiple rows at once, send them as a list of rows in the JSON body:
data = {
"name": "pipeline.mojo",
"explainability": "false",
"data": {
"rows": [
"5000,36 months,10.65,162.87,10,24000,VERIFIED - income,AZ,27.65,2,13648,83.7,9",
"2500,60 months,15.27,59.83,0,30000,verified,GA,1,,1687,9.4,4",
"2400,36 months,15.96,84.33,10,12252,not verified,IL,8.72,,2956,98.5,10",
"10000,36 months,13.49,339.31,10,49200,verified,CA,20,,5598,21,3",
]
}
}
response = runtime.invoke_endpoint(
EndpointName=endpoint_name,
ContentType='application/json;charset=UTF-8',
Body=json.dumps(data)
)
print(json.dumps(json.loads(response['Body'].read().decode()), indent=4))
If your data is stored in a CSV file, you can read it in the notebook and send it as CSV.
import csv
csv_file_path = "LC_DAI_Small.csv"
csv_data = []
with open(csv_file_path, newline='') as csvfile:
reader = csv.reader(csvfile)
next(reader, None) # Skip header
for row_values in reader:
csv_data.append(",".join(row_values))
csv_body = "\n".join(csv_data)
response = runtime.invoke_endpoint(
EndpointName=endpoint_name,
ContentType='text/csv', # Sending CSV
Body=csv_body
)
print(response['Body'].read().decode('utf-8'))
Score data from Amazon S3
Use this option if your input CSV file is already stored in S3 and you want to save the predictions back to S3.
Ensure your role has permission to read from the input bucket and write to the output bucket, for example by attaching the AmazonS3FullAccess policy.
Add the following to the notebook:
s3_input_path = "s3://<YOUR_INPUT_BUCKET>/path/to/file.csv"
s3_output_path = "s3://<YOUR_OUTPUT_BUCKET>/path/to/file.csv"
bucket_name = s3_input_path.split("/")[2]
key = "/".join(s3_input_path.split("/")[3:])
response = s3.get_object(Bucket=bucket_name, Key=key)
csv_body = response["Body"].read().decode("utf-8")
csv_lines = csv_body.strip().split("\n")
header = csv_lines[0]  # Header row is separated out and not sent to the endpoint
data_lines = "\n".join(csv_lines[1:])
response = runtime.invoke_endpoint(
EndpointName=endpoint_name,
ContentType="text/csv",
Accept="text/csv",
Body=data_lines,
)
predictions_csv = response["Body"].read().decode("utf-8")
output_bucket = s3_output_path.split("/")[2]
output_key = "/".join(s3_output_path.split("/")[3:])
s3.put_object(Bucket=output_bucket, Key=output_key, Body=predictions_csv)
Step 5: Clean up resources
When you finish testing, delete the endpoint and related resources to avoid ongoing charges:
sm.delete_endpoint(EndpointName=endpoint_name)
sm.delete_endpoint_config(EndpointConfigName=endpoint_config)
sm.delete_model(ModelName=model_name)
The H2O eScorer service runs on port 8080 inside the container, and SageMaker exposes the endpoint over HTTPS. Start with a smaller instance type in the endpoint configuration, and adjust instance type and count based on latency and throughput needs.
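If you keep the endpoint running in production, you can also let SageMaker adjust the instance count for you with a target-tracking scaling policy instead of sizing it by hand. A sketch using the Application Auto Scaling API, assuming the AllTraffic variant defined earlier; the capacity bounds and target value are illustrative:
autoscaling = boto3.client("application-autoscaling")
resource_id = f"endpoint/{endpoint_name}/variant/AllTraffic"
# Allow the variant to scale between 1 and 4 instances.
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)
# Scale on the average number of invocations per minute per instance.
autoscaling.put_scaling_policy(
    PolicyName="escorer-invocation-scaling",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 100.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
)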