
Installation

Distribution

The application is distributed as a single JAR file. A properties file is also generated so that you can set environment-specific options.

Prerequisites

  • JDK 17
  • A Driverless AI (pipeline.mojo) or H2O-3 (mojo.zip) model
  • A Driverless AI license, if you are using Driverless AI models
  • REST server distribution

Hardware sizing

The minimum CPU requirement is 4 CPUs (either physical cores or vcores). As a guideline, a single CPU should be able to perform 1,000 predictions per second for a simple model. Memory is used to cache the models that are being used for scoring.
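
As a rough illustration of the throughput guideline above, the following Python sketch estimates how many CPUs a target prediction rate might call for. It assumes that throughput scales linearly with the number of CPUs, which the guideline itself does not state. The function name and constants are hypothetical and chosen only for this example.

```python
import math

# Figures taken from the sizing guideline above.
MIN_CPUS = 4
PREDICTIONS_PER_CPU_PER_SECOND = 1000  # guideline for a simple model


def estimate_cpus(target_predictions_per_second,
                  per_cpu_rate=PREDICTIONS_PER_CPU_PER_SECOND):
    """Estimate a CPU count for a target throughput.

    Assumption (not from the documentation): throughput scales
    linearly with the number of CPUs.
    """
    if target_predictions_per_second < 0 or per_cpu_rate <= 0:
        raise ValueError("rates must be positive")
    needed = math.ceil(target_predictions_per_second / per_cpu_rate)
    # Never go below the documented minimum of 4 CPUs.
    return max(MIN_CPUS, needed)
```

Complex models are likely to score more slowly than a simple one, so a measured per-CPU rate from a benchmark of the actual model is a better input for `per_cpu_rate` than the default.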

The total size of the models on disk can be used to estimate memory requirements:

  1. Sum the sizes, in MB, of all models that will be used for scoring.
  2. Multiply that sum by twice the number of CPUs.
  3. Add 1024 MB for the system (I/O and OS pages).

For example, for one 4 MB MOJO on a 4-CPU system:

4 × (2 × 4) + 1024 = 1056 MB

Round this up to the next common system size (for example, 8 GB, 16 GB, or 32 GB).
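
The estimate above can be sketched in a few lines of Python. This is a minimal sketch of the three steps and the rounding rule, not part of the product. The function names and the list of common sizes beyond 32 GB are assumptions made for this example.

```python
def estimate_memory_mb(model_sizes_mb, cpus, system_overhead_mb=1024):
    """Apply the three sizing steps to a list of model sizes in MB."""
    total_models_mb = sum(model_sizes_mb)       # step 1: sum all model sizes
    scoring_mb = total_models_mb * 2 * cpus     # step 2: multiply by 2 x CPUs
    return scoring_mb + system_overhead_mb      # step 3: add system overhead


def round_up_to_common_size_gb(required_mb,
                               common_sizes_gb=(8, 16, 32, 64, 128)):
    """Return the smallest common system size (in GB) that fits the estimate.

    The sizes above 32 GB are assumed for illustration.
    """
    for size_gb in common_sizes_gb:
        if size_gb * 1024 >= required_mb:
            return size_gb
    raise ValueError("estimate exceeds the largest listed system size")


# The worked example: one 4 MB MOJO on a 4-CPU system.
required_mb = estimate_memory_mb([4], cpus=4)
print(required_mb)                              # 1056
print(round_up_to_common_size_gb(required_mb))  # 8
```

Passing several sizes, such as `estimate_memory_mb([100, 50], cpus=8)`, covers the case where more than one model is loaded for scoring.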

Note: Setting the parameter unloadmodels to true (the default is false) causes the model with the lowest invocation count to be flushed from memory.

The runtime itself occupies approximately 170 MB of disk space. Optional logs for scoring, auditing, and monitoring can also be enabled and require additional space. For the specific parameters, see Configuration.

It is also good practice to collect the runtime's standard output and standard error logs. These can be saved locally or exported using a standard syslog-ng configuration.

Load Balancing

  • Models and REST server instances are stateless. As a best practice, run enough servers/instances in production to gracefully handle a hardware or software failure.

  • Most load balancers perform a health check on each instance to confirm that it is available. The REST server provides a ping URL that the load balancer can call for this purpose.

  • Additionally, the monitoring functions in the REST server can be used to check the operation of the instance. For more information, see Monitoring.
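
To illustrate the kind of availability check a load balancer performs, here is a minimal Python sketch that calls a ping URL and treats any 2xx response as healthy. The exact path of the ping URL is not given in this section, so the caller passes it in. The function name, the timeout default, and the URL in the comment are hypothetical.

```python
import urllib.error
import urllib.request


def is_instance_available(ping_url, timeout_seconds=2.0):
    """Return True if the ping URL answers with a 2xx status in time.

    ping_url is the full ping URL of one REST server instance; its
    exact path depends on the deployment and is not assumed here.
    """
    try:
        with urllib.request.urlopen(ping_url, timeout=timeout_seconds) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP
        # error: treat the instance as unavailable.
        return False


# Hypothetical usage (host, port, and path are placeholders):
#   is_instance_available("http://scorer.example.com:8080/ping")
```

Because the instances are stateless, a check like this can be run against each instance independently, and an instance that fails it can be taken out of rotation without affecting the others.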

