Understand deployments
In H2O MLOps, a deployment is created when one or more model versions are served for scoring. When deploying a model, you can configure the model endpoint security, artifact type, runtime, and Kubernetes options.
H2O MLOps supports different deployment modes:
- Real-time deployments: Make a model available as a live REST endpoint that returns predictions immediately when given input data. Types of real-time deployments include:
  - Single model deployments: Serve one model version at a time.
  - A/B test deployments: Compare the performance of two or more models in production.
  - Champion/Challenger deployments: Continuously compare a Champion model against one or more Challenger models to promote the best performer.
- Batch scoring deployments: Run model scoring jobs on batches of data instead of serving predictions in real time.
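To illustrate the real-time mode described above, the sketch below builds the kind of JSON request body a scoring endpoint typically accepts. This is a minimal, hedged example: the field names, values, and payload shape shown here are assumptions for illustration; the exact schema depends on the model you deploy, so consult your deployment's sample request in the H2O MLOps UI for the authoritative format.

```python
import json

# Hypothetical scoring payload: "fields" lists the model's input columns
# and each entry in "rows" is one record to score. The column names and
# values below are illustrative only, not taken from a real model.
payload = {
    "fields": ["age", "income"],
    "rows": [
        ["34", "52000"],
        ["41", "67000"],
    ],
}

# Serialize to the JSON body that would be POSTed to the endpoint.
body = json.dumps(payload)
print(body)
```

You would send this body as an HTTP POST to the deployment's endpoint URL, with any security settings (such as a token) configured at deployment time applied to the request.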
You can create and manage deployments using:
- The H2O MLOps UI on H2O AI Cloud.
- The H2O MLOps Python client, which allows you to automate deployment tasks from a Python application.
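As a hedged sketch of the automation the Python client enables, the function below outlines connecting to H2O MLOps and looking up a project before deploying a model. The method and parameter names (`Client`, `gateway_url`, `projects.get`) are assumptions for illustration and may differ from the actual client API; refer to the H2O MLOps Python client documentation for the exact calls.

```python
def connect_and_get_project(gateway_url: str, project_name: str):
    """Illustrative sketch only: automate deployment tasks with the
    H2O MLOps Python client. All names below are assumptions, not a
    verified API surface."""
    # Imported lazily so this sketch can be read without the package installed.
    import h2o_mlops  # assumed package name

    # Hypothetical connection and project lookup; real authentication
    # options depend on your H2O AI Cloud environment.
    client = h2o_mlops.Client(gateway_url=gateway_url)
    project = client.projects.get(name=project_name)
    return project
```

From a project handle like this, the client can then be used to create and manage deployments programmatically instead of through the UI.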
To learn more about deployments, refer to the following pages:
- Create a deployment
- View deployments
- Scoring runtimes
- Vertical Pod Autoscaler (VPA) support
- Pod Disruption Budget (PDB)
To learn more about managing deployments with the Python client, see the H2O MLOps Python client documentation.