Release notes
Version 0.69.1 (Jan , 2025)
This release includes a new feature.
New features
- [Runtimes] Added support for the Driverless AI runtime version 1.11.1.1.
Version 0.69.0 (Dec 19, 2024)
This release includes new features and fixes.
New features
- [Runtimes] Added support for MLflow Model Scorer runtime for Python 3.11 and 3.12, and Dynamic MLflow Model Scorer runtime for Python 3.12.
- [Runtimes] Added support for H2O Driverless AI runtime version 1.10.7.3.
- [Runtimes] Exposed Kubernetes readiness probe on deployments.
- [UI] Replaced bcrypt with PBKDF2 hashing when creating deployments with the `Passphrase (Stored Hashed)` security option.
- [Python Client] Added support for SSL settings via the Client constructor.
- [Python Client] Added support for PBKDF2 hashed security option.
- [Deployer] Introduced PBKDF2-based passphrase hashing for improved security.
- [Deployer] Added support for Generic Ephemeral Volumes in the Runtimes.
- [Deployer] Introduced `/readyz` readiness probe endpoint for dynamically deployed runtimes.
- [Deployer] Introduced a pod disruption budget for enhanced stability.
- [Helm] Enabled JVM config passing to the monitoring proxy.
- [Helm] Added support for configuring limits and JVM settings in the deployer.
- [Helm] Defined `MLOPS_WAVE_APP_URL` as an environment variable for better configuration.
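As a rough illustration of the PBKDF2-based passphrase hashing mentioned in the list above, the following sketch derives and verifies a hash with Python's standard library. The hash function (SHA-256), salt length, and iteration count are assumptions chosen for the example; they are not the parameters H2O MLOps actually uses, and the function names are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_passphrase(
    passphrase: str, salt: Optional[bytes] = None, iterations: int = 600_000
) -> Tuple[bytes, bytes]:
    """Derive a PBKDF2-HMAC-SHA256 key and return (salt, derived_key).

    Hypothetical helper: salt size and iteration count are illustrative only.
    """
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per passphrase
    key = hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, iterations)
    return salt, key


def verify_passphrase(
    passphrase: str, salt: bytes, expected: bytes, iterations: int = 600_000
) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_passphrase(passphrase, salt, iterations)
    return hmac.compare_digest(key, expected)
```

Only the salt, iteration count, and derived key need to be stored; the passphrase itself is never persisted.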
Fixes
- [UI] Ensured UI accessibility even when listing deployments fails.
- [UI] Fixed the issue where signing in from the access denied page resulted in a `Missing parameters: id_token_hint` error.
- [Monitor Proxy] Stopped sending `TransactionTransmission` events to downstream transmitters when `enableTransaction` is `false`.
- [Python Client] Added `ValueError` for missing or invalid protocol in the gateway URL.
- [Python Client] Configured SSL settings for the functions `get_capabilities`, `get_sample_request`, and `get_schema` since they access deployment endpoints.
- [Storage] Removed storage cleanup cron job and implemented it within a thread of storage itself.
- [Helm] Ensured proper RBAC configuration when multiple groups are specified.
- [Helm] Removed legacy `LOCAL` and `MIGRATE` mode code.
- [Runtimes] Fixed memory leak in MOJO2 runtimes by upgrading the internal MOJO2 library.
- [Deployer] Ensured that stale deployments will be redeployed.
- [Deployer] Skipped routing migration in cases of errors not related to deployment migration.
- [Deployer] Used response header modifier instead of request header modifier for CORS.
- [Deployer] Added configurable Kubernetes client timeout for better performance and reliability.
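The gateway URL check mentioned in the fixes above (a `ValueError` for a missing or invalid protocol) can be pictured with a small sketch like the one below. The function name and the accepted schemes are assumptions for illustration; this is not the Python client's actual implementation.

```python
from urllib.parse import urlparse


def validate_gateway_url(url: str) -> str:
    """Return the URL unchanged if it has an http(s) scheme and a host.

    Hypothetical helper: raises ValueError otherwise.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(
            f"Gateway URL must start with http:// or https://, got: {url!r}"
        )
    if not parsed.netloc:
        raise ValueError(f"Gateway URL is missing a host: {url!r}")
    return url
```

Failing early on a malformed URL gives a clearer error than a connection failure later in the request.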
Version 0.68.0 (Nov 05, 2024)
This release includes new features and fixes.
New features
- [UI] Enabled model and model version deletion.
- [UI] Enabled use of the default deployment security option from the backend.
- [UI] Added support for H2O Driverless AI runtime versions 1.10.7.2 and 1.11.1.
- [Python Client] Introduced a timeout parameter (default: 5 seconds) for MLOpsScoringDeployment's methods: `get_capabilities`, `get_sample_request`, and `get_schema`.
- [Python Client] Added support for creating deployment with token-based authentication as a security option.
- [Python Client] Enabled model deletion.
- [Python Client] Enabled the option to unregister an experiment from a model.
- [Python Client] Introduced the `disabled_security` option to manage deployments with No-Security.
- [Storage] Storage only supports blob storage from this release onwards. A one-time migrator job was introduced to migrate all the storage data from K8S PVC to blob storage to support seamless upgrades for users.
- [Telemetry] The MLOps-Telemetry component is no longer running as a cron job; it is now a long-running microservice that publishes event data at scheduled intervals.
- [Helm] MLOps storage can be configured to use blob storages from any of the 3 main clouds AWS, Azure and GCP. Minio is also supported for on-premise installations.
- [Helm] Added `H2O_SCORER_MODEL_LOADING_MODE` set to "subprocess" across all MLOps Python-based runtimes.
- [Helm] Introduced a migration job for transferring persistent storage to cloud platforms, now supporting Minio and Azure Blob.
- [Helm] Introduced a `SCHEDULER_INTERVAL_SECONDS` environment variable to configure the interval of mlops-telemetry events publishing.
- [Deployer] Introduced Vertical Pod Autoscaling (VPA) support.
- [Deployer] Exposed easy access to the security options available in the cluster.
- [Deployer] Restructured environment security options:
  - Activated security options list
  - Configurable default security option
- [Deployer] Introduced the No-Security option.
Fixes
- [UI] Resolved an error occurring when attempting to view experiment details for experiments with missing metadata.
- [UI] Made the maximum selectable count for deployment replicas configurable.
- [UI] Removed support for MLflow Model Scorer and Dynamic Model Scorers for Python 3.8.
- [UI] Removed support for HT Flexible Runtimes for Python 3.8, including both GPU and CPU variants.
- [Python Client] Improved handling of missing deployment attributes (security and monitor) in backend responses.
- [Python Client] Upgraded the minimum supported Python version to 3.9.
- [gRPC Gateway] Updated `/healthz` to return a 200 status if at least one health check passes, fixing an issue where the gateway would restart if any service was unhealthy.
- [Helm] Removed Python 3.8 support for HT and MLFlow runtimes.
- [Helm] Removed the `EnableUserExternalIDUpdate` environment variable from storage for simpler configuration.
- [Helm] Added a `-job` suffix to the `app.kubernetes.io/component` label for the monitoring backend job to improve component labeling.
- [Helm] Updated rclone configurations to enhance compatibility with Google Cloud Storage (GCS).
- [Helm] Set the telemetry service’s replica count to one to optimize resource usage.
- [Helm] Changed the telemetry scheduler’s default interval to 300 seconds for more efficient scheduling.
- [Storage] `IDP_ID` (that is, the Keycloak or Okta ID) is now used as the primary key for the Users table in MLOps Storage. The username is no longer a unique field. Existing user data is migrated accordingly by Storage itself when it starts up.
- [Deployer] Only "internal" gRPC statuses are now logged at the ERROR level.
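The `/healthz` change described in the fixes above (healthy if at least one check passes) amounts to an "any" aggregation rather than an "all" aggregation. The sketch below illustrates that rule only; the function name, the check names, and the 503 status for the failing case are assumptions, not the gateway's actual code.

```python
from typing import Mapping


def healthz_status(checks: Mapping[str, bool]) -> int:
    """Return 200 if at least one health check passes, otherwise 503.

    Hypothetical aggregation: 503 for the all-failing case is an assumption.
    """
    return 200 if any(checks.values()) else 503


# Example: one unhealthy backend no longer makes the gateway report unhealthy.
status = healthz_status({"storage": True, "deployer": False})
```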
Version 0.67.4 (Oct 10, 2024)
This release includes various fixes.
Fixes
- [Helm] Gateway creation is now skipped when `Values.gatewayApi.create` is set to false.
- [Helm] You can now specify extra ingress for Influx.
Version 0.67.3 (Oct 01, 2024)
This release includes new features and fixes.
New features
- [Runtimes] Added support for the Driverless AI 1.11.1 Python scoring pipeline.
Fixes
- [Security] Fixed critical vulnerabilities in the Java-based REST scorer and monitoring proxy.
- [Helm] Ensured that the registry specification on each image takes priority over the global image registry configuration.
Version 0.67.2 (Sep 19, 2024)
This release includes new features and fixes.
New features
- [Runtimes] Added support for the Driverless AI 1.10.7.2 Python scoring pipeline.
Fixes
- [Helm] Removed hard-coded dev/vorvan prefix.
- [Helm] Fixed an issue where the Influx network policy was missing a specific label, which prevented the cleanup job from running.
Version 0.67.1 (Sep 13, 2024)
This release includes various fixes.
Fixes
- [Monitoring Backend] Updated Dockerfile to use numerical user ID, preventing false warnings in systems that check for root access.
- [Drift] Fixed an issue where the worker image could not find the `datatable` dependency.
Version 0.67.0 (Sep 02, 2024)
This release includes various vulnerability fixes.
New features
- [Monitor Proxy] A per-project monitoring data retention period can now be set for InfluxDB during MLOps installation or upgrade.
- [UI] Added the functionality to log out from the Wave app.
- [UI] Added support for new HT Flexible Runtimes for Python `3.10`, including GPU and CPU variants.
- [UI] Added support for DAI runtime versions `1.10.6.3`, `1.10.7.1`, and `1.11.0`.
- [UI] Added support for MLflow Model Scorer and Dynamic Model Scorers for Python `3.10` and `3.11`.
- [Deployer] Added support for token based authentication for deployments.
- [Runtimes] Added support for DAI runtime versions `1.10.6.3`, `1.10.7.1`, and `1.11.0`.
- [Runtimes] Added support for new HT Flexible Runtimes for Python `3.10`.
- [Helm Chart] Added component configuration support for applying tolerations, node selectors, and affinity settings to cron jobs.
- [Helm Chart] Added CA certificate support to the API Gateway deployment.
- [Helm Chart] Replaced Ambassador with Gateway API due to the removal of Emissary.
Fixes
- [UI] Removed the functionality for importing models from external model repositories.
- [UI] Removed the ability to upload experiments as serialised Python (`.pkl`/`.pickle`) files.
- [UI] Disallowed the creation of tags with commas.
- [UI] Reduced the timeout for notification bars.
- [UI] Fixed the issue where a red cross appeared when registering a model shortly after creating an experiment.
- [UI] Removed support for DAI Python runtimes for `1.10.4.3` and older versions.
- [UI] Removed support for MLFlow Model Scorer for Python `3.6` and Python `3.7`.
- [Runtimes] Removed support for DAI Python runtimes for `1.10.4.3` and older versions.
- [Python Client] The `external_registry` package has been removed.
- [Runtimes] Fixed critical vulnerabilities in all runtimes except the DAI Python-based one.
- [Runtimes] Fixed critical and high vulnerabilities in the REST scorer.
- [Helm Chart] Corrected the nodeSelector YAML formatting.
- [Helm Chart] Renamed environment variable `STORAGE_URL` to `API_GATEWAY_URL` in the Wave app.
- [Helm Chart] Updated `H2O_WAVE_POST_REDIRECT_URL` to resolve "Page Not Found" errors when logging out from the Wave app.
- [Helm Chart] Updated the Wave app secret `H2O_WAVE_OIDC_END_SESSION_URL` for improved logout functionality.
- [Helm Chart] The `enable_user_externalid_update` setting is now configurable.
- [Helm Chart] Exposed resource requests and limits for monitoring-drift, model-ingest, and api-gateway components.
Version 0.66.1
This release includes various vulnerability fixes.
New features
- Released Base Python Scorer v1.2.0 (BYOM).
- Released Python based runtimes v1.2.0 (BYOM).
- Released HT runtime v1.2.0 (BYOM).
- Released MLflow runtime v1.2.0 (BYOM).
Version 0.66.0 (June 04, 2024)
This release includes new features, improvements, bug fixes, and security improvements.
Announcement
This version of H2O MLOps adds optional role-based access control (RBAC). This feature relies on two RBAC configurations: one for the front end (FE) and the other for the back end (BE). When using RBAC, both configurations must be set up identically to ensure proper functionality and a seamless user experience.
The following is an illustrative configuration for both FE and BE RBAC. Note that in this example, it is assumed that "admin" is included in the user access token groups claim. However, it is important to customize this configuration based on your specific requirements at the time of deployment.
apiGateway:
  config:
    # -- Log verbosity.
    logLevel: "debug"
    authorization:
      # -- Whether authorization is enabled.
      enabled: true
      # -- JWT claim key which contains the role/group information.
      userJwtClaim: "groups"
      # -- List of role/group values that should have access to MLOps API Gateway as an array.
      allowedUserRoles: ["admin"]
waveApp:
  authorization:
    # -- Whether authorization is enabled.
    enabled: true
    # -- JWT claim key which contains the role/group information.
    userJwtClaim: "groups"
    # -- Comma separated list of role/group values that should have access to MLOps FE.
    allowedUserRoles: "admin"
    # -- Separator character for role/group values in JWT claim.
    userRoleValueSeparator: ","
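Because the FE and BE configurations must match, a pre-deployment check can help catch drift between them. The sketch below compares the two sections of an already-parsed values dictionary. The function name is hypothetical, and the nesting (`apiGateway.config.authorization` and `waveApp.authorization`) is an assumption based on the illustrative configuration above; adjust the paths to your actual chart values.

```python
from typing import Any, Dict, List


def rbac_mismatches(values: Dict[str, Any]) -> List[str]:
    """Return the names of RBAC settings that differ between BE and FE.

    Hypothetical helper; key paths are assumed from the example values.
    """
    be = values["apiGateway"]["config"]["authorization"]
    fe = values["waveApp"]["authorization"]

    # BE roles are a list; FE roles are a comma separated string.
    be_roles = set(be["allowedUserRoles"])
    fe_roles = {r.strip() for r in fe["allowedUserRoles"].split(",") if r.strip()}

    problems = []
    if be["enabled"] != fe["enabled"]:
        problems.append("enabled")
    if be["userJwtClaim"] != fe["userJwtClaim"]:
        problems.append("userJwtClaim")
    if be_roles != fe_roles:
        problems.append("allowedUserRoles")
    return problems


example_values = {
    "apiGateway": {"config": {"authorization": {
        "enabled": True, "userJwtClaim": "groups", "allowedUserRoles": ["admin"]}}},
    "waveApp": {"authorization": {
        "enabled": True, "userJwtClaim": "groups", "allowedUserRoles": "admin",
        "userRoleValueSeparator": ","}},
}
```

An empty result for `rbac_mismatches(example_values)` means the two sides agree on the settings checked here.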
New features
- Added optional role-based access control (RBAC). You can now limit access to H2O MLOps APIs to users with specific roles.
- Released Base Python Scorer v1.1.1 (BYOM).
- Released Python-based runtimes v0.6.5 (BYOM).
- Released HT runtime v0.6.5 (BYOM).
- Released MLflow runtime v0.6.5 (BYOM).
- Exposed the authorization header with bearer token through environment variables in the base scorer (BYOM).
- You can now configure the read timeout in the scorer proxy (Monitor Proxy).
- Added a deployer-configurable monitor proxy timeout (Deployer).
- Upgraded base images to Java 17 eclipse-temurin:17.0.10_7-jre (Deployer).
Bug fixes
- Fixed an error that occurred while fetching additional details of the events related to an experiment after deleting a deployment.
Version 0.65.1 (May 25, 2024)
This release is a minor release on top of v0.65.0 with the storage and telemetry features rebuilt using the latest zlib.
Version 0.65.0 (May 08, 2024)
This release includes new features, improvements, bug fixes, and security improvements.
New features
- Added the capability to disable model monitoring features when deploying H2O MLOps. Set the `monitoring_enabled` installation parameter to `false` to disable the following:
  - monitoring service
  - monitoring task
  - drift worker
  - drift trigger
  - InfluxDB
  - RabbitMQ

  Note that setting the `monitoring_enabled` parameter to `false` also disables health checks for the monitoring backend.
- Added support for FedRAMP compliance.
- Added an option to restrict model imports to specific types.
- Added support for vLLM config model types.
- Added a multi-issuer token security option for creating deployments. For more information, see Endpoint security.
- The `ListProjects` API now returns all projects for admin users.
- You can now set a dedicated timeout for the external registry import API.
- You can now upload LLM experiments using the MLOps Wave app.
- Added support for Authz user format.
Improvements
- By default, the model monitoring page is now sorted by deployment name.
- You can now configure the supported experiment types for the upload experiment flow.
Bug fixes
- Fixed incorrect sorting in the `listMonitoredDeployments` API response.
Version 0.64.0 (April 08, 2024)
New features
- MLflow Dynamic Runtime: Added support for Python 3.10.
- Added support for DAI 1.10.7 and 1.10.6.2 runtimes.
- Upgraded the REST scorer to Spring Boot 3 (1.2.0).
- Added vLLM runtime support.
- When creating a new deployment, added an option to disable monitoring for the deployment.
- Added validation for experiment file uploading.
- Extended the scoring API with a new endpoint, `/model/media-score`, to support uploading multiple media files.
- The H2O Hydrogen Torch runtime is now supported with the ability to score image and audio files against the new endpoint `/model/media-score`.
- The project page now includes an Events tab with pagination, search, and sorting. For more information, see Project page tabs.
- You can now delete experiments.
- Added pagination, search, sorting, and filtering by Tag on the Experiments page.
- The Create Deployment workflow now automatically populates K8s limits and requests with the suggested default settings.
- The deployment state is now updated dynamically on the Deployments page.
- Additional details about error deployment states are now displayed in the MLOps UI.
- You can now update and delete tags. Note that tags can only be deleted if they are not associated with any entity.
Improvements
- You can now edit the GPU request/limit fields.
- When creating a deployment, improved automatic population of the Kubernetes resource requests and limits fields in the UI based on the selected runtime and artifact type.
- H2O Driverless AI versions are now automatically identified when DAI models are uploaded through the Wave app or Python client.
- The deployment overview now displays additional details about errored deployment states.
Version 0.62.5
In addition to the changes included in the 0.64.0 release, this release includes the following changes:
Improvements
- The Deployer API now lets you create and update deployment settings related to what monitoring data you want to save. For example:
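    # Assumes `mlops_client` is an authenticated h2o_mlops_client.Client and
    # `deployment` is an existing deployment object fetched from the Deployer API.
    import h2o_mlops_client as mlops

    deployment.monitor_disable = True
    deployment.store_scoring_transaction_disable = True
    deployment = mlops_client.deployer.deployment.update_model_deployment(
        mlops.DeployUpdateModelDeploymentRequest(deployment=deployment)
    )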
Changes
- Model Monitoring is now disabled by default for new deployments.
Known issues
- Monitoring settings can only be modified using the Python client, regardless of whether they were initially set via the UI or Python client.
- H2O MLOps version 0.62.5 cannot be upgraded to version 0.64.0. Upgrades from this version can only be made to version 0.65.0 and later.
Version 0.62.4
Improvements
- Various security improvements to address XSS security issues.
Version 0.62.1
New features
- You can now use the `ListExperiments` API to filter experiments by status (ACTIVE, DELETED). By default, the API returns ACTIVE experiments.
Improvements
- Added support for the DAI 1.10.6.1 runtime.
- Added pagination support in the Experiments page.
Bug fixes
- Fixed an issue where uploading large artifacts (above 40GB) resulted in an error.
- Fixed an issue where a registered model with the same name as a deleted model could not be created.
Announcements
- The URL link to the legacy H2O MLOps app has been removed.
- The legacy H2O MLOps app is no longer installed by default.
Version 0.62.0 (September 10, 2023)
New features
- For GPU-enabled model deployments, you can now set the appropriate Kubernetes (K8s) requests and limits by clicking the GPU Deployment toggle when creating a deployment. For more information, see Deploy a model and Kubernetes options.
- You can now create and assign experiment tags within a project. For more information, see Project page tabs and Add experiments.
- You can now edit the names and tags of experiments. For more information, see Project page tabs.
Improvements
- The default view when viewing projects has been changed from the grid view to the list view.
- The Project ID of each project is now displayed in the list view.
- The list view now features pagination, sorting, and search capabilities.
- You can now search for a project by project name.
- You can now sort the list of projects by time of creation and last modified time.
- Project list view actions: You can now view, share, and delete projects from the project list view. For more information, see List view actions.
- Improved UI for project sharing.
- Enhanced the Deployment Overview window to include Kubernetes settings and deployed model details across all deployment modes. For more information, see Understand the Deployment Overview window.
- Python client:
  - You can now enable or disable model monitoring for a deployment.
  - You can now update the deployment security option or password.
  - You can now delete experiments.
  - You can now delete registered models and model versions.
- Scoring:
  - Prediction intervals are now supported for MOJOs and Driverless AI Python scoring pipelines. Prediction intervals provide a range within which the true value is expected to fall with a certain level of confidence. You can check if prediction intervals are supported by using the `https://model.{domain}/{deployment}/capabilities` endpoint.
  - Added a new MLflow Dynamic Runtime to dynamically resolve the various model dependencies in your MLflow model. For more information, see MLflow Dynamic Runtime.
Bug fixes
- Fixed an issue where the passphrase field could not be edited when creating a secured deployment.
- Fixed an issue that affected accurate sorting when using the sort by date functionality.
Version 0.61.1 (June 25, 2023)
Improvements
- Added support for Kubernetes 1.25.
- Added support for H2O Driverless AI version 1.10.5.
Bug fixes
- Various bug fixes to the deployment pipeline, monitoring, and drift detection.
Version 0.61.0 (May 24, 2023)
New features
- You can now create A/B Test and Champion/Challenger deployments through the UI. For more information, see Deploy a model.
- You can now create and view configurable scoring endpoints through the UI. For more information, see Configure scoring endpoint.
- Concurrent scoring requests are now supported for Python-based scorers. Scoring for the C++ MOJO, Scoring Pipeline, and MLflow types now supports parallelization, with the default degree of parallelization set to 2. This can be changed with the `H2O_SCORER_WORKERS` environment variable. For more details, contact your H2O representative.
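To make the `H2O_SCORER_WORKERS` default concrete, the sketch below shows one way a process could resolve the degree of parallelization from that environment variable, falling back to 2 when it is unset. The function name and the validation rule are assumptions for illustration; the scorer runtimes may read the variable differently.

```python
import os


def scorer_workers(default: int = 2) -> int:
    """Resolve the worker count from H2O_SCORER_WORKERS, defaulting to 2.

    Hypothetical helper: rejects values below 1.
    """
    raw = os.environ.get("H2O_SCORER_WORKERS")
    if raw is None or not raw.strip():
        return default
    workers = int(raw)  # raises ValueError for non-numeric input
    if workers < 1:
        raise ValueError("H2O_SCORER_WORKERS must be at least 1")
    return workers
```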
Improvements
- Added support for H2O-3 MLflow Flavors and importing of MLflow-wrapped H2O-3 models.
Version 0.60.1 (April 02, 2023)
New features
- Introduced a feature flag to enable the import third-party experiments (pickled experiments) flow with Conda. If you require Conda or third-party pickle import, this flag needs to be set at the time H2O MLOps is installed to continue using pickled experiments. For more information about enabling this feature flag when installing or upgrading H2O MLOps, contact support@h2o.ai.
Improvements
- You can now search for users by username when sharing a project with another user. You can now also sort the user list in alphabetical order.
- In the model monitoring feature summary table, the figures are now displayed only up to three decimal places.
- When no deployment name is present for the deployment, the deployment ID is now displayed as the name.
- A blocking error page is now shown to the user in case Keycloak is unavailable.
- Date and time are now both displayed for the model monitoring predictions over time plot.
- Storage Telemetry now includes the additional fields Deployment Name and model version number.
Bug fixes
- Fixed a bug that caused experiments to fail during upload / ingestion.
- All dialogs in the UI can now be closed with the escape key.
- Fixed a bug where drift was not calculated when a feature was determined to be a datetime type and the datetime format was missing.
Version 0.59.1
Improvements
- Added support for the DAI 1.10.4.3 runtime.
Version 0.59.0 (February 12, 2023)
New features
- Storage telemetry: MLOps can now send analytical data related to storage operations to the telemetry server.
- Scoring telemetry: MLOps Scoring now sends scoring-related data to the telemetry server.
- Static scoring endpoints: You are now able to define and update a persistent URL that points to a particular MLOps deployment.
- Deployment:
- Deployed scoring applications now set additional Kubernetes annotations.
- Deployment APIs now return more accurate and useful gRPC status codes and error messages.
- You can now download Kubernetes logs from deployments in the MLOps Wave App and MLOps API.
Improvements
- Upgraded the `h2o-wave` version to 0.24.1.
- Added support for the DAI 1.10.4.1 and DAI 1.10.4.2 runtimes.
- Updated the Python client.
- Added a cleanup task for files uploaded to the Wave server.
- Updated the eScorer URL of the Wave app deployment pipeline.
- Added a new Kubernetes limit for the Hydrogen Torch runtime in the deployment creation flow.
Bug fixes
- Removed the custom implementation for the token provider.
- Removed the `artifact-id` from the `DeployDeploymentComposition` endpoint.
- Updated the packages in the base docker image.
- Fixed an issue related to displaying the session timeout page for deployment overview, view monitoring, and monitoring homepage.
- Fixed an issue where the drift detection trigger blocked the other calculations by adding timeout support to the InfluxDB client in trigger and worker.
Version 0.58.0 (December 15, 2022)
Improvements
- Added support for Kubernetes 1.23.
- Added support for H2O-3 MOJOs up to version `3.38.0.3`.
- Added support for linking and deploying H2O Driverless AI unsupervised models.
- Added support for scoring H2O Driverless AI MOJOs with the C++ MOJO runtime.
- Added support for TTA for H2O Driverless AI Python pipelines.
- Shapley values can now be calculated for H2O Driverless AI Python pipelines and MOJOs.
- Datetime columns for H2O Driverless AI models are now automatically detected.
- Fixed an issue where the Driverless AI Python Pipeline scorer occasionally restarted randomly.
- Updated ML Python packages in the standard Python scorer to support a wider range of custom user models.
- BYOM scoring:
- Extended the Python scoring library to conform to v1.2.0 of the Scoring API.
- Unexpected input fields are now ignored when performing scoring.
- Introduced a feature that lets scorers override sample requests.
- Implemented an experimental API for image and file scoring.
- Replaced time-based handling of signals coming from Driverless AI scoring processes with static handling.
- Added a Driverless AI MOJO Pipeline artifact processor image.
- Added an H2O-3 artifact processor image.
- Updated the DAI pipeline processor dependencies to address security vulnerabilities.
Documentation
- Added a page that describes support for Test Time Augmentation (TTA) in H2O MLOps.
- Added several new Python client examples.
- Updated the page on Deploying a model.
Version 0.57.3 (November 16, 2022)
New features
- You can now view monitoring dashboards for deployments directly through H2O MLOps. For more information, see Model monitoring.
Version 0.57.2 (August 01, 2022)
New features
- When browsing the MLflow directory, you can now search for specific MLflow models by name. Note that this search functionality is case sensitive, and that the model name can contain only letters, numbers, spaces, hyphens, and underscores up to 100 characters.
- When browsing the MLflow directory, the list of MLflow models is now organized into pages. You can specify the number of models listed on each page.
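The naming and search rules described above can be expressed as a short sketch. The function names are hypothetical, letters are assumed to mean ASCII letters, and substring matching is an assumption about how the search behaves; only the character set, the 100-character limit, and case sensitivity come from the note itself.

```python
import re
from typing import Iterable, List

# Letters, numbers, spaces, hyphens, and underscores, 1 to 100 characters.
_MODEL_NAME = re.compile(r"[A-Za-z0-9 _-]{1,100}")


def is_valid_model_name(name: str) -> bool:
    """Check a model name against the documented character and length rules."""
    return _MODEL_NAME.fullmatch(name) is not None


def search_models(names: Iterable[str], query: str) -> List[str]:
    """Case-sensitive search (substring matching is an assumption)."""
    return [name for name in names if query in name]
```

For instance, under these assumptions a query of `churn` would match `churn-v2` but not `Churn`.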
Bug fixes
- Fixed an issue where MLflow models could not be reimported.
Version 0.56.1 (May 16, 2022)
New features
- Azure access tokens can now be retrieved through H2O MLOps.
Improvements
- When creating a deployment, only deployable artifacts are now shown.
- Added Driverless AI (DAI) 1.10.2 and 1.10.3 as recognized versions of DAI for matching with DAI runtimes.
- H2O MLOps now displays either a success or error message when attempting to create a deployment.
- The process of linking models to an experiment is now simpler.
- H2O MLOps can now handle large text fields.
- Updated the H2O MLOps logo.
- Removed scroll bars in overview UI pages.
Bug fixes
- Fixed an issue that caused alignment issues between project cards.
- Underscores can now be used at the beginning of project names.
- Fixed an issue that caused H2O MLOps to crash when the deployer was restarted.
- Fixed an issue related to adding new comments to an experiment.
Version 0.56.0 (April 18, 2022)
New features
- Added support for batch scoring. For more information, see Deploying a model.
- Added support for H2O-3 MOJOs up to version `3.32.0.2`.
Version 0.55.0 (March 31, 2022)
New features
- Added support for integration with MLflow Model Registry.
- Admin users can now monitor H2O MLOps usage within their organization with Admin Analytics.
Documentation
- Added a new page on enabling third-party model management integration.
- Added a new section on adding experiments from MLflow Model Registry.
Version 0.54.1 (March 08, 2022)
New features
- H2O Driverless AI (DAI) 1.10.2 is now supported. Experiments trained in DAI 1.10.2 can now be managed and deployed by H2O MLOps.
Version 0.54.0 (February 03, 2022)
- New MLOps user interface.
- Pickle model support: Python serialized models in Pickle format can now be imported directly into MLOps. This means that you can use your third-party models without relying on packagers like MLflow.
- Model Registry and Model Versioning: You can now register your experiments using MLOps Model Registry and group new versions of a model using MLOps Model Versioning. Note that an experiment must first be registered in the MLOps Model Registry before being deployed. For more information, see Register an experiment as a model.
Version 0.53.0 (January 18, 2022)
Notice
- Updated required MLOps Terraform providers to benefit from bug fixes and expanded support for setting Kubernetes options. Note that upgrading MLOps with the updated Terraform templates results in Terraform generating a lengthy state file differential to review.
Improvements
- Added three new MOJO scorers to the default MLOps configuration. Each of these scorers provides support for returning Shapley values along with model scoring.
- By default, all MLOps components now run as non-root users.
- By default, all third-party services deployed by MLOps except for RabbitMQ and Traefik now run as non-root.
- Added support for setting a subset of Kubernetes Security Context options for any BYOM image.
- Exposed many new MLOps configuration fields as Terraform variables.
- Extended model scorers' capabilities to recover from connection and timeout issues.
- Exposed option to set arbitrary Kubernetes resource requests and limits for MLOps model deployment.
- Exposed option to set number of desired Kubernetes pods for model deployments.
- Fixed an issue where deployments reported incorrect last modified timestamps.
- Added `name` and `description` fields to model deployment API objects, allowing deployments to be user-labelled.
- Fixed an issue where MLOps' Deployer complained if certain BYOM configurations were missing. Defaults are now correctly applied unless overridden.
- Fixed an issue where one of Deployer's APIs was not exposed with the MLOps API. Known and available deployment environments (that is, Kubernetes clusters) may now be queried with the MLOps API.
- BYOM containers can now have their log levels be globally configurable.
- Exposed a number of configuration fields for bundled third-party services.
- Reduced the number of Kubernetes API calls the deployment pipeline needs to make.
- Fixed an issue where a few dozen concurrent deployment processes could exhaust the maximum allowed connections originating from the Deployer service.
- Set scalability-minded options for resources deployed onto Kubernetes, significantly reducing CPU, memory, and network load at scale.
- Exposed configuration fields for many internal, as well as Kubernetes-facing, timeout options.
- Fixed a configuration issue that would cause Ambassador pods to be put up for eviction after only hundreds of models were deployed.
Documentation
- Added new page on node affinity and toleration.
- Added new page on Shapley values support.
- Added information on new Kubernetes options.
- Revised section on deploying models.
Version 0.52.1 (November 17, 2021)
New features
- Added support for Driverless AI (DAI) 1.10.0 (Supervised Models).
- Added new configuration options that let you push scoring data to a Kafka topic for monitoring purposes.
Improvements
- Experiments with metadata larger than 100 MB are now supported. The new limit is 1000 MB.
Version 0.52.0 (September 13, 2021)
New features
- Added support for Driverless AI (DAI) 1.9.3 Python pipelines.
- DAI Python pipelines must be imported either through the MLOps UI or programmatically by using the MLOps API to deploy. They cannot currently be deployed directly from the project.
- Ambassador timeout can now be configured per runtime with the `request-timeout` parameter in the Deployer configuration. Note that this parameter can also be set for any new BYOM runtime added to MLOps.
Improvements
- Added the ability to configure whether BYOM runtimes have write access to the volume hosting the model they are scoring.
- Exposed Terraform variables to make specifying custom BYOM entities easier.
- Added support for blob storages from public cloud storage services.
- Limited the number of error notifications displayed in the UI so that only one error is displayed at a time. Error notifications are now automatically cleared when the error condition disappears.
Version 0.51.0 (August 20, 2021)
Improvements:
- Implemented integration with Kafka for pushing scoring data.
Version 0.50.1 (August 04, 2021)
Improvements:
- Updated default Python runtimes with improved error handling.
- For secure environments, added a `terraform` flag for disabling BYOM.
Bug Fixes:
- For Python models, fixed a UI issue that caused complex deployments to be unsupported.
Version 0.50.0 (July 29, 2021)
New Features:
- Added support for third-party Python models.
- Currently tested and supported versions include scikit-learn 0.24.2, PyTorch 1.9.0, XGBoost 1.4.2, LightGBM 3.2.1 and TensorFlow 2.5.0.
- Added selectable artifact types and runtimes for all types of artifacts and models.
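As a rough illustration of how the tested versions listed above might be checked in a local environment, the following sketch compares installed package versions against that list. The distribution names (for example, torch for PyTorch) are the usual PyPI names and are an assumption here, and the helper functions are hypothetical; they are not part of MLOps.

```python
from importlib import metadata

# Versions listed as tested in the 0.50.0 notes, keyed by PyPI distribution
# name (the distribution names are an assumption, not taken from the notes).
TESTED_VERSIONS = {
    "scikit-learn": "0.24.2",
    "torch": "1.9.0",
    "xgboost": "1.4.2",
    "lightgbm": "3.2.1",
    "tensorflow": "2.5.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(installed, tested=TESTED_VERSIONS):
    """Return {name: (installed, tested)} for installed packages that differ."""
    return {
        name: (installed[name], expected)
        for name, expected in tested.items()
        if installed.get(name) is not None and installed[name] != expected
    }


if __name__ == "__main__":
    current = {name: installed_version(name) for name in TESTED_VERSIONS}
    for name, (have, want) in find_mismatches(current).items():
        print(f"{name}: installed {have}, tested with {want}")
```

The comparison is an exact string match, so builds with a local version suffix are reported as mismatches. A different version is not necessarily unsupported; it is only untested according to these notes.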
Improvements:
- Added new deployer endpoints for creating, listing and deleting deployments.
- Changed the MLOps client package name from mlops to h2o_mlops_client.
- Renamed deployment template input variable from model_ingestion_image to model_ingest_image to be consistent with the image name.
- Renamed deployment template input variable from gateway_image to grpc_gateway_image to be consistent with the image name.
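Scripts that must run against both older and newer client installations may need to handle the package rename. The sketch below is one possible approach, assuming that the importable module names match the package names mlops and h2o_mlops_client; the helper import_first_available is hypothetical and is not part of the client.

```python
import importlib


def import_first_available(module_names):
    """Import and return the first module in module_names that is installed."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ModuleNotFoundError as exc:
            if exc.name != name:
                # The module exists but one of its dependencies is missing;
                # surface that error instead of silently falling back.
                raise
    raise ImportError(
        f"none of these modules are installed: {', '.join(module_names)}"
    )


if __name__ == "__main__":
    # New name first, old name as a fallback (module names are assumed).
    try:
        client = import_first_available(["h2o_mlops_client", "mlops"])
        print(f"using {client.__name__}")
    except ImportError as err:
        print(err)
```

Trying the new name first means the fallback is used only in environments that still have the pre-0.50.0 package installed.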
Version 0.41.2 (June 2021)
Improvements:
- Added support for Driverless AI 1.9.3 MOJOs.
Version 0.41.1 (June 2021)
Improvements:
- Improved deployer logging.
Bug Fixes:
- Fixed an issue that caused installation through Terraform to not provide MLOps with all required configuration.
Version 0.41.0 (May 25, 2021)
Improvements:
- Added drag-and-drop option for importing Driverless AI MOJOs.
Bug Fixes:
- Fixed an issue that caused a broken download link to be generated for the MLOps gRPC-Gateway image.
Documentation:
- Added info on Driverless AI version compatibility.
- Added info on the MLOps API URL.
- Added info on the Token Endpoint URL.
Version 0.40.1 (March 15, 2021)
Improvements:
- Added alert messages to Grafana.
- Added pagination support for Project and DeployEnvironment list retrievals.
- Improved Model Fetcher logging.
Bug Fixes:
- Fixed an issue where some Model Fetcher processes were not checked for errors.
- Fixed an issue where some deployments got stuck in the Init phase when too many deployments started or restarted at the same time.
- Fixed a UI inconsistency between the Deployments and Models sections when no entries were displayed.
- Fixed a UI issue that caused the Add new project window to remain on the screen after successfully creating a project.
- Fixed an issue that allowed users to be registered without a username.
- Fixed an issue that caused models with one or more typos in their metadata to fail when deploying.
- Fixed an issue where H2O-3 models could not be deployed.
- Fixed an issue where some Driverless AI 1.9.1 models could not be deployed.
Version 0.40.0 (January 14, 2021)
New Features:
- Added Python API support.
Improvements:
- Added Model Fetcher to deploy scorers without a persistent volume.
Bug Fixes:
- Fixed an issue where the deployer remained in the 'Preparing' state indefinitely when a model had an unsupported transformer.
- Fixed an issue where models appeared in projects that they did not belong to.
- Fixed an issue that caused model selection to persist between different projects.
- Fixed an issue where the deployer did not clean up after fetching artifacts.
- Fixed an issue where certain menu items on the Projects page did not work as intended.
Version 0.31.3 (December 02, 2020)
Improvements:
- Driverless AI instances can now be run in a different namespace from the storage namespace.
- Users can now override the default ingress class.
Bug Fixes:
- Fixed an issue that stopped project summary alerts from being updated.
Version 0.31.2 (November 11, 2020)
Improvements:
- Removed the restriction of one PROD model per project.
- Added a demo mode to the Studio page so that the default is more secure.
- Added optional password protection for Grafana.
Bug Fixes:
- Fixed an issue that stopped project summary alerts from being updated correctly.
- Fixed an issue that caused the alerts page to crash when a deployment had multiple alerts of mixed types.
- Fixed an issue that stopped the number of model pages from being updated correctly.
- Fixed an issue that caused all metadata to be fetched when listing experiments.
Version 0.31.1
Skipped and rolled into 0.31.2.
Version 0.31.0 (October 21, 2020)
New Features:
- Added model endpoint security. Users can enable and configure authentication when deploying a model.
Bug Fixes:
- Fixed an issue where the sample cURL request for an endpoint with a hashed passphrase did not have an input box.
- Fixed an issue where single character passphrases were ignored.
- Fixed an issue where the set passphrase dialog did not appear for Champion/Challenger and A/B deployments.
Version 0.30.1 (October 08, 2020)
New Features:
- Added user-friendly H2O-3 model import support.
Improvements:
- Improved sorting and pagination.
Bug Fixes:
- Fixed issues with the deployments list.
- Set default page size for lists to 10 pages.
- Various bug fixes.
Version 0.30.0
Skipped and rolled into 0.31.1.
Version 0.22.0 (July 30, 2020)
Bug Fixes:
- Fixed an issue where UI elements overlapped in Firefox.
- Fixed an issue where users could not log back in to MLOps after the session cookie expired.
- Fixed an issue where the Ambassador pod failed to start.
Version 0.21.1 (July 07, 2020)
Bug Fixes:
- Made the software version number visible in the UI.
- Added table pagination according to deployment pipeline design.
- Fixed an issue that caused the model actions drop-down menu to appear empty.
- Fixed an issue where models linked from Driverless AI could not be deleted.
- Fixed an issue where unfinished Driverless AI experiments could not be linked.
- Made the delete action unavailable to users with the Reader role.
- Fixed an issue where deployments were reported as having failed after pods were restarted.
- Fixed an issue where the scoring data for an experiment linked to more than one project was not stored in InfluxDB.
Version 0.21.0 (June 12, 2020)
New Features:
- Added drift detection analysis for models.
- Added A/B testing to compare the performance of two or more models.
- Added Champion/Challenger deployments.
Bug Fixes:
- Increased the default timeout for waiting for a pod to provision when deploying.
- Fixed an issue that stopped deployments from being listed for challenger models.
- Fixed an issue that caused MLOps to crash when a feature field was not found in the drift report.
- Fixed an issue that caused the A/B Test link to remain active when no model was selected.
- Fixed an issue on the Projects page that caused the delete model action to not work correctly.
- Fixed an issue in the Grafana dashboard that caused the scoring latency graph to appear as having no data.
- Fixed an issue that stopped collaborators from being able to create deployments when they were not restricted from doing so.
- For the Reader user role, fixed an issue that caused incomplete error messages to appear for failed user actions.
- Fixed an issue that caused the filtering option to disappear from the Models page.
- Fixed an issue where undeploying a model that was a part of multiple deployments did not work correctly.
- Fixed an issue that caused the 'More details' action to become activated when 'Monitoring' was selected from the Actions menu.
Version 0.20.1 (April 02, 2020)
Bug Fixes:
- Fixed an issue that stopped the user interface from accessing storage after restarting all pods.
- Fixed an issue that caused PostgreSQL data to be purged when the pod was restarted.
Version 0.20.0 (April 01, 2020)
- First stable release.