Release notes
Version 0.69.7 (Feb 17, 2025)
Fixes
- [Deployer] Resolved an issue where resource limit specifications were not correctly applied to runtime processors.
Version 0.69.6 (Feb 13, 2025)
Fixes
- [Security] Applied security patches from the latest major release.
- [Deployer] Fixed an issue by adding the missing volume mount for Kafka TLS-enabled deployments.
Version 0.69.5 (Feb 6, 2025)
Fixes
- [Wave App] Prevented redirection to #projects while filling out the create deployment form.
Version 0.69.4 (Jan 21, 2025)
Fixes
- [Helm Chart] Updated the InfluxDB network policy to allow connections from pods with any of the required labels.
Version 0.69.3 (Jan 17, 2025)
Fixes
- [Wave UI] Fixed an issue where Driverless AI version 1.11.1.1 was incorrectly displayed as 1.11.1 in the UI.
Version 0.69.2 (Jan 14, 2025)
This release includes new features and fixes.
New features
- [Deployment Updater] Added functionality to update the image repository during deployment update jobs.
Fixes
- [Deployer] Fixed CVE-2023-3635.
Version 0.69.1 (Jan , 2025)
This release includes a new feature.
New features
- [Runtimes] Added support for the Driverless AI runtime version 1.11.1.1.
Version 0.69.0 (Dec 19, 2024)
This release includes new features and fixes.
New features
- [Runtimes] Added support for MLflow Model Scorer runtime for Python 3.11 and 3.12, and Dynamic MLflow Model Scorer runtime for Python 3.12.
- [Runtimes] Added support for H2O Driverless AI runtime version 1.10.7.3.
- [Runtimes] Exposed Kubernetes readiness probe on deployments.
- [UI] Replaced bcrypt with PBKDF2 hashing when creating deployments with the Passphrase (Stored Hashed) security option.
- [Python Client] Added support for SSL settings via the Client constructor.
- [Python Client] Added support for PBKDF2 hashed security option.
- [Deployer] Introduced PBKDF2-based passphrase hashing for improved security.
- [Deployer] Added support for Generic Ephemeral Volumes in the Runtimes.
- [Deployer] Introduced /readyz readiness probe endpoint for dynamically deployed runtimes.
- [Deployer] Introduced a pod disruption budget for enhanced stability.
- [Helm] Enabled JVM config passing to the monitoring proxy.
- [Helm] Added support for configuring limits and JVM settings in the deployer.
- [Helm] Defined MLOPS_WAVE_APP_URL as an environment variable for better configuration.
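As a rough illustration of the PBKDF2-based passphrase hashing mentioned above, the following is a minimal sketch using Python's standard library. The function names, the hash algorithm (SHA-256), the iteration count, and the salt length are all assumptions chosen for illustration; they are not taken from H2O MLOps and may differ from what the Deployer, UI, or Python client actually use.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed parameters for illustration only; not taken from H2O MLOps.
ITERATIONS = 100_000
SALT_BYTES = 16


def hash_passphrase(passphrase: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        # A fresh random salt per passphrase.
        salt = os.urandom(SALT_BYTES)
    key = hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_passphrase(passphrase: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_passphrase(passphrase, salt)
    return hmac.compare_digest(key, expected_key)
```

Only the salt and the derived key would be stored; the passphrase itself is never persisted.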
Fixes
- [UI] Ensured UI accessibility even when listing deployments fails.
- [UI] Fixed the issue where signing in from the access denied page resulted in a Missing parameters: id_token_hint error.
- [Monitor Proxy] Stopped sending TransactionTransmission events to downstream transmitters when enableTransaction is false.
- [Python Client] Added ValueError for missing or invalid protocol in the gateway URL.
- [Python Client] Configured SSL settings for the functions get_capabilities, get_sample_request, and get_schema since they access deployment endpoints.
- [Storage] Removed storage cleanup cron job and implemented it within a thread of storage itself.
- [Helm] Ensured proper RBAC configuration when multiple groups are specified.
- [Helm] Removed legacy LOCAL and MIGRATE mode code.
- [Runtimes] Fixed memory leak in MOJO2 runtimes by upgrading the internal MOJO2 library.
- [Deployer] Ensured that stale deployments will be redeployed.
- [Deployer] Skipped routing migration in cases of errors not related to deployment migration.
- [Deployer] Used response header modifier instead of request header modifier for CORS.
- [Deployer] Added configurable Kubernetes client timeout for better performance and reliability.
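The Python Client fix above (raising ValueError for a missing or invalid protocol in the gateway URL) can be pictured with a small sketch. This is not the client's actual code: the function name validate_gateway_url and the set of accepted schemes are hypothetical, and only the standard library is used.

```python
from urllib.parse import urlparse

# Assumption: only these schemes are considered valid for a gateway URL.
ALLOWED_SCHEMES = {"http", "https"}


def validate_gateway_url(url: str) -> str:
    """Return the URL unchanged if it has a valid protocol and host, else raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(
            f"Gateway URL {url!r} must start with one of: "
            + ", ".join(f"{s}://" for s in sorted(ALLOWED_SCHEMES))
        )
    if not parsed.netloc:
        raise ValueError(f"Gateway URL {url!r} is missing a host")
    return url
```

Failing early with a clear message avoids confusing connection errors later, when the first request is made.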
Version 0.68.0 (Nov 05, 2024)
This release includes new features and fixes.
New features
- [UI] Enabled model and model version deletion.
- [UI] Enabled use of the default deployment security option from the backend.
- [UI] Added support for H2O Driverless AI runtime versions 1.10.7.2 and 1.11.1.
- [Python Client] Introduced a timeout parameter (default: 5 seconds) for MLOpsScoringDeployment's methods: get_capabilities, get_sample_request, and get_schema.
- [Python Client] Added support for creating deployment with token-based authentication as a security option.
- [Python Client] Enabled model deletion.
- [Python Client] Enabled the option to unregister an experiment from a model.
- [Python Client] Introduced the disabled_security option to manage deployments with No-Security.
- [Storage] Storage only supports blob storage from this release onwards. A one-time migrator job was introduced to migrate all the storage data from K8S PVC to blob storage to support seamless upgrades for users.
- [Telemetry] The MLOps-Telemetry component is no longer running as a cron job; it is now a long-running microservice that publishes event data at scheduled intervals.
- [Helm] MLOps storage can be configured to use blob storage from any of the three main clouds: AWS, Azure, and GCP. Minio is also supported for on-premise installations.
- [Helm] Added H2O_SCORER_MODEL_LOADING_MODE set to "subprocess" across all MLOps Python-based runtimes.
- [Helm] Introduced a migration job for transferring persistent storage to cloud platforms, now supporting Minio and Azure Blob.
- [Helm] Introduced a SCHEDULER_INTERVAL_SECONDS environment variable to configure the interval of mlops-telemetry events publishing.
- [Deployer] Introduced Vertical Pod Autoscaling (VPA) support.
- [Deployer] Exposed easy access to the security options available in the cluster.
- [Deployer] Restructured environment security options:
  - Activated security options list
  - Configurable default security option
- [Deployer] Introduced the No-Security option.
Fixes
- [UI] Resolved an error occurring when attempting to view experiment details for experiments with missing metadata.
- [UI] Made the maximum selectable count for deployment replicas configurable.
- [UI] Removed support for MLflow Model Scorer and Dynamic Model Scorers for Python 3.8.
- [UI] Removed support for HT Flexible Runtimes for Python 3.8, including both GPU and CPU variants.
- [Python Client] Improved handling of missing deployment attributes (security and monitor) in backend responses.
- [Python Client] Upgraded the minimum supported Python version to 3.9.
- [gRPC Gateway] Updated /healthz to return a 200 status if at least one health check passes, fixing an issue where the gateway would restart if any service was unhealthy.
- [Helm] Removed Python 3.8 support for HT and MLFlow runtimes.
- [Helm] Removed the EnableUserExternalIDUpdate environment variable from storage for simpler configuration.
- [Helm] Added a -job suffix to the app.kubernetes.io/component label for the monitoring backend job to improve component labeling.
- [Helm] Updated rclone configurations to enhance compatibility with Google Cloud Storage (GCS).
- [Helm] Set the telemetry service’s replica count to one to optimize resource usage.
- [Helm] Changed the telemetry scheduler’s default interval to 300 seconds for more efficient scheduling.
- [Storage] IDP_ID (i.e. Keycloak/Okta ID) is now used as the primary key for the Users table in MLOps Storage. The username is no longer a unique field. Existing user data will be migrated accordingly by Storage itself when it starts up.
- [Deployer] Only "internal" gRPC statuses are now logged at the ERROR level.
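The /healthz change in the gRPC Gateway fix above (return 200 when at least one health check passes) can be sketched as follows. This is an illustrative model only: the function name, the shape of the checks mapping, and the 503 status used when every check fails are assumptions, not the gateway's actual implementation.

```python
from typing import Callable, Dict, Tuple


def healthz_status(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, Dict[str, bool]]:
    """Run every check and return (HTTP status, per-check results).

    The status is 200 if at least one check passes. 503 for "all failed"
    is an assumption made for this sketch.
    """
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises is treated as unhealthy.
            results[name] = False
    status = 200 if any(results.values()) else 503
    return status, results
```

With this rule, a single unhealthy backend service no longer makes the probe fail, so the gateway is not restarted for a problem it cannot fix by restarting.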
Version 0.67.4 (Oct 10, 2024)
This release includes various fixes.
Fixes
- [Helm] Gateway creation is now skipped when Values.gatewayApi.create is set to false.
- [Helm] You can now specify extra ingress for Influx.
Version 0.67.3 (Oct 01, 2024)
This release includes new features and fixes.
New features
- [Runtimes] Added support for the Driverless AI 1.11.1 Python scoring pipeline.
Fixes
- [Security] Fixed critical vulnerabilities in the Java-based REST scorer and monitoring proxy.
- [Helm] Ensured that the registry specification on each image takes priority over the global image registry configuration.
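The registry precedence described in the Helm fix above can be modeled in a few lines. The real logic lives in Helm templates; the Python function below is only a sketch of the precedence rule, and its name, parameters, and the example registry hosts are hypothetical.

```python
from typing import Optional


def resolve_image(
    repository: str,
    tag: str,
    image_registry: Optional[str] = None,
    global_registry: Optional[str] = None,
) -> str:
    """Build an image reference, preferring the per-image registry over the global one."""
    # The per-image value wins; the global value is only a fallback.
    registry = image_registry or global_registry
    reference = f"{repository}:{tag}"
    if registry:
        return f"{registry.rstrip('/')}/{reference}"
    return reference
```

For example, an image that sets its own registry keeps it even when a global registry is configured, while images without one fall back to the global value.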
Version 0.67.2 (Sep 19, 2024)
This release includes new features and fixes.
New features
- [Runtimes] Added support for the Driverless AI 1.10.7.2 Python scoring pipeline.
Fixes
- [Helm] Removed hard-coded dev/vorvan prefix.
- [Helm] Added a label that was missing from the Influx network policy, which had led to the cleanup job not running.
Version 0.67.1 (Sep 13, 2024)
This release includes various fixes.
Fixes
- [Monitoring Backend] Updated Dockerfile to use numerical user ID, preventing false warnings in systems that check for root access.
- [Drift] Fixed an issue where the worker image could not find the datatable dependency.
Version 0.67.0 (Sep 02, 2024)
This release includes various vulnerability fixes.
New features
- [Monitor Proxy] Per project monitoring data retention period can be set for Influx DB during the MLOps installation or upgrade.
- [UI] Added the functionality to log out from the Wave app.
- [UI] Added support for new HT Flexible Runtimes for Python 3.10, including GPU and CPU variants.
- [UI] Added support for DAI runtime versions 1.10.6.3, 1.10.7.1, and 1.11.0.
- [UI] Added support for MLflow Model Scorer and Dynamic Model Scorers for Python 3.10 and 3.11.
- [Deployer] Added support for token based authentication for deployments.
- [Runtimes] Added support for DAI runtime versions 1.10.6.3, 1.10.7.1, and 1.11.0.
- [Runtimes] Added support for new HT Flexible Runtimes for Python 3.10.
- [Helm Chart] Added component configuration support for applying tolerations, node selectors, and affinity settings to cron jobs.
- [Helm Chart] Added CA certificate support to the API Gateway deployment.
- [Helm Chart] Replaced Ambassador with Gateway API due to the removal of Emissary.
Fixes
- [UI] Removed the functionality for importing models from external model repositories.
- [UI] Removed the ability to upload experiments as serialised Python (.pkl/.pickle) files.
- [UI] Disallowed the creation of tags with commas.
- [UI] Reduced the timeout for notification bars.
- [UI] Fixed the issue where a red cross appeared when registering a model shortly after creating an experiment.
- [UI] Removed support for DAI Python runtimes for 1.10.4.3 and older versions.
- [UI] Removed support for MLFlow Model Scorer for Python 3.6 and Python 3.7.
- [Runtimes] Removed support for DAI Python runtimes for 1.10.4.3 and older versions.
- [Python Client] The external_registry package has been removed.
- [Runtimes] Fixed critical vulnerabilities in all runtimes except the DAI Python-based one.
- [Runtimes] Fixed critical and high vulnerabilities in the REST scorer.
- [Helm Chart] Corrected the nodeSelector YAML formatting.
- [Helm Chart] Renamed environment variable STORAGE_URL to API_GATEWAY_URL in the Wave app.
- [Helm Chart] Updated H2O_WAVE_POST_REDIRECT_URL to resolve "Page Not Found" errors when logging out from the Wave app.
- [Helm Chart] Updated the Wave app secret H2O_WAVE_OIDC_END_SESSION_URL for improved logout functionality.
- [Helm Chart] The enable_user_externalid_update setting is now configurable.
- [Helm Chart] Exposed resource requests and limits for monitoring-drift, model-ingest, and api-gateway components.
Version 0.66.1
This release includes various vulnerability fixes.
New features
- Released Base Python Scorer v1.2.0 (BYOM).
- Released Python based runtimes v1.2.0 (BYOM).
- Released HT runtime v1.2.0 (BYOM).
- Released MLflow runtime v1.2.0 (BYOM).
Version 0.66.0 (June 04, 2024)
This release includes new features, improvements, bug fixes, and security improvements.
Announcement
This version of H2O MLOps adds optional role-based access control (RBAC). This feature relies on two RBAC configurations: one for the front end (FE) and the other for the back end (BE). When using RBAC, both configurations must be set up identically to ensure proper functionality and a seamless user experience.
The following is an illustrative configuration for both FE and BE RBAC. Note that in this example, it is assumed that "admin" is included in the user access token groups claim. However, it is important to customize this configuration based on your specific requirements at the time of deployment.
apiGateway:
  config:
    # -- Log verbosity.
    logLevel: "debug"
    authorization:
      # -- Whether authorization is enabled.
      enabled: true
      # -- JWT claim key which contains the role/group information.
      userJwtClaim: "groups"
      # -- List of role/group values that should have access to MLOps API Gateway as an array.
      allowedUserRoles: ["admin"]
waveApp:
  authorization:
    # -- Whether authorization is enabled.
    enabled: true
    # -- JWT claim key which contains the role/group information.
    userJwtClaim: "groups"
    # -- Comma separated list of role/group values that should have access to MLOps FE.
    allowedUserRoles: "admin"
    # -- Separator character for role/group values in JWT claim.
    userRoleValueSeparator: ","
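To make the intent of these settings concrete, the following is a minimal sketch of how a role check driven by userJwtClaim, allowedUserRoles, and userRoleValueSeparator could behave. It assumes the token has already been verified and decoded into a dictionary of claims. The function names are hypothetical and this is not the actual API Gateway or Wave app implementation.

```python
from typing import Any, Dict, Iterable, List, Union

RoleValue = Union[str, Iterable[str], None]


def parse_roles(value: RoleValue, separator: str = ",") -> List[str]:
    """Normalize a claim or config value (array or separated string) into a list of roles."""
    if value is None:
        return []
    if isinstance(value, str):
        return [role.strip() for role in value.split(separator) if role.strip()]
    return [str(role).strip() for role in value if str(role).strip()]


def is_authorized(
    claims: Dict[str, Any],
    user_jwt_claim: str,
    allowed_user_roles: RoleValue,
    separator: str = ",",
) -> bool:
    """Return True if the user holds at least one allowed role.

    `claims` must come from an already verified token; this sketch does
    not validate signatures or expiry.
    """
    user_roles = set(parse_roles(claims.get(user_jwt_claim), separator))
    allowed = set(parse_roles(allowed_user_roles, separator))
    return bool(user_roles & allowed)
```

Because the back end takes an array and the front end takes a separated string, normalizing both forms to the same list is one way to keep the two configurations equivalent.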
New features
- Added optional role-based access control (RBAC). You can now limit access to H2O MLOps APIs to users with specific roles.
- Released Base Python Scorer v1.1.1 (BYOM).
- Released Python-based runtimes v0.6.5 (BYOM).
- Released HT runtime v0.6.5 (BYOM).
- Released MLflow runtime v0.6.5 (BYOM).
- Exposed the authorization header with bearer token through environment variables in the base scorer (BYOM).
- You can now set a configurable read timeout in the scorer proxy (Monitor Proxy).
- Added a deployer-configurable monitor proxy timeout (Deployer).
- Upgraded base images to Java 17 eclipse-temurin:17.0.10_7-jre (Deployer).
Bug fixes
- Handled the error that occurred while fetching additional details of the events related to an experiment after deleting a deployment.
Version 0.65.1 (May 25, 2024)
This release is a minor release on top of v0.65.0 with the storage and telemetry features rebuilt using the latest zlib.
Version 0.65.0 (May 08, 2024)
This release includes new features, improvements, bug fixes, and security improvements.
New features
- Added the capability to disable model monitoring features when deploying H2O MLOps. Set the monitoring_enabled installation parameter to false to disable the following:
  - monitoring service
  - monitoring task
  - drift worker
  - drift trigger
  - InfluxDB
  - RabbitMQ
  Note that setting the monitoring_enabled parameter to false also disables health checks for the monitoring backend.
- Support for FEDRAMP compliance.
- Added an option to restrict model imports to specific types.
- Added support for vLLM config model types.
- When creating a deployment, added a deployment multi-issuer token security option. For more information, see Endpoint security.
- The ListProjects API now returns all projects for admin users.
- You can now set a specific timeout only for external registry import API.
- You can now upload LLM experiments using the MLOps Wave app.
- Added support for Authz user format.
Improvements
- By default, the model monitoring page is now sorted by deployment name.
- You can now configure the supported experiment types for the upload experiment flow.
Bug fixes
- Fixed incorrect sorting in the listMonitoredDeployments API response.
Version 0.64.0 (April 08, 2024)
New features
- MLflow Dynamic Runtime: Added support for Python 3.10.
- Added support for DAI 1.10.7 and 1.10.6.2 runtimes.
- Upgraded Rest scorer to Spring Boot 3 (1.2.0).
- Added vLLM runtime support.
- When creating a new deployment, added an option to disable monitoring for the deployment.
- Added validation for experiment file uploading.
- Extended scoring API with new endpoint /model/media-score to support uploading multiple media files.
- The H2O Hydrogen Torch runtime is now supported with the ability to score image and audio files against the new endpoint /model/media-score.
- The project page now includes an Events tab with pagination, search, and sorting. For more information, see Project page tabs.
- You can now delete experiments.
- Added pagination, search, sorting, and filtering by Tag on the Experiments page.
- The Create Deployment workflow now automatically populates K8s limits and requests with the suggested default settings.
- The deployment state is now updated dynamically on the Deployments page.
- Additional details about error deployment states are now displayed in the MLOps UI.
- You can now update and delete tags. Note that tags can only be deleted if they are not associated with any entity.
Improvements
- You can now edit the GPU request/limit fields.
- When creating a deployment, improved automatic population of the Kubernetes resource requests and limits fields in the UI based on the selected runtime and artifact type.
- H2O Driverless AI versions are now automatically identified when DAI models are uploaded through the Wave app or Python client.
- The deployment overview now displays additional details about errored deployment states.
Version 0.62.5
In addition to the changes included in the 0.64.0 release, this release includes the following changes:
Improvements
- The Deployer API now lets you create and update deployment settings related to what monitoring data you want to save. For example:
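# Disable monitoring and the storing of scoring transactions for an existing deployment.
deployment.monitor_disable = True
deployment.store_scoring_transaction_disable = True
# Send the updated deployment to the Deployer API.
deployment = mlops_client.deployer.deployment.update_model_deployment(
    mlops.DeployUpdateModelDeploymentRequest(deployment=deployment)
)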
Changes
- Model Monitoring is now disabled by default for new deployments.
Known issues
- Monitoring settings can only be modified using the Python client, regardless of whether they were initially set via the UI or Python client.
- H2O MLOps version 0.62.5 cannot be upgraded to version 0.64.0. Upgrades from this version can only be made to version 0.65.0 and later.
Version 0.62.4
Improvements
- Various security improvements to address XSS security issues.
Version 0.62.1
New features
- You can now use the ListExperiments API to filter experiments by status (ACTIVE, DELETED). By default, the API returns ACTIVE experiments.
Improvements
- Added support for the DAI 1.10.6.1 runtime.
- Added pagination support in the Experiments page.
Bug fixes
- Fixed an issue where uploading large artifacts (above 40GB) resulted in an error.
- Fixed an issue where a registered model with the same name as a deleted model could not be created.