Version: v0.17.0

Settings: Drift detection

Overview

H2O Model Validation offers an array of settings for a drift detection test. Below, each setting is described in turn.

Settings

Test name

This setting defines the name of the validation test. By default, H2O Model Validation assigns a name to the validation test, which you can change.

Primary dataset

This setting defines one of the two datasets H2O Model Validation uses during the validation test to identify changes in the distribution of variables between the primary and secondary datasets. H2O Model Validation performs drift detection using the primary and secondary datasets captured at different times to assess how data has changed over time.

note

Models: When validating a model, the primary dataset must follow the structure of the model's training dataset.

Secondary dataset

This setting defines one of the two datasets H2O Model Validation uses during the validation test to identify changes in the distribution of variables between the primary and secondary datasets. H2O Model Validation performs drift detection using the primary and secondary datasets captured at different times to assess how data has changed over time.

note

The primary dataset dictates the required format of the secondary dataset: both datasets need similar columns.
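To make the idea of comparing variable distributions between two datasets more concrete, the following is a minimal sketch in plain Python. It is not H2O Model Validation's implementation or API: the population stability index (PSI) is used here only as one common drift measure, and the names `psi` and `drift_report`, the dictionary-of-lists dataset layout, and the bin count are all assumptions made for illustration.

```python
import math


def psi(primary, secondary, bins=10):
    """Population stability index between two numeric samples (illustrative only)."""
    lo, hi = min(primary), max(primary)
    # Bin edges come from the primary sample; fall back to 1.0 if it is constant.
    width = (hi - lo) / bins or 1.0

    def shares(values):
        counts = [0] * bins
        for v in values:
            # Values outside the primary range are clamped into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    p, s = shares(primary), shares(secondary)
    return sum((si - pi) * math.log(si / pi) for pi, si in zip(p, s))


def drift_report(primary_ds, secondary_ds):
    """Score every column that appears in both datasets (hypothetical helper)."""
    shared = [name for name in primary_ds if name in secondary_ds]
    return {name: psi(primary_ds[name], secondary_ds[name]) for name in shared}


# Hypothetical data: the same two columns captured at two different times.
primary_ds = {"age": list(range(20, 70)), "income": [1000 + 10 * i for i in range(50)]}
secondary_ds = {"age": list(range(20, 70)), "income": [1300 + 10 * i for i in range(50)]}

report = drift_report(primary_ds, secondary_ds)
for column, score in report.items():
    print(column, round(score, 4))
```

In this sketch a score of zero means the two samples fall into the bins in identical proportions, and larger scores mean the secondary sample has moved further from the primary one. Only columns present in both datasets are scored, which mirrors the requirement that the secondary dataset follow the format of the primary dataset.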

Columns to drop

This setting defines the columns H2O Model Validation drops during the validation test. Typically, these are columns that can indicate drift without affecting the model, such as columns the model does not use, record IDs, and time columns.
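As a rough illustration of why such columns are dropped, the sketch below removes a record ID and a timestamp column before any comparison takes place. It is an assumption-laden example rather than the product's behavior: the helper `drop_columns`, the column names, and the dictionary-of-lists dataset layout are all hypothetical.

```python
def drop_columns(dataset, columns_to_drop):
    """Return a copy of the dataset without the listed columns (hypothetical helper)."""
    to_drop = set(columns_to_drop)
    return {name: values for name, values in dataset.items() if name not in to_drop}


# Hypothetical dataset: record IDs and timestamps always differ between
# two captures, so they would signal drift that says nothing about the model.
dataset = {
    "record_id": [101, 102, 103],
    "captured_at": ["2023-01-01", "2023-01-02", "2023-01-03"],
    "age": [34, 51, 28],
}

kept = drop_columns(dataset, ["record_id", "captured_at"])
print(sorted(kept))
```

The original dataset is left unmodified, so the same list of columns can be applied to both the primary and the secondary dataset.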

