Version: v0.15.0

Drift detection

Drift detection is a validation test that identifies changes in the distribution of variables in your model's input data, so that you can act before model performance degrades.

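H2O Model Validation performs drift detection by comparing the training dataset with another dataset captured at a different time to assess how the data has changed. The Population Stability Index (PSI) is computed for each variable to measure how much its distribution has shifted between the two datasets. PSI is applied to numerical and categorical columns, but not to date columns. In its standard form, the PSI formula is as follows:

$$\mathrm{PSI} = \sum_{i=1}^{B} \left(A_i - E_i\right) \ln\!\left(\frac{A_i}{E_i}\right)$$

where $B$ is the number of bins (or categories) of the variable, $E_i$ is the proportion of rows that fall in bin $i$ of the training dataset, and $A_i$ is the proportion of rows that fall in bin $i$ of the other dataset.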

A higher PSI indicates greater drift in a variable. When variables that are important to the model have a high PSI, model performance is more likely to deteriorate, and the model may need to be retrained.
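To make the calculation concrete, the following is a minimal Python sketch of the standard PSI definition: the sum over bins of (A − E) · ln(A / E), where E and A are the per-bin proportions in the training dataset and the other dataset. It is not H2O Model Validation's implementation. The function names, the equal-width binning on the training range, the default of 10 bins, and the small `eps` floor used to avoid taking the logarithm of zero are all assumptions made for illustration.

```python
import math
from collections import Counter


def psi_from_counts(ref_counts, new_counts, eps=1e-6):
    """PSI between two mappings of bin label -> row count.

    ref_counts comes from the training (reference) dataset,
    new_counts from the dataset captured at a different time.
    """
    ref_total = sum(ref_counts.values())
    new_total = sum(new_counts.values())
    total = 0.0
    for label in set(ref_counts) | set(new_counts):
        # Per-bin proportions; eps is an assumed floor that avoids
        # log(0) and division by zero for empty bins.
        e = max(ref_counts.get(label, 0) / ref_total, eps)
        a = max(new_counts.get(label, 0) / new_total, eps)
        total += (a - e) * math.log(a / e)
    return total


def equal_width_edges(values, n_bins=10):
    """Interior bin edges that split the range of `values` evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * i for i in range(1, n_bins)]


def bin_numeric(values, edges):
    """Count how many values fall into each bin defined by `edges`."""
    counts = Counter()
    for v in values:
        # Bin index = number of interior edges strictly below the value.
        counts[sum(v > edge for edge in edges)] += 1
    return counts


def psi_numeric(ref, new, n_bins=10):
    """PSI for a numerical column; bins are derived from the training data."""
    edges = equal_width_edges(ref, n_bins)
    return psi_from_counts(bin_numeric(ref, edges), bin_numeric(new, edges))


def psi_categorical(ref, new):
    """PSI for a categorical column; each category is its own bin."""
    return psi_from_counts(Counter(ref), Counter(new))


if __name__ == "__main__":
    train_age = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
    later_age = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
    print("age PSI:", psi_numeric(train_age, later_age, n_bins=5))

    train_plan = ["basic"] * 50 + ["premium"] * 50
    later_plan = ["basic"] * 90 + ["premium"] * 10
    print("plan PSI:", psi_categorical(train_plan, later_plan))
```

Two identical samples give a PSI of exactly 0, and the value grows as the two distributions diverge. The binning scheme affects the result, so PSI values are only comparable when the same bins are used for both datasets.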