Version: v0.16.0

Settings: Segment performance

H2O Model Validation offers an array of settings for a segment performance test. Below, each setting is described in turn.

Test name

Defines the name of the validation test. By default, H2O Model Validation assigns a name to the test, which you can change.

Model

Defines the model H2O Model Validation utilizes to run the segment performance test.

Model train dataset

note

Model train dataset refers to one of the model's informational points, not a setting. This informational point refers to the model's training dataset.

Primary dataset

Defines the dataset H2O Model Validation utilizes to run the segment performance test. H2O Model Validation applies the model to the dataset, bins the data, and calculates the performance statistics.

To run a segment performance test on a model, H2O Model Validation uses the provided dataset to generate model predictions and assess their accuracy. H2O Model Validation splits the dataset into segments by the bins of values of every variable and every pair of variables, which shows how accurately the model predicts on different data segments. H2O Model Validation presents these results in a bubble graph that lets you observe and explore the data segments where the model struggles, performs adequately, or outperforms. For each segment, H2O Model Validation calculates the segment's size relative to the size of the dataset and estimates the error the model makes on that segment.
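The per-segment calculation described above can be illustrated with a minimal sketch. This is not H2O Model Validation's implementation: the function name `segment_performance`, the input layout (rows that are already binned), and the choice of mean absolute error as the error estimate are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def segment_performance(binned_rows, actual, predicted):
    """Summarize model error per segment (hypothetical helper).

    binned_rows: list of dicts mapping column name -> bin label.
    actual, predicted: target values and model predictions, one per row.
    Returns {(columns, bins): (relative_size, mean_abs_error)}.
    """
    columns = sorted(binned_rows[0])
    # Segments come from every single column and every pair of columns.
    groupings = [(c,) for c in columns] + list(combinations(columns, 2))

    errors = defaultdict(list)
    for row, y, y_hat in zip(binned_rows, actual, predicted):
        for cols in groupings:
            key = (cols, tuple(row[c] for c in cols))
            # Assumed error measure: absolute error of each prediction.
            errors[key].append(abs(y - y_hat))

    total = len(binned_rows)
    # Relative size is the share of rows that fall into the segment.
    return {
        key: (len(errs) / total, sum(errs) / len(errs))
        for key, errs in errors.items()
    }


rows = [
    {"age": 0, "plan": "a"},
    {"age": 0, "plan": "b"},
    {"age": 1, "plan": "a"},
    {"age": 1, "plan": "a"},
]
result = segment_performance(rows, [1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 3.0, 2.0])
for segment, (size, error) in sorted(result.items()):
    print(segment, size, error)
```

In a bubble graph built from output like this, each segment would be one bubble, with the relative size and the error as its two plotted quantities.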

note

The defined primary dataset needs to follow the model's training dataset format.

Columns to drop

Defines the columns H2O Model Validation drops when assessing the data segments.

Number of bins

Defines the number of bins H2O Model Validation utilizes to split the variable values of the primary dataset. For a categorical column, H2O Model Validation uses the column's categories as the bins; for a numerical column, it divides the values into the specified number of bins.
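As a rough illustration of how the number of bins affects the two column types, here is a minimal sketch. The helper `bin_column` is hypothetical, and equal-width binning is an assumption: this page does not state which binning strategy H2O Model Validation applies to numerical columns.

```python
from numbers import Number


def bin_column(values, n_bins):
    """Return one bin label per value (hypothetical helper).

    Categorical columns keep their categories as bins. Numerical columns
    are cut into `n_bins` equal-width intervals (an assumed strategy).
    """
    is_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    if not is_numeric:
        # Categorical: n_bins is ignored, each category is its own bin.
        return [str(v) for v in values]

    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column collapses into a single bin.
        return [0 for _ in values]

    width = (hi - lo) / n_bins
    # Clamp the index so the maximum value falls into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


numeric_bins = bin_column([1, 2, 3, 4], 2)
category_bins = bin_column(["basic", "premium", "basic"], 2)
print(numeric_bins, category_bins)
```

A larger number of bins gives finer numerical segments, but each segment then holds fewer rows, so its error estimate rests on less data.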


