Permutation Variable Importance
Permutation variable importance is obtained by measuring the distance between prediction errors before and after a feature is permuted; only one feature at a time is permuted.
The model is scored on a dataset D; this yields some metric value orig_metric for metric M.
Permutation variable importance of a variable V is calculated by the following process:
1. Variable V is randomly shuffled using the Fisher-Yates algorithm.
2. The model is scored on the dataset D with the variable V replaced by the result from step 1. This yields some metric value perm_metric for the same metric M.
3. Permutation variable importance of the variable V is then calculated as abs(perm_metric - orig_metric).
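The steps above can be sketched in a few lines of NumPy. This is a minimal illustration and not the library's implementation: the function names (rmse, permutation_importance_single), the toy model, and the use of RMSE as the metric are assumptions made for the example, and NumPy's Generator.permutation stands in for the shuffle.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D arrays."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))


def permutation_importance_single(predict, X, y, column, metric=rmse, seed=0):
    """Return abs(perm_metric - orig_metric) for one column of X.

    predict is any callable mapping a 2-D array to predictions
    (a hypothetical stand-in for a trained model).
    """
    rng = np.random.default_rng(seed)
    # Score the model on the untouched data.
    orig_metric = metric(y, predict(X))
    # Step 1: shuffle only the chosen column, leaving the others intact.
    X_perm = X.copy()
    X_perm[:, column] = rng.permutation(X_perm[:, column])
    # Step 2: score again on the data with the permuted column.
    perm_metric = metric(y, predict(X_perm))
    # Step 3: the importance is the absolute change in the metric.
    return abs(perm_metric - orig_metric)


# Toy setup: the target depends only on column 0.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 2))
y = 3.0 * X[:, 0]


def predict(data):
    return 3.0 * data[:, 0]


imp_used = permutation_importance_single(predict, X, y, column=0)
imp_unused = permutation_importance_single(predict, X, y, column=1)
```

In this toy setup, permuting the column the model ignores leaves the predictions unchanged, so its importance is zero, while permuting the column the model relies on increases the error.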
The metric M can be set with the metric argument. When the metric is selected automatically, AUC is used for binary classification, logloss is used for multinomial classification, and RMSE is used for regression.
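The per-problem metric choice described above amounts to a simple lookup. The sketch below is illustrative only: resolve_default_metric and the problem-type labels ("binomial", "multinomial", "regression") are hypothetical names chosen for this example, not part of the library.

```python
def resolve_default_metric(problem_type):
    """Map a problem type to the metric used when none is chosen explicitly.

    The keys are hypothetical labels used only in this sketch.
    """
    defaults = {
        "binomial": "AUC",          # binary classification
        "multinomial": "logloss",   # multinomial classification
        "regression": "RMSE",       # regression
    }
    if problem_type not in defaults:
        raise ValueError(f"Unsupported problem type: {problem_type!r}")
    return defaults[problem_type]


binary_metric = resolve_default_metric("binomial")
regression_metric = resolve_default_metric("regression")
```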
model: A trained model that will be used to score the dataset.
frame: The dataset to use. Both the training and the test frame can be reasonable choices, but the interpretation differs (see "Should I Compute Importance on Training or Test Data?" in Interpretable Machine Learning by Christoph Molnar).
metric: The metric to be used to calculate the error measure. One of
PR_AUC. Defaults to
n_samples: The number of samples to be evaluated. Use -1 to use the whole dataset. Defaults to 10 000.
n_repeats: The number of repeated evaluations. Defaults to 1.
features: The features to include in the permutation importance. Use None to include all.
seed: The seed for the random generator. Use -1 to pick a random seed. Defaults to -1.
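To make the roles of n_samples, features, and seed concrete, here is a hedged sketch of how such arguments could be interpreted before any permutation takes place. The helper prepare_inputs is hypothetical, and two behaviours are assumptions of this sketch rather than documented facts: rows are sampled without replacement, and the whole dataset is used when n_samples exceeds the number of rows.

```python
import numpy as np


def prepare_inputs(X, feature_names, n_samples=10_000, features=None, seed=-1):
    """Pick the rows and features that a permutation importance run would use."""
    # seed == -1 means "pick a random seed": default_rng(None) draws fresh entropy.
    rng = np.random.default_rng(None if seed == -1 else seed)
    n_rows = X.shape[0]
    if n_samples == -1 or n_samples >= n_rows:
        # Use the whole dataset (assumption when n_samples exceeds the row count).
        row_idx = np.arange(n_rows)
    else:
        # Sample rows without replacement (assumption of this sketch).
        row_idx = rng.choice(n_rows, size=n_samples, replace=False)
    if features is None:
        # None means every feature is included.
        selected = list(feature_names)
    else:
        unknown = [f for f in features if f not in feature_names]
        if unknown:
            raise ValueError(f"Unknown features: {unknown}")
        selected = list(features)
    return X[row_idx], selected


X = np.arange(40, dtype=float).reshape(20, 2)
names = ["x1", "x2"]
sample, selected = prepare_inputs(X, names, n_samples=5, features=["x2"], seed=7)
sample_again, _ = prepare_inputs(X, names, n_samples=5, features=["x2"], seed=7)
full, all_features = prepare_inputs(X, names, n_samples=-1)
```

With a fixed seed the same rows are drawn on every call, which is what makes repeated importance computations comparable.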
When n_repeats == 1, the result is similar to the one from h2o.varimp(), i.e., it contains the following columns: “Relative Importance”, “Scaled Importance”, and “Percentage”.
When n_repeats > 1, the individual columns correspond to the permutation variable importance values from the individual runs. Each value corresponds to the “Relative Importance”, that is, the distance between the original prediction error and the prediction error obtained on a frame with the given feature permuted.
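The two output shapes can be illustrated with plain Python. This is a sketch under stated assumptions: the helper names are hypothetical, and the formulas for “Scaled Importance” (relative importance divided by the largest relative importance) and “Percentage” (relative importance divided by the sum of all relative importances) follow the usual variable importance convention rather than anything stated in the text above.

```python
def summarize_single_run(relative):
    """Build rows with the three columns from {feature: relative importance}."""
    max_imp = max(relative.values())
    total = sum(relative.values())
    rows = []
    # Most important feature first.
    for name, rel in sorted(relative.items(), key=lambda kv: kv[1], reverse=True):
        rows.append({
            "Variable": name,
            "Relative Importance": rel,
            # Assumed convention: scale by the largest importance.
            "Scaled Importance": rel / max_imp if max_imp else 0.0,
            # Assumed convention: share of the total importance.
            "Percentage": rel / total if total else 0.0,
        })
    return rows


def summarize_repeated_runs(runs):
    """Collect one relative importance value per run for each feature."""
    return {name: [run[name] for run in runs] for name in runs[0]}


single = summarize_single_run({"a": 4.0, "b": 1.0})
repeated = summarize_repeated_runs([
    {"a": 4.0, "b": 1.0},
    {"a": 3.5, "b": 1.5},
])
```

Keeping the per-run values separate, instead of averaging them immediately, lets the reader judge how much the importance of a feature varies between shuffles.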