Ensemble Learning in Driverless AI

This section describes Driverless AI’s ensemble learning capabilities.

Ensemble Method

The ensemble method is a linear model with non-negative weights. Weights are assigned at the model level, not the fold level. For example, if two models are ensembled (a LightGBM model and an XGBoost model), the linear model finds one weight that applies to all LightGBM CV models and one weight that applies to all XGBoost CV models. When Driverless AI ensembles a single model (level 1), the result is the average of the CV model predictions, because all CV models of that model share the same model-level weight.
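
The model-level weighting described above can be illustrated with a minimal Python sketch. This is not Driverless AI code: the function name blend, the input layout, and the normalization of the weights so that they sum to one are assumptions made for illustration only. How Driverless AI fits the weights is not shown here.

```python
from statistics import mean


def blend(cv_preds_by_model, model_weights):
    """Blend CV predictions using one non-negative weight per model.

    cv_preds_by_model: {model_name: [fold_predictions, ...]}, where each
        fold_predictions is a list holding one prediction per row.
    model_weights: {model_name: weight}, every weight >= 0.

    Hypothetical helper for illustration; not part of Driverless AI.
    """
    if any(w < 0 for w in model_weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(model_weights.values())
    if total == 0:
        raise ValueError("at least one weight must be positive")

    n_rows = len(next(iter(cv_preds_by_model.values()))[0])
    blended = [0.0] * n_rows
    for name, fold_preds in cv_preds_by_model.items():
        # Assumption: weights are normalized to sum to one.
        w = model_weights[name] / total
        for i in range(n_rows):
            # All folds of a model share the model's weight, so the
            # model's contribution is w times the mean over its folds.
            blended[i] += w * mean(fold[i] for fold in fold_preds)
    return blended


# Level 1: a single model, so the blend is the plain average of its folds.
single = blend({"lightgbm": [[0.25, 0.75], [0.75, 0.25]]}, {"lightgbm": 1.0})

# Level 2: two models, each with its own model-level weight.
two = blend(
    {"lightgbm": [[0.2], [0.4]], "xgboost": [[0.6], [0.8]]},
    {"lightgbm": 0.25, "xgboost": 0.75},
)
```

With a single model the weight normalizes to one, so each blended value is just the mean of that model's fold predictions, which matches the level 1 behavior described above.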

Ensemble Levels

Driverless AI has multiple ensemble levels that are tied to the accuracy knob. As accuracy increases, the ensemble level increases. The following is a description of each ensemble level:

  • level 0: No ensemble, only a single final model. Cross validation is used only to estimate the model's validation performance. The final model is trained on the whole dataset.

  • level 1: Cross validation is performed for 1 model and the CV model predictions are ensembled.

  • level 2: Cross validation is performed for 2 models and the CV model predictions are ensembled. For example, Driverless AI may choose to ensemble an XGBoost model and a LightGBM model. The ensembling is done by blending the predictions from the cross validation XGBoost models and cross validation LightGBM models. If Driverless AI has decided on 5-fold cross validation, then 10 models will be ensembled (5 CV models from the XGBoost model and 5 CV models from the LightGBM model).

  • level 3: Same as level 2 but with 3 models.

  • level 4: Same as level 2 but with 4 models.
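
The arithmetic behind the levels above (for example, 2 models with 5-fold cross validation giving 10 ensembled CV models) can be written as a short Python sketch. The function name final_model_count is hypothetical, and the 0 to 4 range check simply mirrors the levels listed in this section.

```python
def final_model_count(ensemble_level, n_folds):
    """Return how many fitted models make up the final pipeline.

    Hypothetical helper for illustration; not part of Driverless AI.
    Level 0 yields one model trained on the whole dataset. Levels 1-4
    yield one CV model per fold for each of `ensemble_level` models.
    """
    if not 0 <= ensemble_level <= 4:
        raise ValueError("this sketch covers ensemble levels 0 through 4")
    if n_folds < 1:
        raise ValueError("n_folds must be at least 1")
    if ensemble_level == 0:
        return 1  # no ensemble: a single final model
    return ensemble_level * n_folds


# Level 2 with 5-fold cross validation: 5 XGBoost + 5 LightGBM CV models.
level_2_models = final_model_count(2, 5)
```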

Notes:

  • A description of the ensemble for your final model is available in the experiment log under Ensemble Base Model Fold Scores.

  • You can set the ensemble level manually in the Expert Settings panel with the Ensemble Level for Final Modeling Pipeline setting.