Experiment Summary

An experiment summary is available for each completed experiment. Click the Download Experiment Summary button to download the h2oai_experiment_summary_<experiment>.zip file.

The files within the experiment summary zip provide textual explanations of the graphical representations that are shown on the Driverless AI UI. Details of each artifact are described below.
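
Because the artifacts are plain text, CSV, and json files, they can be inspected with standard tooling once the zip is extracted. A minimal Python sketch (the zip filename below is a placeholder; substitute your experiment's actual name):

    # Extract the experiment summary zip and list the artifacts it contains.
    import zipfile

    summary_path = "h2oai_experiment_summary_example.zip"  # placeholder name

    with zipfile.ZipFile(summary_path) as archive:
        archive.extractall("experiment_summary")  # unpack into a local folder
        for name in archive.namelist():           # print every artifact name
            print(name)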

Experiment Autoreport

A report file (AutoDoc) is included in the experiment summary. This report provides insight into the training data and any detected shifts in distribution, the validation schema selected, model parameter tuning, feature evolution and the final set of features chosen during the experiment.

  • report.docx: The report, available in Word format.

Click here to download and view a sample experiment report in Word format.

Autoreport Support

Autoreport supports resumed experiments only for certain Driverless AI versions. See the following table to check which types of resumed experiments are supported for your version:

Autoreport Support for Resumed Experiments

Via                              LTS   1.7.0 and older   1.7.1   1.8.0
New model with same parameters   yes   yes               yes     yes
Restart from last checkpoint     no    no                yes     yes
Retrain final pipeline           no    no                no      yes

Notes:

  • Autoreport does not support experiments that were built from previously aborted or failed experiments.
  • Reports for unsupported resumed experiments will still build, but they will only include the following text: “AutoDoc not yet supported for resumed experiments.”

Experiment Overview Artifacts

The Experiment Summary contains artifacts that provide overviews of the experiment.

  • preview.txt: Provides a preview of the experiment. (This is the same information that was included on the UI before starting the experiment.)
  • summary.txt: Provides the same summary that appears in the lower-right portion of the UI for the experiment.
  • config.json: Provides a list of the settings used in the experiment.
  • args_do_auto_dl.json: The internal arguments used in the Driverless AI experiment, based on the dataset and on the accuracy, time, and interpretability settings.
  • experiment_column_types.json: Provides the column types for each column included in the experiment.
  • experiment_original_column.json: A list of all columns available in the dataset that was used in the experiment.
  • experiment_pipeline_original_required_columns.json: For columns used in the experiment, this includes the column name and type.
  • experiment_sampling_description.json: A description of the sampling performed on the dataset.
  • timing.json: The timing and number of models generated in each part of the Driverless AI pipeline.
  • train_data_summary.csv: A summary of the training dataset used in the experiment.
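
Most of these artifacts are json files and are easy to inspect programmatically. A minimal sketch, assuming the zip was unpacked into ./experiment_summary as above (the exact json layout can vary by Driverless AI version):

    # Inspect two of the overview artifacts.
    import json

    with open("experiment_summary/config.json") as f:
        config = json.load(f)        # settings used in the experiment

    with open("experiment_summary/experiment_column_types.json") as f:
        column_types = json.load(f)  # assumed: column name -> column type

    print(len(config), "experiment settings")
    for column, dtype in column_types.items():
        print(column, "->", dtype)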

Tuning Artifacts

During a Driverless AI experiment, model tuning is performed to determine the optimal algorithm and parameter settings for the provided dataset. For regression problems, target tuning is also performed to determine the best way to represent the target column (for example, whether taking the log of the target column improves results). The results from these tuning steps are available in the Experiment Summary.

  • tuning_leaderboard: A table of the models evaluated during tuning, along with each model's score and training time. (Available in txt or json.)
  • target_transform_tuning_leaderboard.txt: A table of the transforms applied to the target column, along with the score and training time of the model built with each transform. (This will be empty for binary and multiclass use cases.)
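
The json variant of the leaderboard can be loaded directly for review outside the UI. A minimal sketch, assuming the unpacked layout used above (the field names inside the file vary by Driverless AI version, so this simply pretty-prints whatever is present):

    # Load the tuning leaderboard artifact and inspect its contents.
    import json
    from pprint import pprint

    with open("experiment_summary/tuning_leaderboard.json") as f:
        leaderboard = json.load(f)

    # Expected (but not guaranteed) shape: one record per tuning iteration,
    # with the model type, parameters, score, and training time.
    pprint(leaderboard)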

Features Artifacts

Driverless AI performs feature engineering on the dataset to determine the optimal representation of the data. The top features used in the final model can be seen in the GUI. The complete list of features used in the final model is available in the Experiment Summary artifacts.

The Experiment Summary also provides a list of the original features and their estimated feature importance: given the engineered features in the final Driverless AI model, we can estimate the importance of the original features they were derived from. For example, suppose the final model uses the following features:

Feature                                  Feature Importance
NumToCatWoE:PAY_AMT2                     1
PAY_3                                    0.92
ClusterDist9:BILL_AMT1:LIMIT_BAL:PAY_3   0.90

To calculate the estimated feature importance of PAY_3, we aggregate the importance of every feature that uses PAY_3, weighting each by the fraction of its input columns that PAY_3 represents:

  • NumToCatWoE:PAY_AMT2: 1 * 0 (PAY_3 not used.)
  • PAY_3: 0.92 * 1 (PAY_3 is the only variable used.)
  • ClusterDist9:BILL_AMT1:LIMIT_BAL:PAY_3: 0.90 * 1/3 (PAY_3 is one of three variables used.)

Estimated Feature Importance = (1*0) + (0.92*1) + (0.9*(1/3)) = 1.22

Note: The feature importance is converted to relative feature importance. (The feature with the highest estimated feature importance will have a relative feature importance of 1).
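
The same aggregation can be written as a short Python sketch. (The feature-name parsing below, splitting on ":" and treating everything after the transformer name as input columns, is a simplification for illustration; real transformed feature names can be more elaborate.)

    # Worked example of the aggregation above. Each engineered feature's
    # importance is split evenly across its input columns, and the totals
    # are then rescaled so the top original feature has relative importance 1.
    final_features = {
        "NumToCatWoE:PAY_AMT2": 1.0,
        "PAY_3": 0.92,
        "ClusterDist9:BILL_AMT1:LIMIT_BAL:PAY_3": 0.90,
    }

    estimated = {}
    for feature, importance in final_features.items():
        parts = feature.split(":")
        # An untransformed feature is its own input; engineered features
        # list their input columns after the transformer name.
        columns = parts[1:] if len(parts) > 1 else parts
        for column in columns:
            estimated[column] = estimated.get(column, 0.0) + importance / len(columns)

    top = max(estimated.values())
    for column, value in sorted(estimated.items(), key=lambda kv: -kv[1]):
        print(f"{column}: {value / top:.2f}")  # relative feature importance

Running this reproduces the estimate above: PAY_3 accumulates 0.92 + 0.90/3 = 1.22 before rescaling, which becomes a relative feature importance of 1.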

  • ensemble_features: A complete list of all features used in the final model, a description of each feature, and its relative feature importance. (Available in txt, table, or json.)
  • features_orig: A list of the original features provided and an estimate of each original feature's relative importance in the final model. (Available in txt or json.)

Final Model Artifacts

The Experiment Summary includes artifacts that describe the final model. This is the model that is used to score new datasets and create the MOJO scoring pipeline. The final model may be an ensemble of models depending on the Accuracy setting.

  • ensemble.txt: A summary of the final model, including a description of the model(s), the gains/lift table, the confusion matrix, and the final model's score for each available scorer.
  • ensemble_description.txt: A sentence describing the final model. (For example: “Final TensorFlowModel pipeline with ensemble_level=0 transforming 21 original features -> 54 features in each of 1 models each fit on full training data (i.e. no hold-out).”)
  • ensemble_model_description.json: A json file describing the model(s) and, for ensembles, how the model predictions are weighted.
  • ensemble_model_params.json: A json file describing the parameters of the model(s).
  • ensemble_folds_data.json: A json file describing the folds used for the final model(s). This includes the size of each fold of data and the performance of the final model on each fold. (Available if a fold column was specified.)
  • ensemble_features_orig: A list of the original features provided and an estimate of each original feature's relative importance in the final ensemble of models. (Available in txt or json.)
  • ensemble_features: A complete list of all features used in the final ensemble of models, a description of each feature, and its relative feature importance. (Available in txt, table, or json.)

The Experiment Summary also includes artifacts about the final model performance.

  • ensemble_scores.json: The scores of the final model for each available scorer.
  • ensemble_confusion_matrix: The confusion matrix for the internal validation data, and for the test data if test data is provided.
  • ensemble_confusion_matrix_stats_test.json: Confusion matrix statistics on the test data. (Only available if test data is provided.)
  • ensemble_gains: The lift and gains table for the internal validation data, and for the test data if test data is provided. (A visualization of lift and gains can be seen in the UI.)
  • ensemble_roc: The ROC and Precision-Recall table for the internal validation data, and for the test data if test data is provided. (Visualizations of the ROC and Precision-Recall curves can be seen in the UI.)
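
As with the other json artifacts, these files can be read programmatically. A minimal sketch for ensemble_scores.json, assuming the unpacked layout used above (the exact layout of the file may differ between Driverless AI versions):

    # Print the final model's score for each scorer.
    import json

    with open("experiment_summary/ensemble_scores.json") as f:
        scores = json.load(f)

    # Assumed shape: a mapping of scorer name -> score value.
    for scorer, value in scores.items():
        print(f"{scorer}: {value}")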