Using AutoDoc

The following sections describe Driverless AI’s AutoDoc feature.

Understanding AutoDoc

The AutoDoc feature is used to generate automated machine learning documentation for individual Driverless AI experiments. This editable document contains an overview of the experiment and includes other significant details like feature engineering and final model performance.

To download and view a sample experiment report in Word format, click here.

AutoDoc Support

AutoDoc only supports resumed experiments for certain Driverless AI versions. See the following table to check the types of resumed experiments that are supported for your version:

AutoDoc Support for Resumed Experiments Via    1.7.0 and older    1.7.1    1.9.0 and later
New experiment with same settings              yes                yes      yes
Restart from last checkpoint                   no                 yes      yes
Retrain final pipeline                         no                 no       yes
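The support table can also be expressed as a small lookup, which may be convenient when scripting report generation across several installations. This is only an illustrative sketch and not part of Driverless AI: the function and key names are hypothetical, and versions that the table does not list (for example, releases between 1.7.1 and 1.9.0) return None rather than a guess.

```python
# Illustrative encoding of the support table; names here are hypothetical.
SUPPORT = {
    "new_experiment_same_settings": {"old": True, "1.7.1": True, "new": True},
    "restart_from_last_checkpoint": {"old": False, "1.7.1": True, "new": True},
    "retrain_final_pipeline": {"old": False, "1.7.1": False, "new": True},
}


def version_bucket(version):
    """Map a version string such as '1.10.3.1' to a column of the table."""
    parts = tuple(int(p) for p in version.split("."))
    if parts <= (1, 7, 0):
        return "old"      # "1.7.0 and older"
    if parts == (1, 7, 1):
        return "1.7.1"
    if parts >= (1, 9, 0):
        return "new"      # "1.9.0 and later"
    return None           # version not covered by the table


def autodoc_supports(version, resume_mode):
    """Return True/False from the table, or None if the version is not listed."""
    bucket = version_bucket(version)
    if bucket is None:
        return None
    return SUPPORT[resume_mode][bucket]
```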

Notes:

  • To ensure that AutoDoc pipeline visualizations are generated correctly on native installations, installing fontconfig is recommended.

  • AutoDoc does not support experiments that were built off of previously aborted or failed experiments.

  • Reports for unsupported resumed experiments will still build, but they will only include the following text: “AutoDoc not yet supported for resumed experiments.”

Custom AutoDocs

All Driverless AI experiments can generate either a standard or custom AutoDoc. A standard AutoDoc uses the default AutoDoc template that is included with Driverless AI, while a custom AutoDoc uses a customer-specific template that Driverless AI automatically populates.

If you are interested in creating a custom AutoDoc, contact support@h2o.ai. If you have already purchased a custom AutoDoc template and want to learn how to generate custom AutoDocs from your experiments, see Generating a Custom AutoDoc.


BYOR Recipes with AutoDoc

The experiment AutoDoc supports experiments that use custom scorers, transformers, or models. Custom scorers and transformers are documented in the same manner as Driverless AI scorers and transformers. If Driverless AI used a custom transformer, it is included in the Feature Transformations table under its display name; otherwise, it is only included in the Feature Evolution section. (Note: Custom transformer descriptions are currently shown as “None” in this section.) For custom models, the standard performance metrics and plots are included; however, information that Driverless AI cannot access is not included, or is shown as “custom”, “unavailable”, or “auto.” For example, in the Model Tuning table, the booster is listed as “custom”, and in the Alternative Models section, the model package documentation is listed as “unavailable.”

Final model performance calculation in AutoDoc

The Performance of Final Model section of the experiment AutoDoc contains a Performance Table that includes the following columns:

Final ensemble standard deviation on validation column: This column includes the standard deviation of the scores across all validation folds. The standard deviation measures the model’s performance variability across folds to indicate prediction consistency.

Final ensemble scores on validation column: This column represents the average score value derived from all validation folds used in the cross-validation process. Note that the number of folds can vary based on the fixed_num_folds and fixed_fold_reps configuration settings.

Note that the standard deviation is calculated whenever an average score on a sampled dataset is calculated. With an internal holdout, the standard deviation is taken across the folds. With an external holdout (that is, if you provide an explicit validation file), it is based on that file.
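As a rough illustration of the two Performance Table columns, the following sketch computes the mean and the standard deviation of a handful of made-up fold scores. The scores are hypothetical, and this document does not state whether Driverless AI uses the population or the sample form of the standard deviation, so both are shown.

```python
import statistics


def summarize_fold_scores(fold_scores):
    """Return the mean and both standard deviation variants of per-fold scores."""
    return {
        "mean": statistics.fmean(fold_scores),
        "std_population": statistics.pstdev(fold_scores),  # divides by n
        "std_sample": statistics.stdev(fold_scores),       # divides by n - 1
    }


# Hypothetical validation scores for five folds; not from a real experiment.
summary = summarize_fold_scores([0.80, 0.82, 0.78, 0.84, 0.76])
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

A small standard deviation relative to the mean suggests that the model scores similarly on every fold.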

Generating an AutoDoc

Three different approaches can be used to generate an AutoDoc:

  • Experiment UI

  • MLI UI

  • Python Client

Notes:

  • For more information on how to configure plots/tables and enable/disable specific sections in the AutoDoc, see Configuring AutoDoc.

  • These approaches also apply to custom AutoDocs. For more information, see Generating a Custom AutoDoc.

Experiment UI

Navigate to the Experiments page and click on the completed experiment you want to generate an AutoDoc for.

If AutoDoc was not previously enabled for the experiment, click the Build AutoDoc button.

[Figure: Build AutoDoc button]

If AutoDoc was previously enabled for the experiment, click the Download AutoDoc button.

[Figure: Download AutoDoc button]

MLI UI

Navigate to the MLI page and click on the completed experiment you want to generate an AutoDoc for.

Select AutoDoc from the MLI RECIPES menu and optionally select explainers to include in the AutoDoc (the standard AutoDoc supports the k-LIME Explainer and DT Surrogate Explainer).

[Figure: MLI select recipe]

The Standard AutoDoc with Explainers:

[Figure: MLI select recipe]

Python Client

AutoDoc Functions

  • create_and_download_autodoc()

  • make_autodoc_sync()

For local downloads:

create_and_download_autodoc(
    model_key:str,
    template_path:str='',
    config_overrides:str='',
    dest_path:str='.',
    mli_key:str='',
    individual_rows:list=[],
    external_dataset_keys:list=[])

To save an AutoDoc to the DAI experiment directory (recommended if local downloads are disabled):

make_autodoc_sync(
    model_key:str,
    template_path:str='',
    config_overrides:str='',
    mli_key:str='',
    individual_rows:list=[],
    external_dataset_keys:list=[])

  • model_key: The experiment key string.

  • template_path: The full path to the custom AutoDoc template.

  • config_overrides: A TOML-formatted string of configuration overrides for the AutoDoc.

  • dest_path: The local path where the AutoDoc should be saved.

  • mli_key: The MLI key string.

  • individual_rows: List of row indices for rows of interest in the training dataset, for which additional information can be shown (ICE, LOCO, KLIME).

  • external_dataset_keys: List of DAI dataset keys.
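The choice between the two functions can be wrapped in a small helper. The sketch below is hypothetical and not part of the Python client: it assumes a backend object that exposes both functions with the keyword arguments listed above, and it only forwards arguments.

```python
# Keyword arguments shared by both functions, per the signatures above.
SHARED_OPTIONS = {
    "template_path",
    "config_overrides",
    "mli_key",
    "individual_rows",
    "external_dataset_keys",
}


def generate_autodoc(backend, model_key, local_download=True, dest_path=".", **options):
    """Dispatch to create_and_download_autodoc() or make_autodoc_sync().

    `backend` is assumed to expose both functions; only options named in
    SHARED_OPTIONS are accepted so that typos fail early.
    """
    unknown = set(options) - SHARED_OPTIONS
    if unknown:
        raise TypeError(f"Unexpected AutoDoc option(s): {sorted(unknown)}")
    if local_download:
        # dest_path applies only to the local-download variant.
        return backend.create_and_download_autodoc(
            model_key=model_key, dest_path=dest_path, **options
        )
    return backend.make_autodoc_sync(model_key=model_key, **options)
```

Passing local_download=False selects make_autodoc_sync(), which is the variant recommended above when local downloads are disabled.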

driverlessai

Connect to a running DAI instance:

import driverlessai
address = 'http://ip_where_driverless_is_running:12345'
username = 'username'
password = 'password'
dai = driverlessai.Client(address=address, username=username, password=password)

Generate an AutoDoc and download it to your current working directory:

# exp_key is the key (string) of a completed experiment
report = dai._backend.create_and_download_autodoc(
    model_key=exp_key,
    dest_path='.',
)

Configuring AutoDoc

The plots, tables, and sections of an AutoDoc can be configured through four different workflows:

You can also configure the font for the AutoDoc plots by setting the H2O_AUTODOC_PLOTS_FONT_FAMILY environment variable.

Experiment Setup > Expert Settings window

The following steps describe how to access AutoDoc-related settings from the Experiment Setup page.

  1. On the experiment setup page, click Expert Settings. The Expert Settings window is displayed.

[Figure: Open the Expert Settings window]

  2. In the Expert Settings window, click Experiment Documentation.

  3. The General sub-tab contains the most commonly used AutoDoc settings. For advanced settings, see the Data, Models, Model Performance, and Interpretation sub-tabs.

[Figure: Experiment documentation expert settings]

Interpretation Settings > MLI Recipes (Explainers)

The following steps describe how to access AutoDoc-related settings from the Interpretation Settings page.

  1. On the Interpretation Settings page, click Recipes. The list of available MLI recipes (explainers) is displayed.

[Figure: Interpretation settings > select recipes]

  2. In the list of MLI recipes, enable the AutoDoc recipe by clicking AutoDoc.

  3. In the list of MLI recipes, click the gear icon next to AutoDoc. The available AutoDoc-related settings are displayed.

[Figure: View AutoDoc explainer settings]

[Figure: AutoDoc explainer settings]

Python Client

All configuration options for the AutoDoc are listed in the config.toml file. The following are several commonly used configuration parameters:

import toml

# Set the document to limit features displayed to the top ten
config_dict={
   "autodoc_num_features": 10
}

# Partial Dependence Plots (PDP) and ICE Plots
config_dict["autodoc_pd_max_runtime"] = 60
config_dict["autodoc_num_rows"] = 4

# Prediction statistics
config_dict["autodoc_prediction_stats"] = True
config_dict["autodoc_prediction_stats_n_quantiles"] = 10

# Population Stability Index (PSI)
config_dict["autodoc_population_stability_index"] = True
config_dict["autodoc_population_stability_index_n_quantiles"] = 10

# Permutation feature importance
config_dict["autodoc_include_permutation_feature_importance"] = True
config_dict["autodoc_feature_importance_scorer"] = "GINI"
config_dict["autodoc_feature_importance_num_perm"] = 1

# Response rates (only applicable to Binary classification)
config_dict["autodoc_response_rate"] = True
config_dict["autodoc_response_rate_n_quantiles"] = 10

toml_string = toml.dumps(config_dict)
print(toml_string)

After setting these parameters, generate an AutoDoc and download it to your current working directory:

driverlessai

report = dai._backend.create_and_download_autodoc(
    model_key=exp_key,
    config_overrides=toml_string,  # TOML string built above with toml.dumps()
    dest_path='.',
)
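If the third-party toml package is not installed, a flat dictionary of overrides like the one above can be serialized by hand. The helper below is a minimal sketch under stated assumptions: the function name is hypothetical, and it handles only booleans, numbers, and strings with plain alphanumeric or underscore keys. The detail worth noting is that Python's True and False must be written as lowercase true and false in TOML.

```python
def to_toml_overrides(settings):
    """Serialize a flat dict of AutoDoc settings to a TOML string."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            # Check bool before int: bool is a subclass of int in Python.
            rendered = "true" if value else "false"
        elif isinstance(value, (int, float)):
            rendered = repr(value)
        elif isinstance(value, str):
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{escaped}"'
        else:
            raise TypeError(f"Unsupported value type for {key!r}: {type(value).__name__}")
        lines.append(f"{key} = {rendered}")
    return "\n".join(lines) + "\n"


overrides = to_toml_overrides({
    "autodoc_num_features": 10,
    "autodoc_prediction_stats": True,
    "autodoc_feature_importance_scorer": "GINI",
})
print(overrides)
```

The resulting string can be passed as config_overrides in the same way as the output of toml.dumps().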

Configuring AutoDoc font

The following sections describe how to configure the AutoDoc font environment variable.

Note: The following steps assume that DAI has been installed on an EC2 instance or an Ubuntu lab machine. These steps still apply if you are using H2O Enterprise Puddle to run a DAI instance—just log in to the EC2 instance where the DAI service is running using the provided SSH key.

If the DAI service has not been started

  1. Create an EC2 instance with enough memory and storage to run DAI.

  2. Install the font you want to use. In this example, the font TakaoPGothic is used.

sudo apt install fonts-takao-pgothic

  3. Download and install the DAI Debian package.

wget https://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/dai/rel-1.10.3.1-10/x86_64/dai_1.10.3.1-1_amd64.deb
sudo dpkg -i dai_1.10.3.1-1_amd64.deb

  4. Set the font environment variable by adding one of the following lines to the EnvironmentFile.conf file.

# Either set a font that is already installed system-wide
H2O_AUTODOC_PLOTS_FONT_FAMILY=TakaoPGothic

# or provide a download link for the preferred font
H2O_AUTODOC_PLOTS_FONT_FAMILY="https://h2o-data.s3.amazonaws.com/h2o-autodoc-data/fonts/TakaoPGothic.ttf"

  5. Start the DAI service.

sudo systemctl start dai

If the DAI service has already been started

  1. Ensure that the font is available on your system. In this example, the font TakaoPGothic is used.

fc-list | grep TakaoPGothic

  2. Stop the DAI service.

sudo systemctl stop dai

  3. Set the font environment variable by adding one of the following lines to the EnvironmentFile.conf file.

# Either set a font that is already installed system-wide
H2O_AUTODOC_PLOTS_FONT_FAMILY=TakaoPGothic

# or provide a download link for the preferred font
H2O_AUTODOC_PLOTS_FONT_FAMILY="https://h2o-data.s3.amazonaws.com/h2o-autodoc-data/fonts/TakaoPGothic.ttf"

  4. Start the DAI service.

sudo systemctl start dai
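When the EnvironmentFile.conf edit needs to be scripted, the font variable should be replaced if it is already present rather than appended a second time. The following sketch shows one way to do that on the file's text. The function name is hypothetical, and the location of EnvironmentFile.conf is deliberately left to the caller because it depends on the installation.

```python
def set_env_var(env_text, name, value):
    """Return env_text with NAME=value set, replacing any existing assignment."""
    new_line = f"{name}={value}"
    lines = env_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # leave comments untouched
        if stripped.split("=", 1)[0].strip() == name:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Example on an in-memory string; reading and writing the real file
# (with the DAI service stopped) is left to the caller.
updated = set_env_var("SOME_OTHER_SETTING=1\n", "H2O_AUTODOC_PLOTS_FONT_FAMILY", "TakaoPGothic")
print(updated)
```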

Generating a Custom AutoDoc

This section describes how to generate an AutoDoc from a custom AutoDoc template. Choose from the following options:

  • Have Driverless AI use a custom AutoDoc for all experiments

  • Have Driverless AI generate a custom AutoDoc for an individual experiment


Custom AutoDoc for All Experiments

To use a custom AutoDoc template, edit the config.toml file to point to the location of your custom AutoDoc. Use the following config.toml settings:

  • autodoc_template: Specify the path for the main template file.

  • autodoc_additional_template_folder: If you have additional custom sub-templates, use this setting to specify the location of additional AutoDoc templates. Note that if this field is left empty, only the default sub-templates folder is used.

To generate custom AutoDocs, Driverless AI must have access to the custom template(s). To make sure that Driverless AI has access, update the path in the following example with your own path:

autodoc_template="/full/path/to/your/custom_autodoc_template.docx"

# Required if you have additional custom sub-templates.
autodoc_additional_template_folder="/path/to/additional_templates_folder"
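Before restarting with these settings, it can help to confirm that the configured paths actually exist on the machine where Driverless AI runs. The check below is a hypothetical helper, not a Driverless AI API. It treats a relative template path as a problem because the example above uses a full path, which is an assumption rather than a documented requirement.

```python
from pathlib import Path


def check_template_settings(autodoc_template, autodoc_additional_template_folder=""):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    template = Path(autodoc_template)
    if not template.is_absolute():
        problems.append(f"autodoc_template is not a full path: {autodoc_template}")
    if not template.is_file():
        problems.append(f"autodoc_template does not point to a file: {autodoc_template}")
    # The additional folder is optional; only check it when it is set.
    if autodoc_additional_template_folder:
        if not Path(autodoc_additional_template_folder).is_dir():
            problems.append(
                "autodoc_additional_template_folder is not a directory: "
                f"{autodoc_additional_template_folder}"
            )
    return problems
```

Run the check as the same user that runs the Driverless AI service, since a path that exists for one user may not be readable by another.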

Custom AutoDoc for Individual Experiments

You can use the Python Client to generate a custom AutoDoc for an individual experiment by setting the template_path variable to the path of your custom AutoDoc template:

template_path='/full/path/to/your/custom_autodoc_template.docx'

Python Client: driverlessai

report = dai._backend.create_and_download_autodoc(
    model_key=exp_key,
    template_path=template_path,
    dest_path='.',
)