Expert Settings

This section describes the Expert Settings options that are available when starting an experiment. Note that the default values for these options are derived from the environment variables in the config.toml file. Refer to the Sample Config.toml File section for more information about each of these options.

Note that by default, the feature brain pulls in any better model regardless of the features used, even if the new model disables those features. For full control over the features pulled in via changes in these Expert Settings, set the Feature Brain Level option to 0.

Upload Custom Recipe

Driverless AI supports the use of custom recipes (optional). If you have a custom recipe available on your local system, click this button to upload that recipe. If you do not have a custom recipe, you can select from a number of recipes available in the https://github.com/h2oai/driverlessai-recipes repository. Clone this repository on your local machine and upload the desired recipe. Refer to the Custom Recipes appendix for examples.

Load Custom Recipe from URL

If you have a custom recipe available on an external system, specify the URL for that recipe here. Note that this must point to the raw recipe file (for example https://raw.githubusercontent.com/h2oai/driverlessai-recipes/master/transformers/text_sentiment_transformer.py). Refer to the Custom Recipes appendix for examples.

General Settings

Approximate Max Runtime for Experiment

Specify the time limit in minutes for this experiment to run. This defaults to 0, which disables a time limit.

Pipeline Building Recipe

Specify the Pipeline Building recipe type. Auto (default) specifies that all models and features are automatically determined by experiment settings, config.toml settings, and the feature engineering effort. Compliant is similar to Auto except for the following:

  • Interpretability is forced to 10.
  • Only GLM and RuleFit models are used.
  • Some numerical features are treated as categorical. For instance, an integer column may represent different numerical codes rather than a numerical quantity.
  • No ensembling is used.
  • No feature brain is used.
  • Interaction depth is set to 1.
  • The target transformer is forced to identity for regression.
  • Distribution shift between train, valid, and test is not used to drop features.

Feature Engineering Effort

Specify a value from 0 to 10 for the Driverless AI feature engineering effort. Higher values generally lead to more time (and memory) spent in feature engineering. This value defaults to 5.

  • 0: Keep only numeric features. Only model tuning during evolution.
  • 1: Keep only numeric features and frequency-encoded categoricals. Only model tuning during evolution.
  • 2: Similar to 1, but with no text features. Some feature tuning before evolution.
  • 3: Similar to 5, but with only tuning during evolution. Mixed tuning of features and model parameters.
  • 4: Similar to 5, but slightly more focused on model tuning.
  • 5: Balanced feature-model tuning. (Default)
  • 6-7: Similar to 5 but slightly more focused on feature engineering.
  • 8: Similar to 6-7 but even more focused on feature engineering with high feature generation rate and no feature dropping even if high interpretability.
  • 9-10: Similar to 8 but no model tuning during feature evolution.

Data Distribution Shift Detection

Specify whether Driverless AI should detect data distribution shifts between train/valid/test datasets (if provided). Currently, this information is only presented to the user and not acted upon.

Data Distribution Shift Detection Drop of Features

Specify whether to drop high-shift features. This defaults to Auto. Note that for time series experiments, Auto turns this feature off.

Max Allowed Feature Shift (AUC) Before Dropping Feature

Specify the maximum allowed AUC value for a feature before dropping the feature.

When train and test (or train/valid, or valid/test) differ in terms of data distribution, a model can be built that predicts, for each row, whether the row belongs to train or test. The AUC of that model measures the strength of the shift: if it is above the specified threshold, Driverless AI considers the shift strong enough to drop the shifted features.

This option defaults to 0.6.
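
This shift check can be pictured as a small “adversarial validation” model: label each row by its origin, train a classifier, and measure how well it separates the two datasets. Below is a minimal sketch of the idea using scikit-learn, assuming numeric features; the function name and arguments are illustrative, not Driverless AI internals.

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def shift_auc(train: pd.DataFrame, test: pd.DataFrame, feature: str) -> float:
    # Label rows by origin: 0 = train, 1 = test.
    X = pd.concat([train[[feature]], test[[feature]]]).to_numpy()
    y = np.r_[np.zeros(len(train)), np.ones(len(test))]
    # If this AUC exceeds the threshold (default 0.6), the feature is shifted.
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    return cross_val_score(model, X, y, cv=3, scoring="roc_auc").mean()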

Leakage Detection

Specify whether to check each feature for leakage. Note that this is always disabled if a fold column is specified or if the experiment is a time series experiment.

Leakage Detection Dropping AUC/R2 Threshold

If Leakage Detection is enabled, specify to drop features for which the AUC (classification)/R2 (regression) is above this value. This option defaults to 0.999.
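
As an illustration of the idea (not Driverless AI’s internal implementation), a per-feature leakage check for binary classification might look like the following sketch, assuming scikit-learn and numeric features in a pandas DataFrame:

import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def leaky_features(X: pd.DataFrame, y: pd.Series, threshold: float = 0.999):
    # Fit a simple model on each feature alone; a near-perfect AUC
    # suggests the feature leaks the target.
    flagged = []
    for col in X.columns:
        auc = cross_val_score(DecisionTreeClassifier(max_depth=3),
                              X[[col]], y, cv=3, scoring="roc_auc").mean()
        if auc > threshold:
            flagged.append(col)
    return flagged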

Make Python Scoring Pipeline

Specify whether to automatically build a Python Scoring Pipeline for the experiment. If enabled, then when the experiment is completed, the Python Scoring Pipeline can be immediately downloaded. If disabled, the Python Scoring Pipeline will have to be built separately after the experiment is complete.

Make MOJO Scoring Pipeline

Specify whether to automatically build a MOJO (Java) Scoring Pipeline for the experiment. If enabled, then when the experiment is completed, the MOJO Scoring Pipeline can be immediately downloaded. If disabled, the MOJO Scoring Pipeline will have to be built separately after the experiment is complete.

Min Number of Rows Needed to Run an Experiment

Specify the minimum number of rows that a dataset must contain in order to run an experiment. This value defaults to 100.

Max Number of Rows Times the Number of Columns for Feature Evolution Data Splits

Specify the maximum number of rows times the number of columns allowed for feature evolution data splits (not for the final pipeline). This value defaults to 100,000,000.

Max Number of Original Features Used

Specify the maximum number of features you want to be selected in an experiment. This defaults to 10000.

Max Allowed Fraction of Uniques for Integer and Categorical Columns

Specify the maximum fraction of unique values for integer and categorical columns. If a column's fraction of unique values exceeds this threshold, it is considered an ID column and ignored. This value defaults to 0.95.
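
A minimal sketch of this heuristic with pandas (the function and argument names are illustrative):

import pandas as pd

def is_id_like(series: pd.Series, max_unique_fraction: float = 0.95) -> bool:
    # A column where nearly every value is unique behaves like an ID.
    return series.nunique() / len(series) > max_unique_fraction

df = pd.DataFrame({"customer_id": range(1000), "segment": [1, 2] * 500})
print(is_id_like(df["customer_id"]))  # True  -> treated as an ID column, ignored
print(is_id_like(df["segment"]))      # False -> kept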

Quantile-Based Imbalanced Sampling

Specify whether to enable the quantile-based sampling method for imbalanced binary classification. (Note: This is only applicable if the class ratio is above the imbalanced ratio undersampling threshold.) When enabled, a model trained on the data is used to create deciles of predictions, and each decile is then sampled from uniformly.

The idea behind quantile-based imbalanced sampling is that we do not want to just randomly downsample the majority class; instead, we want a sample that represents the majority class across the full range of model predictions.

Here are the steps used to perform quantile-based imbalanced sampling:

  1. Train an initial model.
  2. Use the model from Step 1 to score each record in the majority class.
  3. Bin the majority class records based on their prediction.
  4. Randomly sample records from each bin.

If our use case was fraud, then quantile-based imbalanced sampling would sample not-fraud records based on the prediction of an initial model. This ensures that we have an even distribution of records that are easy to classify as not-fraud (low prediction bins) and records that are harder to classify as not-fraud (high prediction bins).
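
A minimal sketch of these four steps, assuming scikit-learn, numeric features in a pandas DataFrame, and a binary target where 0 is the majority (“not fraud”) class; all names are illustrative, not Driverless AI internals.

import pandas as pd
from sklearn.linear_model import LogisticRegression

def quantile_downsample(X: pd.DataFrame, y: pd.Series, per_bin: int, seed: int = 0):
    model = LogisticRegression(max_iter=1000).fit(X, y)                  # Step 1: initial model
    majority = X[y == 0].copy()
    majority["score"] = model.predict_proba(majority[X.columns])[:, 1]   # Step 2: score majority rows
    majority["bin"] = pd.qcut(majority["score"], 10,                     # Step 3: decile bins
                              labels=False, duplicates="drop")
    sampled = majority.groupby("bin", group_keys=False).apply(           # Step 4: sample per bin
        lambda g: g.sample(min(per_bin, len(g)), random_state=seed))
    # Return the downsampled majority class plus the full minority class.
    return pd.concat([sampled[X.columns], X[y == 1]])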

Feature Brain Level

H2O.ai Brain enables caching and smart re-use (checkpointing) of prior models to generate features for new models. This option controls how much information is stored about the different models generated and the different features explored while running an experiment. It can help with checkpointing and retrieving experiments that have been paused or interrupted.

When enabled, this will use the H2O.ai brain cache if all of the following hold:

  • the cache file has no extra column names per column type
  • the cache exactly matches classes, class labels, and time series options
  • the interpretability of the cache is equal or lower
  • the main model (booster) is allowed by the new experiment

Available levels are:

  • -1: Don’t use any brain cache.
  • 0: Don’t use any brain cache but still write to cache. Use case: Save the model for later use while building the current model without any brain models.
  • 1: Smart checkpoint if an old experiment_id is passed in (for example, via running “resume one like this” in the GUI). Use case: From the GUI, select a prior experiment using the right-hand panel, and select “RESTART FROM LAST CHECKPOINT” to use a specific experiment’s model to build new models from.
  • 2: Smart checkpoint if the experiment matches all column names, column types, classes, class labels, and time series options identically. (Default) Use case: No need to select a particular prior experiment; Driverless AI scans through the H2O.ai brain cache for the best models to restart from.
  • 3: Smart checkpoint like level 1, but for the entire population. Tune only if the brain population is of insufficient size. Note that this re-scores the entire population in a single iteration, so the first iteration appears to take longer to complete.
  • 4: Smart checkpoint like level 2, but for the entire population. Tune only if the brain population is of insufficient size. Note that this re-scores the entire population in a single iteration, so the first iteration appears to take longer to complete.
  • 5: Smart checkpoint like level 4, but scans over the entire brain cache of populations (starting from the resumed experiment, if chosen) to get the best-scored individuals. This can be slower due to brain cache scanning if the cache is large.

When enabled, the directory where the H2O.ai Brain meta model files are stored is H2O.ai_brain. In addition, the default maximum brain size is 20GB. Both the directory and the maximum size can be changed in the config.toml file.

Feature Brain Save Every Which Iteration

Specify whether to save the feature brain every so many iterations (that is, whenever iter_num % feature_brain_iterations_save_every_iteration == 0), making it possible to restart or refit with which_iteration_brain >= 0. This is disabled (0) by default.
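
For illustration, the save condition with a hypothetical interval of 5 keeps the brain at these iterations:

save_every = 5  # hypothetical feature_brain_iterations_save_every_iteration value
saved = [i for i in range(1, 31) if i % save_every == 0]
print(saved)  # [5, 10, 15, 20, 25, 30]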

Feature Brain Restart from Which Iteration

When performing a restart or re-fit of type feature_brain_level with a resumed ID, specify which iteration to start from instead of only the last best. Specify -1 to use the last best iteration. To restart or refit from a specific iteration:

  1. Run one experiment with feature_brain_iterations_save_every_iteration=1 (or some other number).
  2. Identify which iteration's brain dump you want to restart or refit from.
  3. Restart or refit from the original experiment, setting which_iteration_brain to that number here in the expert settings.

Note: If restarting from a tuning iteration, this will pull in the entire scored tuning population and use that for feature evolution.

Min DAI Iterations

Specify the minimum number of Driverless AI iterations for an experiment. This can be used during restarting, when you want to continue for longer despite a score not improving. This defaults to 0.

Max Number of Engineered Features

Specify the maximum number of features to include in the final model’s feature engineering pipeline. If -1 is specified (default), then Driverless AI will automatically determine the number of features.

Max Feature Interaction Depth

Specify the maximum number of features to be used for interaction features like grouping for target encoding, weight of evidence, and other likelihood estimates.

Exploring feature interactions can be important for predictive performance. Interactions can take multiple forms (e.g., feature1 + feature2, or feature1 * feature2 + ... + featureN). Although certain machine learning algorithms (like tree-based methods) can capture these interactions as part of their training process, generating them explicitly may help those (or other) algorithms yield better performance.

The interaction depth (as in “up to” how many features may be combined at once to create one single feature) controls the complexity of the feature engineering process. Higher values may yield more predictive models at the expense of time. This value defaults to 8.
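
As an illustration of depth-limited interactions (here, simple products of numeric columns; Driverless AI’s own interactions also include groupings for target encoding and similar transformations), a sketch might look like this:

from itertools import combinations
import pandas as pd

def product_interactions(df: pd.DataFrame, depth: int = 2) -> pd.DataFrame:
    # Combine "up to" `depth` features at once into one derived feature.
    out = df.copy()
    for cols in combinations(df.columns, depth):
        out["*".join(cols)] = df[list(cols)].prod(axis=1)
    return out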

Select Target Transformation of the Target for Regression Problems

Specify whether to automatically select target transformation for regression problems. Selecting Identity disables any transformation. This value defaults to Auto.

Number of Cross-Validation Folds

Specify a fixed number of folds (if >= 2) for cross-validation.

Enable Target Encoding

Specify whether to use Target Encoding when building the model. Target encoding is the process of replacing a categorical value with the mean of the target variable. This is enabled by default.
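
A minimal sketch of plain target encoding with pandas (production implementations, including Driverless AI’s, add out-of-fold estimation and other safeguards against leakage):

import pandas as pd

df = pd.DataFrame({"city": ["NY", "SF", "NY", "LA", "SF"],
                   "target": [1, 0, 1, 0, 1]})
# Replace each categorical level with the mean of the target for that level.
encoding = df.groupby("city")["target"].mean()
df["city_te"] = df["city"].map(encoding)
print(df)  # NY -> 1.0, SF -> 0.5, LA -> 0.0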

Drop Constant Columns

Specify whether to drop columns with constant values. This is enabled by default.

Enable Detailed Scored Features Info

Specify whether to dump every scored individual’s variable importance (both derived and original) to a csv/tabulated/json file. If enabled, Driverless AI produces files such as “individual_scored_id%d.iter%d*features*”. This option is disabled by default.

Enable Detailed Scored Model Info

Specify whether to dump every scored individual’s model parameters to a csv/tabulated file. If enabled (default), Driverless AI produces files such as “individual_scored_id%d.iter%d*params*”.

Model Settings

Ensemble Level for Final Modeling Pipeline

Specify one of the following ensemble levels:

  • -1 = auto, based upon ensemble_accuracy_switch, accuracy, size of data, etc. (Default)
  • 0 = No ensemble, only final single model on validated iteration/tree count. Note that predicted probabilities will not be available. (Refer to the following FAQ.)
  • 1 = 1 model, multiple ensemble folds (cross-validation)
  • 2 = 2 models, multiple ensemble folds (cross-validation)
  • 3 = 3 models, multiple ensemble folds (cross-validation)
  • 4 = 4 models, multiple ensemble folds (cross-validation)

Number of Models During Tuning Phase

Specify the number of models to tune during the pre-evolution phase. Specify a lower value to avoid excessive tuning, or a higher value to perform enhanced tuning. This option defaults to -1 (auto).

XGBoost GBM Models

This option allows you to specify whether to build XGBoost models as part of the experiment (for both the feature engineering part and the final model). XGBoost is a type of gradient boosting method that has been widely successful in recent years due to its good regularization techniques and high accuracy.

XGBoost Dart Models

This option specifies whether to use XGBoost’s Dart method when building models for the experiment (for both the feature engineering part and the final model).

GLM Models

This option allows you to specify whether to build GLM models (generalized linear models) as part of the experiment (usually only for the final model unless it’s used exclusively). GLMs are very interpretable models with one coefficient per feature, an intercept term and a link function.

LightGBM Models

This option allows you to specify whether to build LightGBM models as part of the experiment. LightGBM Models are the default models.

LightGBM Random Forest Models

Select auto, on, off, or only from this dropdown to specify whether to include LightGBM Random Forest models as part of the experiment.

TensorFlow Models

This option allows you to specify whether to build TensorFlow models as part of the experiment (usually only for text feature engineering and for the final model unless it’s used exclusively). Enable this option for NLP experiments.

TensorFlow models are not yet supported by MOJOs (only Python scoring pipelines are supported).

RuleFit Models

This option allows you to specify whether to build RuleFit models as part of the experiment. Note that MOJOs are not yet supported (only Python scoring pipelines). Note that multiclass classification is not yet supported for RuleFit models. Rules are stored to text files in the experiment directory for now.

FTRL Models

This option allows you to specify whether to build Follow the Regularized Leader (FTRL) models as part of the experiment. Note that MOJOs are not yet supported (only Python scoring pipelines). FTRL supports binomial and multinomial classification for categorical targets, as well as regression for continuous targets.

Max Number of Trees/Iterations

Specify the upper limit on the number of trees (GBM) or iterations (GLM) for all tree models. This defaults to 3000. Depending on accuracy settings, a fraction of this limit will be used.

Reduction Factor for Number of Trees/Iterations During Feature Evolution

Specify the factor by which max_nestimators is reduced for tuning and feature evolution. This option defaults to 0.2, so by default Driverless AI will produce no more than 0.2 × 3000 = 600 trees/iterations during feature evolution.

Max Learning Rate for Tree Models

Specify the maximum learning rate for tree models during feature engineering. Larger values can speed up feature engineering, but can hurt accuracy. This value defaults to 0.5.

Max Number of Epochs for TensorFlow/FTRL

When building TensorFlow or FTRL models, specify the maximum number of epochs to train models with (training might stop earlier). This value defaults to 10. This option is ignored if TensorFlow models and/or FTRL models are disabled.

Max Number of Rules for RuleFit

Specify the maximum number of rules to be used for RuleFit models. This defaults to -1, which specifies to use all rules.

Time Series Settings

Time Series Lag-Based Recipe

This recipe specifies whether to include Time Series lag features when training a model with a provided (or autodetected) time column. Lag features are the primary automatically generated time series features and represent a variable’s past values. At a given sample with time stamp \(t\), features at some time difference \(T\) (lag) in the past are considered. For example, if the sales today are 300 and the sales yesterday were 250, then the lag of one day for sales is 250. Lags can be created on any feature as well as on the target. Lagging variables are important in time series because knowing what happened in different time periods in the past can greatly facilitate predictions for the future. More information about time series lag is available in the Time Series Use Case: Sales Forecasting section.
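
A minimal sketch of the sales example above with pandas (column names are illustrative):

import pandas as pd

sales = pd.DataFrame({
    "date": pd.date_range("2020-01-01", periods=4, freq="D"),
    "sales": [250, 300, 280, 310],
})
sales["sales_lag1"] = sales["sales"].shift(1)  # yesterday's sales
sales["sales_lag2"] = sales["sales"].shift(2)  # sales two days ago
print(sales)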

Probability to Create Non-Target Lag Features

Lags can be created on any feature as well as on the target. Specify a probability value for creating non-target lag features. This value defaults to 0.1.

Generate Holiday Features

For time-series experiments, specify whether to generate holiday features for the experiment. This option is enabled by default.

Time Series Lags Override

Specify a lag override value such as 7, 14, 21, etc.

Consider Time Groups Columns as Standalone Features

Specify whether to consider time groups columns as standalone features. This is disabled by default.

Always Group by All Time Groups Columns for Creating Lag Features

Specify whether to group by all time groups columns for creating lag features. This is enabled by default.

Generate Time-Series Holdout Predictions

Specify whether to create holdout predictions on training data using moving windows. This can be useful for MLI, but it will slow down the experiment.

NLP Settings

Threshold for String Columns to be Treated as Text

Specify the threshold value (from 0 to 1) for string columns to be treated as text (0.0 - text; 1.0 - string). This value defaults to 0.3.

Max TensorFlow Epochs for NLP

When building TensorFlow NLP features (for text data), specify the maximum number of epochs to train feature engineering models with (training might stop earlier). This value defaults to 2. This option is ignored if TensorFlow models are disabled.

Enable Word-Based CNN TensorFlow Models for NLP

Specify whether to use Word-based CNN TensorFlow models for NLP. This option is ignored if TensorFlow is disabled.

Enable Word-Based BiGRU TensorFlow Models for NLP

Specify whether to use Word-based BiGRU TensorFlow models for NLP. This option is ignored if TensorFlow is disabled.

Enable Character-Based CNN TensorFlow Models for NLP

Specify whether to use Character-level CNN TensorFlow models for NLP. This option is ignored if TensorFlow is disabled.

Path to Pretrained Embeddings for TensorFlow NLP Models

Specify a path to pretrained embeddings for TensorFlow NLP models, for example, /path/on/server/to/file.txt.

System Settings

Number of Cores to Use

Specify the number of cores to use for the experiment. Note that if you specify -1, then all available cores will be used. Lower values can reduce memory usage, but might slow down the experiment.

#GPUs/Experiment

Specify the number of GPUs to use per experiment. A value of -1 specifies to use all available GPUs. This value must be at least as large as the number of GPUs to use per model (or -1).

#GPUs/Model

Specify the number of GPUs to use per model, with -1 meaning all GPUs per model. In all cases, XGBoost tree and linear models use the number of GPUs specified per model, while LightGBM and TensorFlow revert to using 1 GPU per model and run multiple models on multiple GPUs.

Note: FTRL does not use GPUs. RuleFit uses GPUs for the parts that obtain trees via LightGBM.

GPU Starting ID

Specify which gpu_id to start with. If using CUDA_VISIBLE_DEVICES=… to control GPUs (preferred method), gpu_id=0 is the first in that restricted list of devices. For example, if CUDA_VISIBLE_DEVICES='4,5', then gpu_id_start=0 will refer to device #4.
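
For illustration, the same restriction can be applied from Python before any CUDA initialization:

import os

# Only physical devices 4 and 5 are visible to this process; they are
# addressed as gpu_id 0 and 1, so gpu_id_start=0 refers to device #4.
os.environ["CUDA_VISIBLE_DEVICES"] = "4,5"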

From expert mode, to run 2 experiments, each on a distinct GPU out of 2 GPUs:

  • Experiment#1: num_gpus_per_model=1, num_gpus_per_experiment=1, gpu_id_start=0
  • Experiment#2: num_gpus_per_model=1, num_gpus_per_experiment=1, gpu_id_start=1

From expert mode, to run 2 experiments, each using 4 distinct GPUs (1 GPU per model) out of 8 GPUs:

  • Experiment#1: num_gpus_per_model=1, num_gpus_per_experiment=4, gpu_id_start=0
  • Experiment#2: num_gpus_per_model=1, num_gpus_per_experiment=4, gpu_id_start=4

To run 2 experiments, each with 4 GPUs per model (8 GPUs total):

  • Experiment#1: num_gpus_per_model=4, num_gpus_per_experiment=4, gpu_id_start=0
  • Experiment#2: num_gpus_per_model=4, num_gpus_per_experiment=4, gpu_id_start=4

If num_gpus_per_model != 1, global GPU locking is disabled. This is because the underlying algorithms do not support arbitrary GPU IDs, only sequential IDs, so be sure to set this value correctly to avoid overlap across all experiments by all users.

More information is available at https://github.com/NVIDIA/nvidia-docker/wiki/nvidia-docker#gpu-isolation. Note that GPU selection does not wrap, so gpu_id_start + num_gpus_per_model must be less than the number of visible GPUs.

Enable Detailed Traces

Specify whether to enable detailed tracing in Driverless AI trace when running an experiment. This is disabled by default.

Custom Recipes Settings

Include Specific Transformers

Select the transformer(s) that you want to use in the experiment.

Include Specific Models

Specify the type(s) of models that you want Driverless AI to build in the experiment.

Include Specific Scorers

Specify the scorer(s) that you want Driverless AI to include when running the experiment.

Whether to Skip Failures of Transformers

Specify whether to skip failures of transformers. This is enabled by default.

Whether to Skip Failures of Models

Specify whether to skip failures of models. Failures are logged according to the specified level for logging skipped failures. This is enabled by default.

Level to Log for Skipped Failures

Specify one of the following levels for the verbosity of log failure messages for skipped transformers or models:

  • 0 = Log simple message
  • 1 = Log code line plus message (Default)
  • 2 = Log detailed stack traces

Other Settings

Add to config.toml via toml String

Specify any additional configuration overrides from the config.toml file that you want to include in the experiment. (Refer to the Sample Config.toml File section to view options that can be overridden during an experiment.) Setting this will override all other settings. Separate multiple config overrides with \n. For example, the following enables Poisson distribution for LightGBM and disables Target Transformer Tuning. Note that in this example double quotes are escaped (\" \").

params_lightgbm=\"{'objective':'poisson'}\" \n target_transformer=identity

Or you can specify config overrides similar to the following without having to escape double quotes:

enable_glm="off" \n enable_xgboost="off" \n enable_lightgbm="off" \n enable_tensorflow="on"

max_cores=10 \n data_precision="float32" \n max_rows_feature_evolution=50000000000 \n ensemble_accuracy_switch=11 \n feature_engineering_effort=1 \n target_transformer="identity" \n tournament_feature_style_accuracy_switch=5 \n params_tensorflow="{'layers': [100, 100, 100, 100, 100, 100]}"

When running the Python client, config overrides would be set as follows:

model = h2o.start_experiment_sync(
    dataset_key=train.key,
    target_col='target',
    is_classification=True,
    accuracy=7,
    time=5,
    interpretability=1,
    config_overrides="""
                     feature_brain_level=0
                     enable_lightgbm="off"
                     enable_xgboost="off"
                     enable_ftrl="off"
                     """
)

Reproducibility Level

Specify one of the following levels of reproducibility (note that this setting is only active while reproducible mode is enabled):

  • 1 = Same experiment results for same O/S, same CPU(s), and same GPU(s) (Default)
  • 2 = Same experiment results for same O/S, same CPU architecture, and same GPU architecture
  • 3 = Same experiment results for same O/S, same CPU architecture (excludes GPUs)
  • 4 = Same experiment results for same O/S (best approximation)

Random Seed

Specify a random seed for the experiment. When a seed is defined and the reproducible button is enabled (it is disabled by default), the algorithm will behave deterministically.
