Running an Experiment

  1. After Driverless AI is installed and started, open a browser (Chrome recommended) and navigate to <server>:12345.
  2. The first time you log in to Driverless AI, you will be prompted to read and accept the Evaluation Agreement. You must accept the terms before continuing. Review the agreement, then click I agree to these terms to continue.
  3. Log in by entering unique credentials. For example:
Username: h2oai
Password: h2oai

Note that these credentials do not restrict access to Driverless AI; they are used only to tie experiments to users. For example, if you log in with different credentials, you will not see any previously run experiments.

  4. As with accepting the Evaluation Agreement, the first time you log in, you will be prompted to enter your License Key. Click the Enter License button, then paste the License Key into the License Key entry field. Click Save to continue. This license key will be saved in the host machine’s /license folder.
Note: Contact H2O.ai for information on how to purchase a Driverless AI license.
  5. The Home page appears, showing all datasets that have been imported. Note that the first time you log in, this list will be empty. Add datasets using one of the following methods:
  • Drag and drop files from your local machine directly onto this page. Note that this method currently works for files that are less than 100 MB.


  6. Click the Add Dataset button.
[Figure: Add Dataset]
  7. In the Search for files field, type the location of the dataset. Note that Driverless AI autofills the browse line as you type the file location. When you locate the file, select it, then click the Click to Import Selection button at the top of the screen.
[Figure: Search for files]
  8. After importing your data, you can run an experiment by clicking the [Click for Actions] button beside the dataset that you want to use. This opens a submenu with options to Visualize or Predict the dataset. (Note: You can delete an unused dataset by hovering over it and clicking the X button. You cannot delete a dataset that was used in an active experiment; delete the experiment first.) Click Predict to begin an experiment.
[Figure: Datasets action menu]
  9. The Experiment Settings form displays and auto-fills with the selected dataset. Optionally specify whether to drop any columns (for example, an ID column).
  10. Optionally specify a test dataset. Keep in mind that the test dataset must have the same number of columns as the training dataset.
  11. Optionally specify a Fold Column and/or a Weight Column and/or a Time Column. (Refer to the Experiment Settings section that follows for more information about these settings.)
  12. Specify the target (response) column. Note that not all explanatory functionality will be available for multinomial classification scenarios (scenarios with more than two outcomes).
[Figure: Select target column]
  13. When the target column is selected, Driverless AI automatically provides the target column type and the number of rows. If this is a classification problem, then the UI shows unique and frequency statistics for numerical columns. If this is a regression problem, then the UI shows the dataset mean and standard deviation values. At this point, you can configure the following experiment settings. Refer to the Experiment Settings section that follows for more information about these settings.
  • Desired relative accuracy from 1 to 10 (default: 5)

  • Desired relative time from 1 to 10 (default: 5)

  • Desired relative interpretability from 1 to 10 (default: 5)

  • Specify the scorer to use for this experiment. Available scorers include:

    • Regression: R2, MSE, RMSE, RMSLE, MAE, MAPE, GINI
    • Classification: GINI, AUC, MCC, F1, LOGLOSS

    If a scorer is not selected, Driverless AI will select one based on the dataset and experiment.

Additional settings:

  • If this is a classification problem, then click the Classification button.
  • Click the Reproducible button to build the experiment with a fixed random seed so that results can be reproduced.
  • Specify whether to enable GPUs. (Note that this option is ignored on CPU-only systems.)
[Figure: Experiment settings]
  14. Click Launch Experiment. This starts the experiment.

As the experiment runs, a running status displays in the upper middle portion of the UI. Driverless AI first configures the backend and detects whether GPUs are available. It then starts parameter tuning, followed by feature engineering. Finally, Driverless AI builds the scoring pipeline.

In addition to the status, the UI also displays details about the dataset, the iteration score (internal validation) for each cross validation fold along with any specified scorer value, the variable importance values, and CPU/Memory and GPU Usage information. Upon completion, an Experiment Summary section will populate in the lower right section.

You can stop experiments that are currently running. Click the Finish button to stop the experiment early; this jumps the experiment to the end and completes the ensembling and the deployment package. You can also click Abort to terminate the experiment. Note that aborted experiments do not display on the Experiments page.
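The scorers listed above map onto familiar open-source metrics. The sketch below is for intuition only: these scikit-learn functions approximate the listed scorers, while Driverless AI computes its own implementations internally.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score, log_loss, roc_auc_score

# Regression scorers on toy values (illustrative only).
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 5.3, 2.9, 6.8])

rmse = mean_squared_error(y_true, y_pred) ** 0.5  # RMSE
r2 = r2_score(y_true, y_pred)                     # R2

# Classification scorers operate on predicted probabilities.
y_cls = np.array([0, 0, 1, 1])
p_cls = np.array([0.1, 0.4, 0.35, 0.8])
auc = roc_auc_score(y_cls, p_cls)  # AUC
ll = log_loss(y_cls, p_cls)        # LOGLOSS
```

Whichever scorer you pick, lower is better for error metrics (RMSE, LOGLOSS) and higher is better for fit metrics (R2, AUC).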


Experiment Settings

This section describes the settings that are available when running an experiment.

Dropped Columns

Dropped columns are columns that you do not want to be used as predictors in the experiment.

Test Data

Test data is used to create test predictions only. This dataset is not used for model scoring.
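As a quick sanity check before importing a test dataset, you can verify that its columns line up with the training dataset. A sketch with pandas, using small stand-in DataFrames in place of your real files:

```python
import pandas as pd

# Stand-in data; substitute your actual training and test datasets.
train = pd.DataFrame({"age": [25, 40], "income": [50000, 80000], "target": [0, 1]})
test = pd.DataFrame({"age": [33], "income": [61000], "target": [1]})

# The test dataset should carry the same columns as the training dataset.
missing = set(train.columns) - set(test.columns)
extra = set(test.columns) - set(train.columns)
assert not missing and not extra, f"column mismatch: {missing or extra}"
```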

Weight Column

Optional: Column that indicates the observation weight (a.k.a. sample or row weight), if applicable. This column must be numeric with values >= 0. Rows with higher weights have higher importance. The weight affects model training through a weighted loss function, and affects model scoring through weighted metrics. The weight column is not used when making test set predictions (but scoring of the test set predictions can use the weight).
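To make the weighted-metric idea concrete, here is a minimal NumPy sketch of a weighted MSE (illustrative only, not Driverless AI's internal scorer): rows with weight 0 drop out entirely, and higher-weight rows count more.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.8, 3.5, 4.0])
weights = np.array([1.0, 1.0, 2.0, 0.0])  # must be numeric with values >= 0

# Each squared error is scaled by its row weight, then normalized by the
# total weight; the zero-weight row contributes nothing.
weighted_mse = np.sum(weights * (y_true - y_pred) ** 2) / np.sum(weights)
```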

Fold Column

Optional: Column to use to create stratification folds during (cross-)validation, if applicable. Must be of integer or categorical type. Rows with the same value in the fold column represent cohorts, and each cohort is assigned to exactly one fold. This can help to build better models when the data is grouped naturally. If left empty, the data is assumed to be i.i.d. (identically and independently distributed).
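The cohort behavior can be illustrated with scikit-learn's GroupKFold, which enforces the same guarantee: rows sharing a fold-column value never straddle a fold boundary. A minimal sketch:

```python
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.arange(12).reshape(6, 2)
fold_col = np.array(["a", "a", "b", "b", "c", "c"])  # cohort label per row

for train_idx, val_idx in GroupKFold(n_splits=3).split(X, groups=fold_col):
    # No cohort appears on both sides of any split.
    assert not set(fold_col[train_idx]) & set(fold_col[val_idx])
```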

Time Column

Optional: Column that provides a time order, if applicable. Can improve model performance and model validation accuracy for problems where the target values are auto-correlated with respect to the ordering. Each observation’s time stamp is used to order the observations in a causal way (generally, to avoid training on the future to predict the past). The values in this column must be a datetime format understood by pandas.to_datetime(), like “2017-11-29 00:30:35” or “2017/11/29”. If [AUTO] is selected, all string columns are tested for potential date/datetime content and considered as potential time columns. The natural row order of the training data is also considered in case no date/datetime columns are detected. If the data is (nearly) identically and independently distributed (i.i.d.), then no time column is needed. If [OFF] is selected, no time order is used for modeling, and data may be shuffled randomly (any potential temporal causality will be ignored).
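For example, both timestamp formats mentioned above parse cleanly with pandas.to_datetime(), and sorting on the parsed values recovers the causal ordering:

```python
import pandas as pd

# Both example formats from above are understood by pandas.to_datetime().
ts1 = pd.to_datetime("2017-11-29 00:30:35")
ts2 = pd.to_datetime("2017/11/29")  # parsed as midnight, so it sorts first

# Ordering rows by the parsed timestamps gives the causal order used to
# avoid training on the future to predict the past.
df = pd.DataFrame({"when": [ts1, ts2], "y": [1.0, 2.0]}).sort_values("when")
```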


Accuracy

The following table describes how the Accuracy value affects a Driverless AI experiment.

Accuracy | Max Rows | Ensemble Level | Target Transformation | Parameter Tuning Level | Num Individuals | CV Folds | Only First CV Model | Strategy
---------|----------|----------------|-----------------------|------------------------|-----------------|----------|---------------------|---------
1        | 100K     | 0              | False                 | 0                      | Auto            | 3        | True                | None
2        | 500K     | 0              | False                 | 0                      | Auto            | 3        | True                | None
3        | 1M       | 0              | False                 | 0                      | Auto            | 3        | True                | None
4        | 2.5M     | 0              | False                 | 0                      | Auto            | 3        | True                | None
5        | 5M       | 1              | True                  | 1                      | Auto            | 3        | True                | None
6        | 10M      | 1              | True                  | 1                      | Auto            | 3        | True                | FS
7        | 20M      | 2              | True                  | 2                      | Auto            | 3        | True                | FS
8        | 20M      | 2              | True                  | 2                      | Auto            | 4        | False               | FS
9        | 20M      | 3              | True                  | 3                      | Auto            | 4        | False               | FS
10       | None     | 3              | True                  | 3                      | Auto            | 4        | False               | FS

The list below includes more information about the parameters that are used when calculating accuracy.

  • Max Rows: The maximum number of rows to use in model training
    • For classification, stratified random sampling is performed
    • For regression, random sampling is performed
  • Ensemble Level: The level of ensembling done for the final model
    • 0: single model
    • 1: 2 4-fold models ensembled together
    • 2: 5 5-fold models ensembled together
    • 3: 8 5-fold models ensembled together
  • Target Transformation: Try target transformations and choose the transformation that has the best score
    • Possible transformations: identity, unit_box, log, square, square root, inverse, Anscombe, logit, sigmoid
  • Parameter Tuning Level: The level of parameter tuning done
    • 0: no parameter tuning
    • 1: 8 different parameter settings
    • 2: 16 different parameter settings
    • 3: 32 different parameter settings
    • Optimal model parameters are chosen based on a combination of the model’s accuracy, training speed, and complexity.
  • Num Individuals: The number of individuals in the population for the genetic algorithms
    • Each individual is a gene. The more genes, the more combinations of features are tried.
    • The number of individuals is automatically determined and can depend on the number of GPUs. Typical values are between 4 and 16.
  • CV Folds: The number of cross-validation folds done for each model
    • If the problem is a classification problem, then stratified folds are created.
  • Only First CV Model: Equivalent to splitting data into a training and testing set
    • Example: Setting CV Folds to 3 and Only First CV Model = True means you are splitting the data into 67% training and 33% testing.
  • Strategy: Feature selection strategy
    • None: No feature selection
    • FS: Feature selection based on permutations
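The "Only First CV Model" example above can be reproduced with scikit-learn's KFold: training only the first of 3 folds is effectively a 67%/33% train/validation split. A minimal sketch:

```python
import numpy as np
from sklearn.model_selection import KFold

# With 3 CV folds but only the first fold's model trained, the effective
# split is ~67% training / ~33% validation.
X = np.arange(9)
train_idx, val_idx = next(iter(KFold(n_splits=3).split(X)))
train_frac = len(train_idx) / len(X)
```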


Time

This specifies the relative time for completing the experiment.

Time | Epochs
-----|-------
1    | 10
2    | 20
3    | 30
4    | 40
5    | 50
6    | 100
7    | 150
8    | 200
9    | 300
10   | 500


Interpretability

Interpretability | Strategy | Monotonicity Constraints
-----------------|----------|-------------------------
<= 5             | None     | Disabled
> 5              | FS       | Disabled
>= 6             | FS       | Enabled
  • Monotonicity Constraints:

    If enabled, the model will satisfy knowledge about monotonicity in the data and monotone relationships between the predictors and the target variable. For example, in house price prediction, the house price should increase with lot size and number of rooms, and should decrease with crime rate in the area. If enabled, Driverless AI will automatically determine if monotonicity is present and enforce it in its modeling pipelines.