Driverless AI Experiment Setup Wizard

The Driverless AI Experiment Setup Wizard makes it simple for you to set up a Driverless AI experiment and ensure that the experiment’s settings are optimally configured for your specific use case. The Experiment Setup Wizard helps you learn about your data and lets you provide information about your use case that is used to determine the experiment’s settings. This Wizard covers topics such as data leakage, NLP handling, validation method, model reproducibility, and model deployment.

Notes:

  • This feature is currently in an experimental state.

  • A Dataset Join Wizard that makes it simple for you to join two datasets together is also available in Driverless AI. For more information, see Dataset Join Wizard.

The following sections describe how to access and use the Driverless AI Wizard.

Accessing the Driverless AI Wizard

Choose one of the following methods to access the Driverless AI Wizard:

  • On the Datasets page, click the name of the dataset you want to use for the experiment and select Predict Wizard from the list of options.

    Predict Wizard dataset option

  • On the Experiments page, click the New Experiment button and select Wizard Setup. If this method is used, then the Driverless AI Wizard prompts you to select a dataset to use for the experiment.

    Wizard Setup option

Driverless AI Wizard sample walkthrough

The following example walks through the Driverless AI Wizard. Note that this walkthrough does not contain every possible step that the wizard offers.

  1. Select the option that best describes your role and specify how many years of experience you have with machine learning and data science. In this example, the options Data Scientist and <1 year are selected. Click Continue to proceed.

  2. Select a dataset. Select a tabular dataset with training data. Each row in the dataset must contain predictor variables (features) that can be used to predict the target column. In this example, the Rain in Australia dataset is selected.

  3. Select a problem type and target column. Specify a problem type and a target column for that problem type. Note that you can select a target column for only one of the available problem types. The goal in this example is to use the Rain in Australia dataset to predict next-day rain by training classification models, so RainTomorrow is specified as the target column in the Binary Classification section. Click Continue to proceed.

  4. Target column analysis. The Driverless AI Wizard provides information about the selected target column and prompts you to confirm that the target column looks as expected. Click Yes to proceed, or click No to return to the previous page and select a different column.

  5. Exclude columns. The Driverless AI Wizard prompts you to check for columns to drop from the experiment. Dropped columns are not used as predictors for the target column. If you already know which column(s) you want to drop, click the Yes, I want to have a look button to select them. If you want to proceed without dropping any columns, click the No, don’t drop any columns button.

  6. Model deployment. The Driverless AI Wizard prompts you to specify the deployment scenario that you need to support. Select one of the following options. (Note that H2O MLOps supports all deployment artifacts, including Python, C++ MOJO, and Java MOJO. For more details, refer to the support matrix in the setup wizard.)

    • MLOps/Python: MLOps in H2O AI Cloud or standalone Python. Supported by all models.

    • Java MOJO: Low latency, standalone, runs anywhere. Only for some models.

    • C++ MOJO - Triton/Python/R: Low latency, standalone, easy integration. For most models.

    Experiment setup wizard model deployment step

  7. Importance of time order. If your dataset contains at least one date or datetime column that doesn’t contain missing values, the Driverless AI Wizard prompts you to specify how important time order is to the experiment. In this example, the Time order doesn’t matter option is selected.

  8. Provide a test set. Specify a test set to use for the experiment. You can select an existing test set, create a test set from the training data, or skip this step entirely. To refresh the list of available datasets, click the Refresh dataset list button. In this example, the Create test set from training data option is selected.

  9. Split the training data. Use the slider to specify what fraction of the training dataset you want to use for testing. The Driverless AI Wizard automatically suggests a percentage based on the size of your training dataset. In this example, 15 percent of the training dataset is used for testing. Click Split my training data to proceed.

  10. Confirm the train / test split. The Driverless AI Wizard lists the following information for both the training and testing data based on the percentage specified in the preceding step:

    • The size of each dataset.

    • The number of rows and columns in each dataset.

    • Whether either dataset has any temporal order.

     If this information looks as expected, click Yes to continue. Otherwise, click I need to make changes to return to the Provide a test set step (step 8).
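
To make the 15 percent split in this example concrete, the following is a minimal Python sketch of a random hold-out split. It is an illustration only and is not how Driverless AI performs the split internally. The function name split_train_test, the seed value, and the synthetic rows are assumptions made for this example.

```python
import random


def split_train_test(rows, test_fraction=0.15, seed=42):
    """Shuffle a copy of rows and hold out test_fraction of them for testing.

    Hypothetical helper for illustration; not part of Driverless AI.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Synthetic stand-in for a training dataset with a binary target column.
rows = [{"id": i, "RainTomorrow": i % 4 == 0} for i in range(1000)]
train, test = split_train_test(rows, test_fraction=0.15)
print(len(train), len(test))
```

A purely random split like this one is only reasonable when time order does not matter, as chosen in this walkthrough. If time order matters, the test rows would normally be taken from the most recent part of the data instead.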

  11. Select a model type. Specify a model type based on settings for Accuracy, Time, and Interpretability, as well as training time and deployment size. You can also optionally specify whether you have strict runtime limits or if you want to limit the complexity of the model. In this example, the Keep it simple option is selected. Click Continue to proceed.

  12. Select a scorer. Specify a scorer to optimize. In this example, Area under ROC Curve (AUC) is selected. Click Continue to proceed.

  13. Experiment parameters. The Driverless AI Wizard lists all of the experiment parameters that have been configured up until this point. From this page, you can specify a name for the experiment and begin training, show additional details about the experiment (Python code and Expert Settings), or cancel the experiment and restart from the beginning of the wizard. In this example, Start Training is selected.

  14. The experiment now appears on the Experiments page in Driverless AI. You can view the progress of the experiment and click a link that takes you to the experiment in Driverless AI.

Driverless AI Wizard walkthrough
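
For readers unfamiliar with the AUC scorer selected in the Select a scorer step, the following Python sketch shows one common way to compute it for a binary target: as the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. This is a generic illustration, not the Driverless AI implementation. The function name roc_auc and the sample labels and scores are assumptions made for this example.

```python
def roc_auc(labels, scores):
    """Return the AUC as the share of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: iterable of predicted scores, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical predictions for five rows; 1 means it rained the next day.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.8]
print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in the number of rows, so it is suitable only for small examples. A value of 1.0 means every positive row is scored above every negative row, and 0.5 corresponds to random ranking.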