Transform Another Dataset

When a training dataset is used in an experiment, Driverless AI transforms the data into an improved, feature-engineered dataset. (Refer to Driverless AI Transformations for more information about the transformations that are provided in Driverless AI.) But what happens when new rows are added to your dataset? In this case, you can add the new dataset to Driverless AI and then transform it. The same transformations that Driverless AI applied to the original dataset will be applied to the new rows.

Follow these steps to transform another dataset. Note that this assumes the new dataset has been added to Driverless AI already.

Note: Transform Another Dataset is not available for Time Series experiments.

  1. On the completed experiment page for the original dataset, click the Transform Another Dataset button.

  2. Select the new training dataset that you want to transform. Note that this must have the same number of columns as the original dataset.

  3. In the Select drop-down, specify a validation dataset to use with this dataset, or choose to split the training data. If you split the data, also specify the split value (defaults to 25%) and the seed (defaults to 1234). Note: To ensure that the transformed dataset preserves the row order, choose a validation dataset instead of splitting the training data. Splitting the training data shuffles the row order.

  4. Optionally specify a test dataset. If specified, the output also includes the final test dataset for final scoring.

  5. Click Launch Transformation.

    Transform another dataset
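The note in step 3 about row order can be illustrated with a small sketch. This is not Driverless AI's internal implementation. The function name `split_rows` and its parameters are hypothetical, and the only details taken from the steps above are the default split value (25%) and the default seed (1234). The sketch shows why a seeded random split is repeatable but does not keep rows in their original order.

```python
import random


def split_rows(rows, validation_fraction=0.25, seed=1234):
    """Hypothetical helper: seeded shuffle, then cut off a validation share.

    The defaults mirror the documented split value (25%) and seed (1234);
    the splitting logic itself is an assumption made for illustration.
    """
    indices = list(range(len(rows)))
    # A fixed seed makes the shuffle, and so the split, repeatable.
    random.Random(seed).shuffle(indices)
    n_valid = int(round(len(rows) * validation_fraction))
    valid = [rows[i] for i in indices[:n_valid]]
    train = [rows[i] for i in indices[n_valid:]]
    return train, valid


rows = [f"row_{i}" for i in range(8)]
train, valid = split_rows(rows)

print("training rows:  ", train)
print("validation rows:", valid)
```

Because the rows are drawn from shuffled indices, neither output list is guaranteed to follow the original row order. Supplying a separate validation dataset avoids the shuffle altogether, which is why step 3 recommends it when row order matters.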

The following datasets will be available for download upon successful completion:

  • Training dataset (not for cross validation)

  • Validation dataset for parameter tuning

  • Test dataset for final scoring. This dataset is available only if a test dataset was specified.

    Transform complete