Transforming Another Dataset

When a training dataset is used in an experiment, Driverless AI transforms the data into an improved, feature-engineered dataset. (Refer to Driverless AI Transformations for more information about the transformations that are provided in Driverless AI.) But what happens when new rows are added to your dataset? In this case, you can add the new dataset to Driverless AI and transform it, and the same transformations that Driverless AI applied to the original dataset will be applied to the new rows.

Follow these steps to transform another dataset. Note that this functionality provides the pipeline (engineered features) of the best individual model of the experiment.

Note: Transform Another Dataset is not available for Time Series experiments.

  1. On the completed experiment page for the original dataset, click the Transform Dataset option under the Model Action tab.

  2. Select the new training dataset that you want to transform. Note that this must have the same number of columns as the original dataset.

  3. In the Select drop-down, specify a validation dataset to use with this dataset, or choose to split the training data. If you choose to split the data, also specify the split value (defaults to 25%) and the seed (defaults to 1234). Note: To ensure the transformed dataset respects the row order, choose a validation dataset instead of splitting the training data. Splitting the training data shuffles the row order.

  4. Optionally specify a test dataset. If specified, then the output also includes the final test dataset for final scoring.

  5. Click Launch Transformation.
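
The two constraints in the steps above (matching column counts, and the row-order effect of splitting) can be illustrated with a small sketch. This is not Driverless AI code and does not reproduce its internal split: the function names `same_column_count` and `split_rows` are hypothetical, and the split is a plain seeded shuffle using Python's standard library. Only the default values (25% split, seed 1234) come from the steps above.

```python
import random


def same_column_count(original_header, new_header):
    """Hypothetical pre-check: the new dataset must have the same
    number of columns as the original dataset."""
    return len(original_header) == len(new_header)


def split_rows(rows, validation_fraction=0.25, seed=1234):
    """Illustrative seeded random split (an assumption, not the
    Driverless AI implementation).

    Rows are assigned to the training and validation parts by shuffling
    their indices, so the original row order is generally not kept.
    """
    rng = random.Random(seed)           # fixed seed makes the split repeatable
    indices = list(range(len(rows)))
    rng.shuffle(indices)                # this shuffle is what changes row order
    n_valid = int(round(len(rows) * validation_fraction))
    valid = [rows[i] for i in indices[:n_valid]]
    train = [rows[i] for i in indices[n_valid:]]
    return train, valid


original_header = ["age", "income", "target"]
new_header = ["age", "income", "target"]
rows = list(range(8))                   # stand-in for eight data rows

columns_ok = same_column_count(original_header, new_header)
train, valid = split_rows(rows)
```

With eight rows and the default 25% split, six rows go to training and two to validation. Re-running with the same seed yields the same split, but the rows within each part follow the shuffled order. Supplying a separate validation dataset avoids the shuffle entirely, which is why the note in step 3 recommends it when row order matters.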

The following datasets will be available for download upon successful completion:

  • Training dataset (not for cross validation)

  • Validation dataset for parameter tuning

  • Test dataset for final scoring. This option is available if a test dataset was used.