Dataset options

The following options are available for every dataset on the Datasets page. To view them, click the Click for Actions button next to any dataset listed on the Datasets page.

  • Details: View detailed information about the dataset. For more information, see Viewing dataset details.

  • Visualize: View a variety of visualizations generated by Driverless AI using the dataset. For more information, see Visualizing Datasets.

  • Summarize: Generate a summary of the dataset by leveraging Driverless AI’s integration with h2oGPT models. For more information, see h2oGPT integration. Note that in order to use this option, you must do the following:

    • Enable GPT functionality by setting enable_gpt=true. By default, this option is set to false.

    • Specify your h2oGPT endpoint with the h2ogpt_url configuration option. (Note that OpenAI can also be used if you explicitly opt in by setting the openai_api_secret_key configuration option to your OpenAI API key.)

    • For some h2oGPT URLs, specifying an h2oGPT key is required to enable authorized access for GPT-related tasks. To specify an h2oGPT key, set the h2ogpt_key configuration option.

    [Figure: Dataset summary]

  • Data Prep:

    • Split: Split the dataset into two subsets. For more information, see Split datasets.

    • Split by Time Wizard: Split a time series dataset into train and test sets by specifying an exact starting point in time for the test set. If the dataset has only a single time column, that time column is automatically selected for the time series split. If the dataset has multiple time columns, you can select which time column you want to use for the time series split.

    • Join Wizard: Opens the Driverless AI dataset join wizard. For more information, see Dataset Join Wizard.

    • Transform dataset: Opens a page that lets you edit specific values in a dataset. To confirm your changes, click the Save button. To undo your most recent change, click the Undo button. You can also reset all your changes by clicking the Reset button.

    • Apply Existing Data Recipe: Select a previously uploaded data recipe to apply to the dataset.

    • Live Code: Manually enter custom recipe code that modifies the dataset. Click the Get Preview button to preview the code’s effect on the dataset, then click Apply to create a new dataset. To download the entered code as a Python file, click the Download button.

    • Data Recipe URL: Load a data recipe from a URL and apply it to the dataset.

    • Upload Data Recipe: Select a data recipe from your local file system to upload to Driverless AI.

    Note: For more information on modifying datasets, see Modifying Datasets.

  • Predict: Opens the Experiment Setup page and automatically specifies the selected dataset as the training dataset.

  • Predict Wizard: Opens the Driverless AI experiment setup wizard. For more information, see Driverless AI Experiment Setup Wizard.

  • Rename: Rename the dataset.

  • Download: Download the dataset to your local file system.

  • Display Logs: View logs relating to the dataset.

  • Delete: Delete the dataset from the list of datasets on the Datasets page.
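
The prerequisites listed under the Summarize option can be read as a short checklist. The following is a minimal sketch of that checklist, not part of Driverless AI: the function name, the plain dict standing in for configuration values, and the endpoint_requires_key flag are all hypothetical and used only for illustration. It also assumes that an OpenAI API key in openai_api_secret_key can stand in for an h2oGPT URL.

```python
def missing_summarize_prerequisites(config):
    """Return a list of unmet prerequisites for the Summarize option.

    `config` is a plain dict standing in for Driverless AI configuration
    values. This helper is hypothetical and not part of Driverless AI.
    """
    missing = []

    # GPT functionality is off by default and must be enabled explicitly.
    if config.get("enable_gpt", False) is not True:
        missing.append("enable_gpt must be set to true")

    # An endpoint is needed: an h2oGPT URL, or (opt-in) an OpenAI API key.
    has_h2ogpt = bool(config.get("h2ogpt_url"))
    has_openai = bool(config.get("openai_api_secret_key"))
    if not (has_h2ogpt or has_openai):
        missing.append("set h2ogpt_url (or opt in with openai_api_secret_key)")

    # Only some h2oGPT URLs require a key, so the caller states whether this
    # one does through the hypothetical `endpoint_requires_key` flag.
    needs_key = config.get("endpoint_requires_key", False)
    if has_h2ogpt and needs_key and not config.get("h2ogpt_key"):
        missing.append("h2ogpt_key is required for this h2oGPT endpoint")

    return missing
```

An empty list means every prerequisite described above is met. The URL passed in would be your own h2oGPT endpoint.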

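
The idea behind the Split by Time Wizard option, choosing an exact starting point in time for the test set, can be illustrated outside the product. The following is a rough sketch in plain Python and does not show how Driverless AI implements the split. The function name, the sample rows, and the choice to place rows that fall exactly on the starting point into the test set are assumptions.

```python
from datetime import datetime


def split_by_time(rows, time_column, test_start):
    """Split rows into (train, test) using an exact test-set start time.

    Rows whose time value is at or after `test_start` go to the test set
    (an assumption of this sketch); all earlier rows go to the train set.
    """
    train, test = [], []
    for row in rows:
        if row[time_column] >= test_start:
            test.append(row)
        else:
            train.append(row)
    return train, test


# Hypothetical sample data with a single time column named "date".
rows = [
    {"date": datetime(2023, 1, 1), "sales": 10},
    {"date": datetime(2023, 2, 1), "sales": 12},
    {"date": datetime(2023, 3, 1), "sales": 9},
    {"date": datetime(2023, 4, 1), "sales": 15},
]

train, test = split_by_time(rows, "date", datetime(2023, 3, 1))
```

Here the first two rows land in the train set and the last two in the test set. With multiple time columns, the time_column argument plays the role of the column you would select in the wizard.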