Version: v1.3.0

Import a dataset


H2O Hydrogen Torch requires your data to be in a specific format based on the problem type the data aims to solve. To learn more, see Dataset formats.


To import a dataset to H2O Hydrogen Torch, follow these steps:

  1. In the H2O Hydrogen Torch navigation menu, click Import dataset.

  2. In the Source list, select the source (data connector) that you want to use (for example, AWS S3).

    1. In the S3 bucket name box, enter the name of the S3 bucket.
    2. In the AWS access key box, enter the AWS access key.

      You don't need to enter the AWS access key if the S3 bucket is public.

    3. In the AWS secret key box, enter the AWS secret key.

      You don't need to enter the AWS secret key if the S3 bucket is public.

    4. In the File name list, select the file you want to use.
  3. Click Continue.

  4. Define the import dataset settings according to the dataset's problem type.

    • After you import a dataset through one of the supported data connectors, H2O Hydrogen Torch automatically defines the dataset settings for the problem type by exploring the content of the imported dataset. Before you save these settings, you can correct any value or option that was inferred incorrectly.
    • Certain dataset settings must be defined before the dataset can be imported and used for an experiment. The required settings depend on the structure and content of the dataset; in other words, they depend on the problem type the dataset aims to solve. To learn about the required import dataset settings for a supported problem type, see Import dataset settings.
  5. Click Continue.

  6. Click Continue again.


    Before you click Continue, please review the dataset preview.
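
The rules for the AWS S3 fields in step 2 can be summarized in a short Python sketch. This is not part of H2O Hydrogen Torch or its API: the class name `S3SourceSettings`, the function `missing_fields`, and the bucket, file, and key values are hypothetical and used only for illustration. The sketch assumes the bucket name and file name are always needed, and that the access key and secret key can be left empty only when the bucket is public.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class S3SourceSettings:
    """Hypothetical container mirroring the fields of the AWS S3 import form."""
    bucket_name: str
    file_name: str
    access_key: Optional[str] = None
    secret_key: Optional[str] = None
    bucket_is_public: bool = False


def missing_fields(settings: S3SourceSettings) -> List[str]:
    """Return the names of the form fields that still need a value."""
    missing = []
    if not settings.bucket_name.strip():
        missing.append("S3 bucket name")
    if not settings.file_name.strip():
        missing.append("File name")
    # Assumption taken from the notes in step 2: keys are only
    # needed when the S3 bucket is not public.
    if not settings.bucket_is_public:
        if not settings.access_key:
            missing.append("AWS access key")
        if not settings.secret_key:
            missing.append("AWS secret key")
    return missing


# Placeholder values; replace them with your own bucket and file names.
public = S3SourceSettings("my-public-bucket", "dataset.zip", bucket_is_public=True)
private = S3SourceSettings("my-private-bucket", "dataset.zip", access_key="EXAMPLE_ACCESS_KEY")

print(missing_fields(public))   # []
print(missing_fields(private))  # ['AWS secret key']
```

A check like this can be useful when you prepare connector values in advance, for example to confirm that you have both keys for a private bucket before you open the Import dataset page.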


To learn how to edit the settings of a saved (imported) dataset, see Edit dataset.