Import a dataset
Overview
H2O Hydrogen Torch lets you import your dataset using several supported data connectors.
H2O Hydrogen Torch requires your data to follow a specific format based on the problem type the data aims to solve. To learn more, see Dataset formats.
Instructions
To import a dataset to H2O Hydrogen Torch, follow these steps:
In the H2O Hydrogen Torch navigation menu, click Import dataset.
In the Source list, select the source (data connector) that you want to use (for example, AWS S3).
- AWS S3
  - In the S3 bucket name box, enter the name of the S3 bucket.
  - In the AWS access key box, enter the AWS access key.
    Note: You don't need to enter the AWS access key if the S3 bucket is public.
  - In the AWS secret key box, enter the AWS secret key.
    Note: You don't need to enter the AWS secret key if the S3 bucket is public.
  - In the File name list, select the file you want to use.
- Google Cloud Storage
  - In the GCS bucket name box, enter the name of the Google Cloud Storage bucket.
  - In the GCS Service Account JSON box, enter the content of the Google Cloud Service Account JSON file.
  - In the File name list, select the file you want to use.
- Kaggle
  - In the Kaggle API command box, enter a Kaggle API command.
  - In the Kaggle username box, enter your username.
  - In the Kaggle secret key box, enter your Kaggle secret key.
- Azure Datalake
  - In the Datalake connection string box, enter the Datalake connection string.
  - In the Datalake container name box, enter the Datalake container name.
  - In the File name box, enter the file name.
- H2O Drive
  - In the File name list, select a dataset.
- Upload
  - Click Browse and select the file (dataset), or drag and drop the file.
  - Click Upload.
  - Skip step 3 (the Click Continue step that follows).
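For the Google Cloud Storage source, the GCS Service Account JSON box expects the full content of a key file. As a minimal sketch, assuming the standard Google Cloud service account key layout (with fields such as type, project_id, private_key, and client_email), a local check like the one below can catch a truncated or wrong file before you paste it. The function name check_service_account_json and the set of fields it checks are illustrative choices, not part of H2O Hydrogen Torch.

```python
import json
from typing import List

# Fields found in a standard Google Cloud service account key file.
# Which fields to check is an illustrative choice (assumption),
# not a documented H2O Hydrogen Torch requirement.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_json(text: str) -> List[str]:
    """Return a list of problems found in the key content (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    # Report every required field that is absent.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    # A present but wrong "type" usually means a different kind of credential file.
    if data.get("type") not in (None, "service_account"):
        problems.append("field 'type' is not 'service_account'")
    return problems


if __name__ == "__main__":
    sample = '{"type": "service_account", "project_id": "demo"}'
    # Prints the two fields missing from this incomplete sample.
    print(check_service_account_json(sample))
```

An empty list means the content at least has the expected shape; it does not confirm that the key is active or that it grants access to the bucket.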
Click Continue.
Define the import dataset settings according to the dataset's problem type.
Note:
- After you import a dataset through one of the supported data connectors, H2O Hydrogen Torch explores the content of the dataset and automatically defines the dataset settings for its problem type. Before you save these settings, you can correct any value or option that was inferred incorrectly.
- A few dataset settings must be defined before the dataset can be imported and used for an experiment. The required settings depend on the structure and content of the dataset; in other words, on the problem type the dataset aims to solve. To learn about the required import dataset settings for each supported problem type, see Import dataset settings.
Click Continue.
Again, click Continue.
Note: Before you click Continue, review the dataset preview.
To learn how to edit the settings of a saved (imported) dataset, see Edit dataset.