Dataset connectors
Overview
H2O Hydrogen Torch provides a number of data connectors to access external data sources.
Supported dataset connectors
Note
- Each data connector requires either a single .csv file or the data to be in a .zip file for a successful import.
- The format of a dataset differs for different problem types. For more information, see Dataset formats.
- Before a dataset can be used for an experiment, you need to specify a set of dataset settings during import. The required settings depend on the structure and content of the dataset. For more information, see Import dataset settings.
- For the S3 and Kaggle connectors, you can save your AWS and Kaggle credentials in your H2O Hydrogen Torch instance to avoid re-entering frequently used credentials. For more information, see App settings.
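As a minimal sketch of the first note above, the following Python snippet bundles a .csv file and a folder of data files into a single .zip archive before import. The helper name `bundle_dataset` and the paths in the usage note are hypothetical and not part of H2O Hydrogen Torch; the layout expected inside the archive depends on the problem type (see Dataset formats).

```python
import zipfile
from pathlib import Path


def bundle_dataset(csv_path, data_dir, zip_path):
    """Write csv_path and every file under data_dir into one .zip archive.

    Hypothetical helper for illustration; not part of H2O Hydrogen Torch.
    """
    csv_path, data_dir, zip_path = Path(csv_path), Path(data_dir), Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the CSV at the top level of the archive.
        archive.write(csv_path, arcname=csv_path.name)
        # Store data files under the folder's own name, keeping relative paths.
        for file in sorted(data_dir.rglob("*")):
            if file.is_file():
                arcname = Path(data_dir.name) / file.relative_to(data_dir)
                archive.write(file, arcname=arcname.as_posix())
    return zip_path
```

For example, `bundle_dataset("train.csv", "images", "dataset.zip")` would produce an archive containing `train.csv` and an `images/` folder. Write the archive outside the data folder so that it is not added to itself.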
Upload (Standard upload feature)
The following parameter is required:
- File location
AWS S3 (Amazon AWS S3)
The following parameters are required:
- S3 bucket name
- AWS access key
- AWS secret key
- File name
Azure Data Lake (Microsoft Azure Data Lake Gen2)
The following parameters are required:
- Data lake connection string
- Data lake container name
- File name
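Before pasting a connection string into the connector, it can help to check that it is well formed. The sketch below assumes the common account-key style of Azure Storage connection string, which is a semicolon-separated list of `Key=Value` pairs. The function name `parse_connection_string`, the example values, and the choice of keys to check are illustrative assumptions; a SAS-based connection string uses different keys.

```python
def parse_connection_string(connection_string):
    """Split a 'Key=Value;Key=Value' connection string into a dict."""
    settings = {}
    for part in connection_string.strip().split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: account keys often end in "=" padding.
        key, sep, value = part.partition("=")
        if not sep or not key.strip():
            raise ValueError("Malformed segment in connection string")
        settings[key.strip()] = value
    return settings


# Placeholder values only; never hard-code real credentials.
example = (
    "DefaultEndpointsProtocol=https;AccountName=myaccount;"
    "AccountKey=abc123==;EndpointSuffix=core.windows.net"
)
parsed = parse_connection_string(example)
# Assumed minimum for an account-key style string.
missing = {"AccountName", "AccountKey"} - parsed.keys()
print("missing keys:", sorted(missing))
```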
Kaggle (Kaggle datasets)
The following parameters are required:
- Kaggle API command
- Kaggle username
- Kaggle secret key
H2O Drive (H2O.ai's data storage)
The following parameters are required:
- File name
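The required parameters listed above can be collected into a simple checklist. The following Python sketch is only an illustration for preparing values before filling in the import form; `REQUIRED_PARAMETERS` and `missing_parameters` are hypothetical names and do not correspond to an H2O Hydrogen Torch API.

```python
# Required parameters per connector, as listed on this page.
REQUIRED_PARAMETERS = {
    "Upload": ["File location"],
    "AWS S3": ["S3 bucket name", "AWS access key", "AWS secret key", "File name"],
    "Azure Data Lake": [
        "Data lake connection string",
        "Data lake container name",
        "File name",
    ],
    "Kaggle": ["Kaggle API command", "Kaggle username", "Kaggle secret key"],
    "H2O Drive": ["File name"],
}


def missing_parameters(connector, values):
    """Return the required parameters that are absent or blank in values."""
    required = REQUIRED_PARAMETERS[connector]
    return [name for name in required if not str(values.get(name, "")).strip()]


# Example: an S3 import with the credentials not yet filled in.
s3_values = {"S3 bucket name": "my-bucket", "File name": "train.zip"}
print(missing_parameters("AWS S3", s3_values))
```

Keeping the checklist as plain data makes it easy to update if a connector's required parameters change.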