Import a dataset
Before you can import a dataset to H2O Label Genie, the dataset needs to meet the following requirements:
- The dataset data type needs to be text, image, or audio.
- Dataset format:
  - The dataset (data) for an image or audio annotation task must be in a .zip file containing the images or audio files.
    Note:
    - You can have any nested folder structure inside the .zip file.
    - All images need to have an image extension. Images can contain a mix of supported image extensions. To learn about supported image extensions, see Supported image extensions.
    - All audio files need to have an audio extension. Audio files can contain a mix of supported audio extensions. To learn about supported audio extensions, see Supported audio extensions.
  - The dataset (data) for a text annotation task must be in a .csv file.
    - One column needs to hold the text data.
- To learn about supported annotation tasks, see Supported annotation tasks.
- To learn how to annotate your dataset, see Create an annotation task.
- To learn how to download an annotated dataset, see Download an annotated dataset.
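As a minimal sketch of preparing files that meet these requirements, the following Python snippet packages a folder into a .zip file (keeping its nested folder structure) and writes a one-column .csv file for a text annotation task. The function names, the `text` column name, the header row, and all paths are illustrative assumptions; none of them come from H2O Label Genie itself.

```python
import csv
import zipfile
from pathlib import Path


def package_folder(src_dir: str, zip_path: str) -> int:
    """Zip every file under src_dir, keeping the nested folder structure.

    Write zip_path outside src_dir so the archive does not include itself.
    Returns the number of files written to the archive.
    """
    src = Path(src_dir)
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir so nested folders are preserved.
                archive.write(path, path.relative_to(src).as_posix())
                count += 1
    return count


def write_text_csv(texts, csv_path: str, column: str = "text") -> None:
    """Write the texts to a .csv file with a single text column."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([column])  # header row; the column name is an assumption
        for text in texts:
            writer.writerow([text])
```

For example, `package_folder("my_images", "my_images.zip")` would produce an archive that can then be selected in the import dialog described below.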
Instructions
To import your dataset (data) to H2O Label Genie, follow these steps:
On the H2O Label Genie navigation menu, click Datasets.
Click Import data.
In the Name box, enter a name for the dataset.
(Optional) In the Description box, enter a description for the dataset.
For Data type, choose an option.
- If the data type of the dataset you are importing is text: Select Text.
- If the data type of the dataset you are importing is image: Select Image.
- If the data type of the dataset you are importing is audio: Select Audio.
In the Source list, select the source (data connector) that you want to use (for example, S3).
- Upload
  - Click Browse..., or drag and drop the file (dataset).
- S3
  - In the S3 bucket name box, enter the name of the S3 bucket.
  - In the AWS access key box, enter the AWS access key.
    Info: You don't need to enter the AWS access key if the S3 bucket is public.
  - In the AWS secret key box, enter the AWS secret key.
    Info: You don't need to enter the AWS secret key if the S3 bucket is public.
  - In the File name list, select the file you want to use.
Click Import.
Supported image extensions
The following is a list of supported image extensions for image annotation tasks in H2O Label Genie:
- Windows bitmaps: .bmp
- JPEG files: .jpeg, .jpg, .jpe
- JPEG 2000 files: .jp2
- Portable Network Graphics: .png
- WebP: .webp
- Portable image format: .pbm, .pgm, .ppm, .pnm
- TIFF files: .tiff, .tif
- Radiance HDR: .hdr
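Before importing, it can help to check a .zip file against the list above. The following Python sketch reports archive members whose extension is not in that list. The function name is hypothetical, and matching extensions case-insensitively is an assumption, since this page does not state how H2O Label Genie treats upper-case extensions.

```python
import zipfile
from pathlib import PurePosixPath

# Extensions copied from the list of supported image extensions on this page.
SUPPORTED_IMAGE_EXTENSIONS = {
    ".bmp", ".jpeg", ".jpg", ".jpe", ".jp2", ".png", ".webp",
    ".pbm", ".pgm", ".ppm", ".pnm", ".tiff", ".tif", ".hdr",
}


def unsupported_images(zip_path: str) -> list[str]:
    """Return archive members whose extension is not in the supported set."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            info.filename
            for info in archive.infolist()
            # Skip directory entries; compare extensions in lower case (assumption).
            if not info.is_dir()
            and PurePosixPath(info.filename).suffix.lower()
            not in SUPPORTED_IMAGE_EXTENSIONS
        ]
```

An empty result means every file in the archive carries one of the listed extensions; it does not verify that the file contents are valid images.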
Supported audio extensions
The following is a list of supported audio extensions for audio annotation tasks in H2O Label Genie:
- Uncompressed: .wav, .aiff
- Lossless compressed: .flac
- Lossy compressed: .mp3, .ogg
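The grouping above can be expressed as a small lookup, sketched here in Python. The mapping mirrors the list on this page; the function name and the case-insensitive comparison are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Extension-to-category mapping copied from the list of supported audio extensions.
AUDIO_CATEGORIES = {
    ".wav": "uncompressed",
    ".aiff": "uncompressed",
    ".flac": "lossless compressed",
    ".mp3": "lossy compressed",
    ".ogg": "lossy compressed",
}


def audio_category(filename: str):
    """Return the category of a supported audio file, or None if unsupported."""
    # Extensions are compared in lower case (assumption).
    return AUDIO_CATEGORIES.get(PurePosixPath(filename).suffix.lower())
```

A return value of `None` flags a file that should be converted or removed before the .zip file is imported.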