Import a dataset
Before you can import your dataset to H2O Label Genie, the dataset needs to meet the following requirements:
- The dataset data type needs to be text, image, or audio.
- Dataset format:
  - The dataset (data) for an image or audio annotation task must be in a .zip file containing the images or audio files.
    Note:
    - You can have any nested folder structure inside the .zip file.
    - All images need to have an image extension. The .zip file can contain a mix of supported image extensions. To learn about supported image extensions, see Supported image extensions.
    - All audio files need to have an audio extension. The .zip file can contain a mix of supported audio extensions. To learn about supported audio extensions, see Supported audio extensions.
  - The dataset (data) for a text annotation task must be in a .csv file.
    - One column needs to hold the text data.
- To learn about supported annotation tasks, see Supported annotation tasks.
- To learn how to annotate your dataset, see Create an annotation task.
- To learn how to download an annotated dataset, see Download an annotated dataset.
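The format requirements above can be met with a short preparation script. The following is a minimal sketch, not part of H2O Label Genie: the function names, the `text` column name, and the presence of a header row are all assumptions made for illustration. It writes a single-column .csv file for a text task and packs a folder of images or audio files into a .zip file while keeping its nested folder structure.

```python
import csv
import zipfile
from pathlib import Path


def write_text_dataset(rows, csv_path, text_column="text"):
    """Write an iterable of strings to a .csv file with one text column."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Header row; the column name "text" is a hypothetical choice.
        writer.writerow([text_column])
        for row in rows:
            writer.writerow([row])
    return Path(csv_path)


def zip_media_folder(folder, zip_path):
    """Pack every file under `folder` into a .zip, keeping the nested layout."""
    folder = Path(folder)
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            # Skip directories and the archive itself if it lives in `folder`.
            if path.is_file() and path.resolve() != zip_path.resolve():
                # Store paths relative to `folder` so nesting is preserved.
                archive.write(path, path.relative_to(folder).as_posix())
    return zip_path
```

For example, `zip_media_folder("images", "images.zip")` produces an archive that can then be selected in the import dialog described below.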
Instructions
To import your dataset (data) to H2O Label Genie, follow these steps:
- On the H2O Label Genie navigation menu, click Datasets.
- Click Import data.
- In the Name box, enter a name for the dataset.
- (Optional) In the Description box, enter a description for the dataset.
- For Data type, choose an option.
  - If the data type of the dataset you are importing is text: Select Text.
  - If the data type of the dataset you are importing is image: Select Image.
  - If the data type of the dataset you are importing is audio: Select Audio.
- In the Source list, select the source (data connector) that you want to use (e.g., S3).
  - Upload:
    - Click Browse..., or drag and drop the file (dataset).
  - S3:
    - In the S3 bucket name box, enter the name of the S3 bucket.
    - In the AWS access key box, enter the AWS access key.
      Info: You don't need to enter the AWS access key if the S3 bucket is public.
    - In the AWS secret key box, enter the AWS secret key.
      Info: You don't need to enter the AWS secret key if the S3 bucket is public.
    - In the File name list, select the file you want to use.
- Click Import.
Supported image extensions
The following is a list of supported image extensions for image annotation tasks in H2O Label Genie:
- Windows bitmaps: .bmp
- JPEG files: .jpeg, .jpg, .jpe
- JPEG 2000 files: .jp2
- Portable Network Graphics: .png
- WebP: .webp
- Portable image format: .pbm, .pgm, .ppm, .pnm
- TIFF files: .tiff, .tif
- Radiance HDR: .hdr
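Before importing, it can help to confirm that every file in the .zip file uses one of the extensions listed above. The following is a minimal sketch of such a check; the function name is hypothetical, and the case-insensitive comparison is an assumption, since this page does not state whether uppercase extensions such as .PNG are accepted.

```python
import zipfile
from pathlib import PurePosixPath

# Extensions copied from the list of supported image extensions.
SUPPORTED_IMAGE_EXTENSIONS = {
    ".bmp", ".jpeg", ".jpg", ".jpe", ".jp2", ".png", ".webp",
    ".pbm", ".pgm", ".ppm", ".pnm", ".tiff", ".tif", ".hdr",
}


def unsupported_members(zip_path, supported=SUPPORTED_IMAGE_EXTENSIONS):
    """Return the archive members whose extension is not in `supported`."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            info.filename
            for info in archive.infolist()
            # Folder entries carry no extension, so skip them.
            if not info.is_dir()
            # Assumption: extensions are compared case-insensitively.
            and PurePosixPath(info.filename).suffix.lower() not in supported
        ]
```

An empty result means every file in the archive, at any folder depth, has a listed image extension.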
Supported audio extensions
The following is a list of supported audio extensions for audio annotation tasks in H2O Label Genie:
- Uncompressed: .wav, .aiff
- Lossless compressed: .flac
- Lossy compressed: .mp3, .ogg
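To get a quick overview of an audio archive before importing it, you can count its files per extension and flag any extension that is not listed above. The following is a minimal sketch; the function name is hypothetical, and lowercasing extensions before comparison is an assumption rather than documented behavior.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Extensions copied from the list of supported audio extensions.
SUPPORTED_AUDIO_EXTENSIONS = {".wav", ".aiff", ".flac", ".mp3", ".ogg"}


def summarize_audio_archive(zip_path):
    """Count archive files per extension and list the unsupported extensions."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if not info.is_dir():
                # Assumption: extensions are compared case-insensitively.
                counts[PurePosixPath(info.filename).suffix.lower()] += 1
    unsupported = sorted(
        ext for ext in counts if ext not in SUPPORTED_AUDIO_EXTENSIONS
    )
    return dict(counts), unsupported
```

Files without any extension are reported under the empty string, which also appears in the unsupported list.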