Version: v0.2.0

Import a dataset

Before you can import a dataset into H2O Label Genie, the dataset must meet the following requirements:

  1. The dataset data type needs to be text, image, or audio.
  2. Dataset format:
    • The dataset (data) for an image or audio annotation task must be in a .zip file containing the image or audio files.
      Note
      • You can have any nested folder structure inside the .zip file.
      • All images need to have an image extension. Images can contain a mix of supported image extensions. To learn about supported image extensions, see Supported image extensions.
      • All audio files need to have an audio extension. Audio files can contain a mix of supported audio extensions. To learn about supported audio extensions, see Supported audio extensions.
    • The dataset (data) for a text annotation task must be in a .csv file.
      • One column needs to hold the text data.
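
A quick local check before importing a text dataset can catch a missing text column. The following is a minimal sketch and not part of H2O Label Genie: the function name `count_text_rows` and the column name used in the usage note are hypothetical, and the sketch assumes a UTF-8 encoded .csv file with a header row.

```python
import csv


def count_text_rows(csv_path, text_column):
    """Return how many rows hold non-empty text in `text_column`.

    Raises ValueError if the header does not contain `text_column`.
    Assumption: the file is UTF-8 encoded and its first row is a header.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        if reader.fieldnames is None or text_column not in reader.fieldnames:
            raise ValueError(
                f"column {text_column!r} not found in header {reader.fieldnames}"
            )
        # Short rows yield None for missing cells, so fall back to "".
        return sum(1 for row in reader if (row[text_column] or "").strip())
```

For example, `count_text_rows("reviews.csv", "text")` would report how many rows of a hypothetical `reviews.csv` carry text to annotate.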

Instructions

To import your dataset (data) to H2O Label Genie, follow these steps:

  1. On the H2O Label Genie navigation menu, click Datasets.

  2. Click Import data.

  3. In the Name box, enter a name for the dataset.

  4. (Optional) In the Description box, enter a description for the dataset.

  5. For Data type, choose an option.

    • If the data type of the dataset you are importing is text: Select Text.
    • If the data type of the dataset you are importing is image: Select Image.
    • If the data type of the dataset you are importing is audio: Select Audio.
  6. In the Source list, select the source (data connector) that you want to use (e.g., S3).

    1. Click Browse....

    2. Alternatively, drag and drop the file (dataset).
  7. Click Import.

Supported image extensions

The following is a list of supported image extensions for image annotation tasks in H2O Label Genie:

  • Windows bitmaps: .bmp
  • JPEG files: .jpeg, .jpg, .jpe
  • JPEG 2000 files: .jp2
  • Portable Network Graphics: .png
  • WebP: .webp
  • Portable image format: .pbm, .pgm, .ppm, .pnm
  • TIFF files: .tiff, .tif
  • Radiance HDR: .hdr
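
The list above can be used to check a .zip file locally before importing it. The following is a minimal sketch and not part of H2O Label Genie: the function name `unsupported_entries` is hypothetical, and matching extensions case-insensitively is an assumption, since this page does not state whether uppercase extensions such as .JPG are accepted.

```python
import zipfile
from pathlib import PurePosixPath

# Extensions copied from the list of supported image extensions.
SUPPORTED_IMAGE_EXTENSIONS = {
    ".bmp", ".jpeg", ".jpg", ".jpe", ".jp2", ".png", ".webp",
    ".pbm", ".pgm", ".ppm", ".pnm", ".tiff", ".tif", ".hdr",
}


def unsupported_entries(zip_path, supported=SUPPORTED_IMAGE_EXTENSIONS):
    """Return the archive entries whose extension is not in `supported`.

    Folder entries are skipped, so any nested structure is allowed.
    Assumption: extensions are compared case-insensitively.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if not info.is_dir()
            and PurePosixPath(info.filename).suffix.lower() not in supported
        ]
```

An empty result suggests that every file in the archive has one of the listed image extensions.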

Supported audio extensions

The following is a list of supported audio extensions for audio annotation tasks in H2O Label Genie:

  • Uncompressed: .wav, .aiff
  • Lossless compressed: .flac
  • Lossy compressed: .mp3, .ogg
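
Because an audio dataset must be imported as a .zip file, it can help to build the archive from a local folder while keeping only files with a listed extension. The following is a minimal sketch and not part of H2O Label Genie: the function name `zip_audio_folder` is hypothetical, and matching extensions case-insensitively is an assumption that this page does not confirm.

```python
import zipfile
from pathlib import Path

# Extensions copied from the list of supported audio extensions.
SUPPORTED_AUDIO_EXTENSIONS = {".wav", ".aiff", ".flac", ".mp3", ".ogg"}


def zip_audio_folder(folder, zip_path):
    """Archive the supported audio files under `folder` into `zip_path`.

    Nested folders are preserved inside the archive. Files with other
    extensions are skipped. Returns the archived relative paths.
    """
    folder = Path(folder)
    archived = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            if path.is_file() and path.suffix.lower() in SUPPORTED_AUDIO_EXTENSIONS:
                relative = path.relative_to(folder).as_posix()
                archive.write(path, arcname=relative)
                archived.append(relative)
    return archived
```

The returned list shows which files were included, which makes it easy to spot recordings that were skipped because of their extension.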
