Version: v1.4.0

Dataset format: Image semantic segmentation

Formats
Examples
Helper functions
Format conversions

H2O Hydrogen Torch supports several dataset (data) formats for an image semantic segmentation experiment. Supported formats are as follows:

Hydrogen Torch format
COCO format

The data following the Hydrogen Torch format for an image semantic segmentation experiment is structured as follows: A zip file (1) containing a Parquet file (2) and an image folder (3):

folder_name.zip (1)
│   └───pq_name.pq (2)
│   │
│   └───image_folder_name (3)
│       └───name_of_image.image_extension
│       └───name_of_image.image_extension
│       └───name_of_image.image_extension
│       ...
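
The layout above can be assembled with Python's standard `zipfile` module. The following is a minimal sketch, not part of H2O Hydrogen Torch: the function name `build_dataset_zip` and its arguments are hypothetical, and it assumes the Parquet file and the image folder already exist on disk.

```python
import zipfile
from pathlib import Path


def build_dataset_zip(parquet_path, image_dir, zip_path):
    """Write one Parquet file and one image folder into a zip archive.

    Hypothetical helper: the names and arguments are illustrative only.
    """
    parquet_path, image_dir = Path(parquet_path), Path(image_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # The Parquet file sits at the top level of the archive
        archive.write(parquet_path, arcname=parquet_path.name)
        # All images go under a single folder named after image_dir
        for image in sorted(image_dir.iterdir()):
            if image.is_file():
                archive.write(image, arcname=f"{image_dir.name}/{image.name}")
    return zip_path
```

For example, `build_dataset_zip("train.pq", "images", "folder_name.zip")` would produce an archive containing `train.pq` and `images/...`.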

Note

You can have multiple Parquet files in the zip file that you can use as train, validation, and test dataframes:

A train Parquet file needs to follow the format described above
A validation Parquet file needs to follow the same format as a train Parquet file
A test Parquet file needs to follow the same format as a train Parquet file, but does not need a class_id and rle_mask column

The available dataset connectors require the data for an image semantic segmentation experiment to be in a zip file.
Note
To learn how to upload your zip file as your dataset in H2O Hydrogen Torch, see Dataset connectors.
A Parquet file containing the following columns:
- An image column containing the names of the images for the experiment, where each image has an image extension specified
  Note
  - Images can contain a mix of supported image extensions. To learn about supported image extensions, see Supported image extensions for image processing.
  - The names of the image files do not specify the data directory (location of the images in the zip file). You can specify the data directory (data folder) when uploading the dataset or before the dataset is used for an experiment. For more information, see Import dataset settings.
- A class_id column containing the class names of each mask. Each row of the dataset should contain a list of all possible class names
- A rle_mask column containing run-length-encoded (RLE) masks for each class from the class_id column. If there is no mask for a given class, an empty string has to be provided
  Note
  In every row, the class_id and rle_mask lists must have the same length, and that length must equal the total number of classes.
- An optional fold column containing cross-validation fold indexes
  Note
  The fold column can include integers (0, 1, 2, … , N-1 values or 1, 2, 3… , N values) or categorical values.
An image folder that contains all the images specified in the image column; H2O Hydrogen Torch uses the images in this folder to run the image semantic segmentation experiment.
Note
Every image file name must include its image extension. The images can use a mix of supported image extensions. To learn about supported image extensions, see Supported image extensions for image processing.
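
The column requirements above can be checked before uploading. The following is a minimal sketch in plain Python, assuming each row is available as a dictionary keyed by the column names image, class_id, and rle_mask. The helper `check_row` is hypothetical and is not part of H2O Hydrogen Torch; it does not check the optional fold column or whether the extension is a supported one.

```python
from pathlib import PurePosixPath


def check_row(row, n_classes):
    """Return a list of problems found in one dataset row (empty if none)."""
    problems = []

    # The image name must carry an extension, for example ".png"
    if not PurePosixPath(row["image"]).suffix:
        problems.append("image name has no extension")

    class_ids, rle_masks = row["class_id"], row["rle_mask"]

    # Every row lists all classes, and there is one mask entry per class
    if len(class_ids) != n_classes:
        problems.append("class_id does not list every class")
    if len(rle_masks) != len(class_ids):
        problems.append("class_id and rle_mask differ in length")

    # A missing mask is an empty string; otherwise RLE values come in pairs
    for class_name, rle in zip(class_ids, rle_masks):
        if not isinstance(rle, str):
            problems.append(f"mask for {class_name!r} is not a string")
        elif len(rle.split()) % 2:
            problems.append(f"mask for {class_name!r} has an odd number of values")

    return problems


row = {
    "image": "img_0458.png",
    "class_id": ["shoes", "pants"],
    "rle_mask": ["3 2 9 1", ""],  # no "pants" mask in this image
}
print(check_row(row, n_classes=2))
```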

The data following the COCO format for an image semantic segmentation experiment is structured as follows: A zip file (1) containing a JSON file (2) and an image folder (3):

folder_name.zip (1)
│   └───json_name.json (2)
│   │
│   └───image_folder_name (3)
│       └───name_of_image.image_extension
│       └───name_of_image.image_extension
│       └───name_of_image.image_extension
│       ...

Note

You can have multiple JSON files in the zip file that you can use as train, validation, and test datasets:

A train JSON file needs to follow the format described above
A validation JSON file needs to follow the same format as a train JSON file
A test JSON file needs to follow the same format as a train JSON file, but does not require labels

The available dataset connectors require the data for an image semantic segmentation experiment to be in a zip file.
Note
To learn how to upload your zip file as your dataset in H2O Hydrogen Torch, see Dataset connectors.
A JSON file that contains labels in a COCO format.
A folder containing all the images specified in the JSON file; H2O Hydrogen Torch uses the images in this folder during an image semantic segmentation experiment.
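
For orientation, the following sketch shows the general shape of a COCO annotation file with one image, one category, and one polygon annotation. It follows the commonly used COCO layout; which fields H2O Hydrogen Torch actually reads is not stated here, so treat the field list as an assumption. The file name `img_0001.png` and the helper `dangling_annotations` are hypothetical.

```python
import json

coco = {
    "images": [{"id": 1, "file_name": "img_0001.png", "height": 4, "width": 4}],
    "categories": [{"id": 1, "name": "shirt"}],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,  # refers to images[].id
            "category_id": 1,  # refers to categories[].id
            "segmentation": [[0, 0, 2, 0, 2, 2, 0, 2]],  # polygon: x1, y1, x2, y2, ...
            "area": 4.0,
            "bbox": [0, 0, 2, 2],  # x, y, width, height
            "iscrowd": 0,
        }
    ],
}


def dangling_annotations(coco_dict):
    """Return IDs of annotations that point at an unknown image or category."""
    image_ids = {image["id"] for image in coco_dict["images"]}
    category_ids = {category["id"] for category in coco_dict["categories"]}
    return [
        ann["id"]
        for ann in coco_dict["annotations"]
        if ann["image_id"] not in image_ids or ann["category_id"] not in category_ids
    ]


print(json.dumps(coco, indent=2))
print(dangling_annotations(coco))
```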

Hydrogen Torch format

The fashion_image_semantic_segmentation.zip file is a preprocessed dataset in H2O Hydrogen Torch and was formatted following the Hydrogen Torch format to solve an image semantic segmentation problem. The structure of the dataset is as follows:

fashion_image_semantic_segmentation.zip
│   └───train.pq
│   │
│   └───images
│       └───img_0458.png
│       └───img_0604.png
│       └───img_0668.png
│           ...

The following are three random rows from the Parquet file:

image	class_id	rle_mask
img_0458.png	['shoes' 'pants' 'dress' 'coat' 'shirt']	['180629 7 181447 17...
img_0604.png	['shoes' 'pants' 'dress' 'coat' 'shirt']	['189672 2 190493 9...
img_0668.png	['shoes' 'pants' 'dress' 'coat' 'shirt']	['108023 11 108848 11...

Note

In this example, the data directory in the image column is not specified. Therefore, it needs to be specified when uploading the dataset, and the images folder needs to be selected as the value for the Data folder setting. For more information, see Import dataset settings.
To learn how to access one of the preprocessed datasets in H2O Hydrogen Torch, see Demo (preprocessed) datasets.


RLE encoding and decoding functions

from typing import Tuple

import numpy as np


def mask2rle(x: np.ndarray) -> str:
    """
    Converts input masks into RLE-encoded strings.

    Args:
        x: numpy array of shape (height, width), 1 - mask, 0 - background
    Returns:
        RLE string
    """

    pixels = x.T.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    # Use a distinct loop variable so the input array x is not shadowed
    return " ".join(str(run) for run in runs)


def rle2mask(mask_rle: str, shape: Tuple[int, int]) -> np.ndarray:
    """
    Converts RLE-encoded string into the binary mask.

    Args:
        mask_rle: RLE-encoded string
        shape: (height,width) of array to return
    Returns:
        binary mask: 1 - mask, 0 - background
    """

    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1
    ends = starts + lengths
    img = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1
    return img.reshape(shape, order="F")  # Needed to align to RLE direction
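
To make the encoding convention concrete, here is a dependency-free sketch of the same scheme on a 3 x 3 mask: pixels are read column by column, positions are 1-indexed, and the string is a sequence of "start length" pairs. The names `encode_rle` and `decode_rle` are hypothetical stand-ins that work on nested lists; they are meant to mirror `mask2rle()` and `rle2mask()` above, not replace them.

```python
def encode_rle(mask):
    """Encode a binary mask (list of rows) as 'start length' pairs."""
    height, width = len(mask), len(mask[0])
    # Column-major order: walk down each column, left to right
    flat = [mask[r][c] for c in range(width) for r in range(height)]
    runs, start = [], None
    for pos, value in enumerate(flat, start=1):  # 1-indexed positions
        if value and start is None:
            start = pos
        elif not value and start is not None:
            runs += [start, pos - start]
            start = None
    if start is not None:  # a run that reaches the last pixel
        runs += [start, len(flat) + 1 - start]
    return " ".join(str(n) for n in runs)


def decode_rle(rle, shape):
    """Decode 'start length' pairs back into a binary mask of the given shape."""
    height, width = shape
    flat = [0] * (height * width)
    numbers = [int(n) for n in rle.split()]
    for start, length in zip(numbers[0::2], numbers[1::2]):
        for pos in range(start - 1, start - 1 + length):
            flat[pos] = 1
    # Undo the column-major flattening
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]


mask = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
rle = encode_rle(mask)
print(rle)  # 2 1 4 2 9 1
assert decode_rle(rle, (3, 3)) == mask
```

Read column by column, the mask is 0 1 0 | 1 1 0 | 0 0 1, so the runs start at positions 2, 4, and 9 with lengths 1, 2, and 1. An all-background mask encodes to an empty string.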


CSV file with masks to Hydrogen Torch format

import pandas as pd

df = pd.read_csv("/data/train.csv")

# Collapse the table to one row per image; class_id and rle_mask become lists
df = df.groupby(["image_id"]).agg(lambda x: x.to_list()).reset_index()

df[["image_id", "class_id", "rle_mask"]].to_parquet(
    "/data/train.pq", engine="pyarrow", index=False
)
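
The grouping step keeps only the classes that appear in the CSV for each image. If a CSV omits rows for classes that are absent from an image, the resulting lists would be shorter than the total number of classes. The following plain-Python sketch shows one way to pad such rows with empty strings. It assumes the CSV holds one (image_id, class_id, rle_mask) record per mask, and the helper `to_hydrogen_torch_rows` is hypothetical.

```python
from collections import defaultdict


def to_hydrogen_torch_rows(records, class_names):
    """Group per-mask records by image and pad missing classes with ''."""
    masks = defaultdict(dict)
    for image_id, class_id, rle_mask in records:
        masks[image_id][class_id] = rle_mask

    return [
        {
            "image_id": image_id,
            # Every row lists all classes, in the same order
            "class_id": list(class_names),
            # An empty string marks a class with no mask in this image
            "rle_mask": [by_class.get(name, "") for name in class_names],
        }
        for image_id, by_class in masks.items()
    ]


records = [
    ("img_a.png", "shoes", "3 2"),
    ("img_a.png", "shirt", "9 1"),
    ("img_b.png", "pants", "1 4"),
]
rows = to_hydrogen_torch_rows(records, ["shoes", "pants", "shirt"])
for row in rows:
    print(row)
```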


COCO to Hydrogen Torch format

import json

import numpy as np
import pandas as pd
from pycocotools.coco import COCO


def get_semantic_segmentation(df, coco_path):
    coco = COCO(coco_path)

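    # "images" and "categories" are top-level lists in a COCO annotation file
    images = pd.DataFrame(df["images"])
    images = images[["id", "file_name"]].drop_duplicates()
    images.columns = ["image_id", "file_name"]

    categories = pd.DataFrame(df["categories"])
    categories = categories[["id", "name"]].drop_duplicates()
    categories.columns = ["category_id", "name"]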
    # Filter out _background_ class
    categories = categories[categories.name != "_background_"]

    all_labels = [
        pd.DataFrame({"file_name": x, "name": categories.name.unique()})
        for x in images.file_name.unique()
    ]
    all_labels = pd.concat(all_labels)
    all_labels = all_labels.merge(images).merge(categories).reset_index(drop=True)

    rles = []
    for _, row in all_labels.iterrows():
        semantic_annotations = [
            x
            for x in df["annotations"]
            if x["image_id"] == row["image_id"]
            and int(x["category_id"]) == row["category_id"]
        ]

        if len(semantic_annotations) == 0:
            rles.append("")
            continue
        semantic_mask = np.max(
            [coco.annToMask(x) for x in semantic_annotations], axis=0
        )
        # mask2rle() is defined in "Helper functions" section
        rles.append(mask2rle(semantic_mask))

    all_labels["rle_mask"] = rles

    return all_labels


# Read data
train_path = "/data/COCO_train_annos.json"
with open(train_path, "r") as fp:
    train = json.load(fp)

# Parse COCO format
train_ann = get_semantic_segmentation(df=train, coco_path=train_path)

# Prepare the processed dataset
train_ann = train_ann.groupby(["file_name"]).agg(lambda x: x.to_list()).reset_index()
train_ann[["file_name", "name", "rle_mask"]].to_parquet(
    "/data/train.pq", engine="pyarrow", index=False
)
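
In the conversion above, several COCO annotations of the same class in one image are merged into a single semantic mask by taking the element-wise maximum. The following plain-Python sketch illustrates that union step on two small binary masks; the function name `union_masks` is hypothetical.

```python
def union_masks(masks):
    """Element-wise maximum over a list of equally sized binary masks."""
    # zip(*masks) pairs up matching rows; zip(*rows) pairs up matching pixels
    return [[max(values) for values in zip(*rows)] for rows in zip(*masks)]


first_instance = [
    [1, 0],
    [0, 0],
]
second_instance = [
    [0, 0],
    [0, 1],
]
print(union_masks([first_instance, second_instance]))  # [[1, 0], [0, 1]]
```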
