Dataset format: Image object detection
- Formats
- Examples
- Format conversions
H2O Hydrogen Torch supports several dataset (data) formats for an image object detection experiment. Supported formats are as follows:
- Hydrogen Torch format
- Individual boxes format
- COCO format
- Pascal VOC format
The data following the Hydrogen Torch format for an image object detection experiment is structured as follows: A zip file (1) containing a Parquet file (2) and an image folder (3):
folder_name.zip (1)
│
└───pq_name.pq (2)
│
└───image_folder_name (3)
    └───name_of_image.image_extension
    └───name_of_image.image_extension
    └───name_of_image.image_extension
    ...
You can have multiple Parquet files in the zip file that you can use as train, validation, and test dataframes:
- A train Parquet file needs to follow the format described above
- A validation Parquet file needs to follow the same format as a train Parquet file
- A test Parquet file needs to follow the same format as a train Parquet file, but does not require the class_id, x_min, x_max, y_min, and y_max columns
- The available dataset connectors require the data for an image object detection experiment to be in a zip file.
  Note: To learn how to upload your zip file as your dataset in H2O Hydrogen Torch, see Dataset connectors.
- A Parquet file containing the following columns:
  - An image column containing the names of the images for the experiment, where each image name specifies an image extension.
    Note:
    - Images can contain a mix of supported image extensions. To learn about supported image extensions, see Supported image extensions for image processing.
    - The names of the image files do not specify the data directory (the location of the images in the zip file). You can specify the data directory (data folder) when uploading the dataset or before the dataset is used for an experiment. For more information, see Import dataset settings.
  - A class_id column containing the class names of each bounding box. Each row of the dataset should contain a list of class names, where each element in the list refers to a single box.
  - x_min, x_max, y_min, and y_max columns containing the bounding box coordinates that describe the spatial location of the objects. For each column, each row of the dataset should contain a list of coordinates, where each element in the list refers to a single box.
    Note:
    - The bounding box is a rectangle determined by the x and y coordinates of its upper-left and lower-right corners.
    - The lists in the class_id, x_min, x_max, y_min, and y_max columns need to have equal length, matching the total number of bounding boxes in the respective image. If an image has no boxes, all lists need to be empty.
  - An optional fold column containing cross-validation fold indexes.
    Note: The fold column can include integers (0, 1, 2, …, N-1 values or 1, 2, 3, …, N values) or categorical values.
- An image folder that contains all the images specified in the image column; H2O Hydrogen Torch uses the images in this folder to run the image object detection experiment.
  Note: All image file names need to specify an image extension. Images can contain a mix of supported image extensions. To learn about supported image extensions, see Supported image extensions for image processing.
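The list-length rule for the Hydrogen Torch format can be checked before a dataset is packaged. The following is a minimal sketch and not part of H2O Hydrogen Torch: the helper name `check_row` and the dict-based row representation are assumptions made for illustration. The extra check that minimum coordinates do not exceed maximum coordinates is also an assumption, derived from the upper-left and lower-right corner convention.

```python
BOX_COLUMNS = ["class_id", "x_min", "x_max", "y_min", "y_max"]


def check_row(row):
    """Return a list of problems found in one Hydrogen Torch format row.

    `row` is a plain dict mapping column names to values (a hypothetical
    representation; a Parquet row converted to a dict would look similar).
    """
    problems = []
    lengths = {col: len(row[col]) for col in BOX_COLUMNS}
    if len(set(lengths.values())) != 1:
        problems.append(f"list lengths differ: {lengths}")
        return problems
    # Assumed sanity check: the upper-left corner must not lie to the right of
    # or below the lower-right corner
    for i in range(lengths["class_id"]):
        if row["x_min"][i] > row["x_max"][i] or row["y_min"][i] > row["y_max"][i]:
            problems.append(f"box {i} has a min coordinate greater than its max coordinate")
    return problems


# One image with two boxes and one image without boxes (all lists empty)
with_boxes = {
    "image": "a.jpg",
    "class_id": ["wheat", "wheat"],
    "x_min": [311, 276], "x_max": [378, 354],
    "y_min": [43, 83], "y_max": [134, 153],
}
no_boxes = {"image": "b.jpg", "class_id": [], "x_min": [], "x_max": [], "y_min": [], "y_max": []}
broken = dict(with_boxes, x_min=[311])  # one coordinate list is too short

print(check_row(with_boxes))  # no problems
print(check_row(no_boxes))    # no problems
print(check_row(broken))      # reports differing list lengths
```

A check like this can run over every row of a dataframe before it is written to Parquet, so that malformed rows are caught before the upload.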
The data following the individual boxes format for an image object detection experiment is structured as follows: A zip file (1) containing a CSV file (2) and an image folder (3):
folder_name.zip (1)
│
└───csv_name.csv (2)
│
└───image_folder_name (3)
    └───name_of_image.image_extension
    └───name_of_image.image_extension
    └───name_of_image.image_extension
    ...
You can have multiple CSV files in the zip file that you can use as train, validation, and test dataframes:
- A train CSV file needs to follow the format described above
- A validation CSV file needs to follow the same format as a train CSV file
- A test CSV file needs to follow the same format as a train CSV file, but does not require the class_id, x_min, x_max, y_min, and y_max columns
- The available dataset connectors require the data for an image object detection experiment to be in a zip file.
  Note: To learn how to upload your zip file as your dataset in H2O Hydrogen Torch, see Dataset connectors.
- A CSV file containing the following columns:
  - An image column containing the names of the images for the experiment, where each image name specifies an image extension.
    Note:
    - Images can contain a mix of supported image extensions. To learn about supported image extensions, see Supported image extensions for image processing.
    - The names of the image files do not specify the data directory (the location of the images in the zip file). You can specify the data directory (data folder) when uploading the dataset or before the dataset is used for an experiment. For more information, see Import dataset settings.
  - A class_id column containing the class name of each box. Each row of the dataset should contain a single box.
  - x_min, x_max, y_min, and y_max columns containing the bounding box coordinates that describe the spatial location of the objects. For each column, each row of the dataset should contain a single coordinate value for the corresponding bounding box.
    Note:
    - The bounding box is a rectangle determined by the x and y coordinates of its upper-left and lower-right corners.
    - If an image has no boxes, the class_id, x_min, x_max, y_min, and y_max columns should be empty for that image.
  - An optional fold column containing cross-validation fold indexes.
    Note: The fold column can include integers (0, 1, 2, …, N-1 values or 1, 2, 3, …, N values) or categorical values.
- An image folder that contains all the images specified in the image column; H2O Hydrogen Torch uses the images in this folder to run the image object detection experiment.
  Note: All image file names need to specify an image extension. Images can contain a mix of supported image extensions. To learn about supported image extensions, see Supported image extensions for image processing.
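As an illustration of the individual boxes layout, the sketch below writes a small CSV with one box per row, an image without boxes (empty box columns), and an optional fold column. The file names, coordinates, and the choice to assign one fold per image are assumptions for illustration only; the documentation does not prescribe how folds are assigned.

```python
import csv
import io

# Hypothetical annotations: (image, class_id, x_min, y_min, x_max, y_max);
# None marks an image that has no bounding boxes.
boxes = [
    ("a.jpg", "wheat", 311, 43, 378, 134),
    ("a.jpg", "wheat", 276, 83, 354, 153),
    ("b.jpg", None, None, None, None, None),
    ("c.jpg", "wheat", 301, 13, 328, 124),
]
n_folds = 2

# Assign one fold index per image so that all boxes of an image share a fold
# (an assumption about sensible cross-validation, not a documented requirement).
images = sorted({row[0] for row in boxes})
fold_of = {image: i % n_folds for i, image in enumerate(images)}

buffer = io.StringIO()  # stands in for open("train.csv", "w", newline="")
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["image", "class_id", "x_min", "y_min", "x_max", "y_max", "fold"])
for image, *box in boxes:
    # csv.writer writes None as an empty field, which keeps the box columns empty
    writer.writerow([image, *box, fold_of[image]])

csv_text = buffer.getvalue()
print(csv_text)
```

Keeping all rows of an image in the same fold avoids the same image appearing in both the training and validation split of a cross-validation run.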
The data following the COCO format for an image object detection experiment is structured as follows: A zip file (1) containing a JSON file (2) and an image folder (3):
folder_name.zip (1)
│
└───json_name.json (2)
│
└───image_folder_name (3)
    └───name_of_image.image_extension
    └───name_of_image.image_extension
    └───name_of_image.image_extension
    ...
You can have multiple JSON files in the zip file that you can use as train, validation, and test datasets:
- A train JSON file needs to follow the format described above
- A validation JSON file needs to follow the same format as a train JSON file
- A test JSON file needs to follow the same format as a train JSON file, but does not require labels
- The available dataset connectors require the data for an image object detection experiment to be in a zip file.
  Note: To learn how to upload your zip file as your dataset in H2O Hydrogen Torch, see Dataset connectors.
- A JSON file that contains labels in the COCO format.
- A folder containing all the images specified in the JSON file; H2O Hydrogen Torch uses the images in this folder to run the image object detection experiment.
  Note: All image file names need to specify an image extension. Images can contain a mix of supported image extensions. To learn about supported image extensions, see Supported image extensions for image processing.
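For orientation, a COCO-style label file for object detection usually holds three top-level lists (images, categories, and annotations), with each bounding box stored as [x, y, width, height] measured from the top-left corner. The sketch below uses made-up values (file name, IDs, and coordinates are illustrative, and the helper name `bbox_to_corners` is hypothetical) and shows how such a box maps to corner coordinates.

```python
import json

coco = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 1024, "height": 1024}],
    "categories": [{"id": 1, "name": "wheat"}],
    "annotations": [
        # bbox is [x, y, width, height], with (x, y) the top-left corner of the box
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [311, 43, 67, 91]},
    ],
}


def bbox_to_corners(bbox):
    """Convert a COCO [x, y, width, height] box to (x_min, y_min, x_max, y_max)."""
    x, y, width, height = bbox
    return x, y, x + width, y + height


# Round-trip through JSON text, as the labels would be stored in the zip file
labels = json.loads(json.dumps(coco))
corners = bbox_to_corners(labels["annotations"][0]["bbox"])
print(corners)  # (311, 43, 378, 134)
```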
The data following the Pascal VOC format for an image object detection experiment is structured as follows: A zip file (1) containing a folder with XML files with labels (2) and an image folder (3):
folder_name.zip (1)
│
└───xml_folder_name (2)
│   └───name_of_image.xml
│   └───name_of_image.xml
│   └───name_of_image.xml
│
└───image_folder_name (3)
    └───name_of_image.image_extension
    └───name_of_image.image_extension
    └───name_of_image.image_extension
    ...
You can have multiple folders with labels in the zip file that you can use as train, validation, and test datasets:
- A train folder with labels needs to follow the format described above
- A validation folder with labels should have the same format as a train folder
- A test folder with labels should have the same format as a train folder, but labels are not required
- The available dataset connectors require the data for an image object detection experiment to be in a zip file.
  Note: To learn how to upload your zip file as your dataset in H2O Hydrogen Torch, see Dataset connectors.
- A folder that contains XML files with labels in the Pascal VOC format.
- An image folder that contains all the images specified in the XML files; H2O Hydrogen Torch uses the images in this folder to run the image object detection experiment.
  Note: All image file names need to specify an image extension. Images can contain a mix of supported image extensions. To learn about supported image extensions, see Supported image extensions for image processing.
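To make the XML layout concrete, the sketch below parses a minimal Pascal VOC style annotation: an annotation element with a filename and one object element per box, each holding a name and a bndbox with xmin, ymin, xmax, and ymax. The values are made up for illustration, and real annotation files often carry additional fields.

```python
from xml.etree import ElementTree

# Illustrative annotation for a single image with a single box
xml_text = """
<annotation>
    <filename>a.jpg</filename>
    <object>
        <name>wheat</name>
        <bndbox>
            <xmin>312</xmin>
            <ymin>44</ymin>
            <xmax>378</xmax>
            <ymax>134</ymax>
        </bndbox>
    </object>
</annotation>
"""

root = ElementTree.fromstring(xml_text.strip())
file_name = root.findtext("filename")

# Collect one (class name, xmin, ymin, xmax, ymax) tuple per object element
boxes = []
for obj in root.findall("object"):
    bndbox = obj.find("bndbox")
    boxes.append(
        (
            obj.findtext("name"),
            int(bndbox.findtext("xmin")),
            int(bndbox.findtext("ymin")),
            int(bndbox.findtext("xmax")),
            int(bndbox.findtext("ymax")),
        )
    )

print(file_name, boxes)
```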
- Hydrogen Torch format
- Individual boxes format
The global_wheat_image_object_detection.zip file is a preprocessed dataset in H2O Hydrogen Torch that follows the Hydrogen Torch format for an image object detection problem. The structure of the zip file is as follows:
global_wheat_image_object_detection.zip
│
└───train.pq
│
└───images
    └───7cca65c2cfb161be75fa41b754ef5263ee10e679dc8900f1fa75f845899abafc.jpg
    └───3c6154081943882478110d2ea7ad0eef89cd954b6bd290d161385f9a5accc2fd.jpg
    └───37a8db49093fd08a3be9ce48bbfb1a697b5da8dd51ac9fa53fc28d924888ace8.jpg
    ...
The following are three random rows from the Parquet file:
image | class_id | x_min | y_min | x_max | y_max |
---|---|---|---|---|---|
7cca65c2cfb161be75fa41b754ef5263ee10e679dc8900f1fa75f845899abafc.jpg | ['wheat' 'wheat' 'wheat' ...] | [689 718 382 ...] | [884 464 42 ...] | [754 768 450 ...] | [920 516 101 ...] |
3c6154081943882478110d2ea7ad0eef89cd954b6bd290d161385f9a5accc2fd.jpg | ['wheat' 'wheat' 'wheat' ...] | [924 698 904 ...] | [195 10 32 ...] | [981 763 938 ...] | [247 101 79 ...] |
37a8db49093fd08a3be9ce48bbfb1a697b5da8dd51ac9fa53fc28d924888ace8.jpg | ['wheat' 'wheat' 'wheat' ...] | [919 811 4 ...] | [535 820 96 ...] | [1024 912 71 ...] | [613 894 164 ...] |
- In this example, the data directory in the image column is not specified. Therefore, it needs to be specified when uploading the dataset, and the images folder needs to be selected as the value for the Data folder setting. For more information, see Import dataset settings.
- To learn how to access one of the preprocessed datasets in H2O Hydrogen Torch, see Demo (preprocessed) datasets.
The following rows show example data in the individual boxes format, with one bounding box per row:
image | x_min | y_min | x_max | y_max | class_id |
---|---|---|---|---|---|
bafc.jpg | 311 | 43 | 378 | 134 | wheat |
bafc.jpg | 276 | 83 | 354 | 153 | wheat |
bafc.jpg | 442 | 309 | 541 | 381 | wheat |
cryv.jpg | 301 | 13 | 328 | 124 | wheat |
cryv.jpg | 246 | 80 | 344 | 113 | wheat |
cryv.jpg | 432 | 303 | 341 | 181 | wheat |
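To see how the two example formats relate, the rows of an individual boxes table can be collapsed into one row per image, with a list per column. The sketch below does this with the standard library only, using three rows copied from the table above; it is a dependency-free illustration of the idea, not the conversion H2O Hydrogen Torch performs.

```python
from collections import defaultdict

# Rows copied from the individual boxes example:
# (image, x_min, y_min, x_max, y_max, class_id)
rows = [
    ("bafc.jpg", 311, 43, 378, 134, "wheat"),
    ("bafc.jpg", 276, 83, 354, 153, "wheat"),
    ("cryv.jpg", 301, 13, 328, 124, "wheat"),
]
columns = ["x_min", "y_min", "x_max", "y_max", "class_id"]

# One dict of lists per image; every box appends one element to each list
grouped = defaultdict(lambda: {col: [] for col in columns})
for image, *values in rows:
    for col, value in zip(columns, values):
        grouped[image][col].append(value)

for image, lists in grouped.items():
    print(image, lists)
```

Each image ends up with lists of equal length, one element per box, which is the shape the Hydrogen Torch format expects.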
Individual Boxes to Hydrogen Torch format
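import pandas as pd

# Read the individual boxes data (one bounding box per row).
# This example assumes the image column of the CSV file is named image_id.
df = pd.read_csv("/data/train.csv")

# Prepare the processed dataset: collapse the rows of each image into one
# list per column; an image without boxes gets empty lists
df = df.groupby(["image_id"]).agg(lambda x: [] if pd.isnull(x).all() else x.to_list()).reset_index()
df[["image_id", "class_id", "x_min", "y_min", "x_max", "y_max"]].to_parquet(
    "/data/train.pq", engine="pyarrow", index=False
)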
COCO to Hydrogen Torch format
import json

import pandas as pd


def get_object_detection(df):
    """Flatten a parsed COCO dictionary into one row per bounding box."""
    images = pd.DataFrame(df["images"])
    categories = pd.DataFrame(df["categories"])
    annotations = pd.DataFrame(df["annotations"])

    # COCO stores each box as [x, y, width, height]; convert to corner coordinates
    annotations["x_min"] = annotations["bbox"].map(lambda x: x[0]).astype(int)
    annotations["y_min"] = annotations["bbox"].map(lambda x: x[1]).astype(int)
    annotations["x_max"] = annotations["bbox"].map(lambda x: x[0] + x[2]).astype(int)
    annotations["y_max"] = annotations["bbox"].map(lambda x: x[1] + x[3]).astype(int)
    annotations = annotations[
        ["image_id", "category_id", "x_min", "y_min", "x_max", "y_max"]
    ].copy()
    annotations["category_id"] = annotations["category_id"].astype(int)

    # Attach the class name of each box
    annotations = annotations.merge(
        categories[["id", "name"]].drop_duplicates(), left_on="category_id", right_on="id", how="left"
    )
    # Attach the image file name; the right join keeps images without boxes
    annotations = annotations.merge(
        images[["id", "file_name"]].drop_duplicates(), left_on="image_id", right_on="id", how="right"
    )
    annotations.drop(["id_x", "id_y", "image_id"], axis=1, inplace=True)
    return annotations


# Read data
with open("/data/COCO_train_annos.json", "r") as fp:
    train = json.load(fp)

# Parse COCO format
train_ann = get_object_detection(train)

# Prepare the processed dataset: one row per image, one list per column
train_ann = train_ann.groupby(["file_name"]).agg(lambda x: [] if pd.isnull(x).all() else x.to_list()).reset_index()
train_ann[["file_name", "name", "x_min", "y_min", "x_max", "y_max"]].to_parquet(
    "/data/train.pq", engine="pyarrow", index=False
)
Pascal VOC to Hydrogen Torch format
import glob
from xml.etree import ElementTree

import pandas as pd
from tqdm import tqdm

observations = []
for xml in tqdm(glob.glob("/data/Annotations/*.xml")):
    tree = ElementTree.parse(xml)
    root = tree.getroot()
    objs = root.findall("object")
    # XML files without object elements contribute no rows
    for obj in objs:
        name = obj.find("name").text
        bndbox = obj.find("bndbox")
        # Pascal VOC pixel indexes start at 1, so the upper-left corner is shifted by one
        xmin = float(bndbox.findtext("xmin")) - 1
        ymin = float(bndbox.findtext("ymin")) - 1
        xmax = float(bndbox.findtext("xmax"))
        ymax = float(bndbox.findtext("ymax"))
        # Prefer the file name from the path element; fall back to the filename element
        try:
            img_name = root.findall("path")[0].text.split("/")[-1]
        except Exception:
            img_name = root.findall("filename")[0].text
        observations.append(
            (
                img_name,
                name,
                xmin,
                ymin,
                xmax,
                ymax,
            )
        )

df = pd.DataFrame(
    observations, columns=["image", "class_id", "x_min", "y_min", "x_max", "y_max"]
)

# Prepare the processed dataset: one row per image, one list per column
df = df.groupby(["image"]).agg(lambda x: [] if pd.isnull(x).all() else x.to_list()).reset_index()
df.to_parquet("/data/train.pq", engine="pyarrow", index=False)
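Because the dataset connectors expect a zip file, a converted Parquet file still has to be packaged together with its image folder. The sketch below shows one way to do this with the standard library; the function name `package_dataset`, the default file and folder names, and the placeholder files in the demo are all assumptions made for illustration.

```python
import tempfile
import zipfile
from pathlib import Path


def package_dataset(data_dir, zip_path, label_file="train.pq", image_folder="images"):
    """Zip one label file and one image folder (names are illustrative defaults)."""
    data_dir = Path(data_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Label file at the top level of the archive
        archive.write(data_dir / label_file, arcname=label_file)
        # Images inside their folder, next to the label file
        for image in sorted((data_dir / image_folder).iterdir()):
            archive.write(image, arcname=f"{image_folder}/{image.name}")
    return zip_path


# Demo with placeholder files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "images").mkdir()
    (tmp / "images" / "a.jpg").write_bytes(b"placeholder")
    (tmp / "train.pq").write_bytes(b"placeholder")
    zip_path = package_dataset(tmp, tmp / "dataset.zip")
    with zipfile.ZipFile(zip_path) as archive:
        names = sorted(archive.namelist())

print(names)  # ['images/a.jpg', 'train.pq']
```

The resulting archive mirrors the layout described for the Hydrogen Torch format: a label file and an image folder at the top level of the zip file.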