Tutorial 1B: Annotation task: Image classification
Overview
This tutorial describes how to specify an annotation task rubric and annotate a dataset for an image classification annotation task. To illustrate the process, we annotate a dataset that contains images of cars and coffee. The tutorial also briefly shows how to download the fully annotated dataset in a format that H2O Hydrogen Torch supports.
Step 1: Explore dataset
We are going to use the preloaded car-or-coffee-demo dataset for this tutorial. The dataset contains 40 images, each depicting a car or coffee. Let's quickly explore the dataset.
- On the H2O Label Genie navigation menu, click Datasets.
- In the datasets table, click car-or-coffee-demo.
Step 2: Create an annotation task
Now that we have seen the dataset, let's create an annotation task that enables you to annotate it. An annotation task refers to the process of labeling data. In this tutorial, the image classification annotation task consists of assigning one or more categorical target labels to each input image. Let's create an annotation task.
- Click New annotation task.
- In the Task name box, enter tutorial-1b.
- In the Task description box, enter Annotate a dataset containing images of cars and coffee.
- In the Select task list, select Classification.
- Click Create task.
Step 3: Specify annotation task rubric
Before we can start annotating our dataset, we need to specify an annotation task rubric. An annotation task rubric refers to the labels (for example, object classes) you want to use when annotating your dataset. For our dataset, we want to specify two categorical target labels: car and coffee. Let's define the annotation task rubric.
- In the New class name box, enter Car.
- Click Add.
- Click Add class.
- In the New class name box, enter Coffee.
- Click Add.
- Click Continue to annotate.
H2O Label Genie supports multi-label image classification annotation tasks.
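To make the multi-label idea concrete, the following minimal sketch encodes a set of assigned labels as a 0/1 vector ordered like the rubric. It is an illustration only: the names RUBRIC and to_multi_hot are hypothetical, and this is not a description of how H2O Label Genie stores annotations internally.

```python
# Hypothetical illustration of a rubric and a multi-label annotation.
# This is NOT H2O Label Genie's internal format.
RUBRIC = ["Car", "Coffee"]


def to_multi_hot(labels, rubric=RUBRIC):
    """Encode a set of assigned labels as a 0/1 vector ordered like the rubric."""
    unknown = set(labels) - set(rubric)
    if unknown:
        raise ValueError(f"labels not in rubric: {sorted(unknown)}")
    return [1 if name in labels else 0 for name in rubric]


# A single-label annotation sets exactly one position.
car_only = to_multi_hot({"Car"})
# A multi-label annotation can set several positions at once.
car_and_coffee = to_multi_hot({"Car", "Coffee"})
```

In a single-label task every image gets exactly one label, whereas a multi-label task allows any subset of the rubric, including both Car and Coffee for the same image.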
Step 4: Annotate dataset
Now that we have specified the annotation task rubric, let's annotate the dataset. In the Annotate tab, you can individually annotate each image in the dataset. Let's annotate the first image.
- A zero-shot learning model is on by default when you annotate an image classification annotation task. The model speeds up annotation (labeling) by providing, for each label created in the Rubric tab, the percentage probability that the image belongs to that label (in this case, car or coffee).
  You can start annotating in the Annotate tab immediately or wait until the zero-shot model is ready to provide annotation suggestions. H2O Label Genie notifies you to Refresh the instance when zero-shot predictions (suggestions) are available.
  For example, after refreshing the instance in this tutorial, the model provided probabilities for each label.
  Note:
  - To learn about the model used for an image classification annotation task, see Zero-shot learning models: Image classification.
  - During the annotation process of an image classification dataset, you can download the generated zero-shot predictions (probabilities) in the Export tab. To download all generated zero-shot predictions:
    - Click the Export tab.
    - In the Export zero-shot predictions list, select Download ZIP.
  Caution:
  - If the Enable zero-shot predictions setting is turned Off, the zero-shot learning model is not available during the annotation process, and no zero-shot predictions are generated. To turn On the setting, see Enable zero-shot predictions.
  - The time it takes H2O Label Genie to generate zero-shot predictions depends on the computational resources of the instance.
- Click Save and next.
  Note:
  - Save and next saves the annotated image.
  - To skip an image and annotate it later, click Skip. Skipped images (samples) reappear after all non-skipped images are annotated.
  - To download all samples annotated so far:
    - Click the Export tab.
    - In the Export approved samples list, select Download ZIP.
    H2O Label Genie downloads a ZIP file containing the annotated dataset in a format that H2O Hydrogen Torch supports. To learn more, see Downloaded dataset formats: Image classification.
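As a rough illustration of how per-label percentage probabilities can guide an annotator, the sketch below ranks labels and keeps those at or above a cutoff. Everything here is an assumption made for illustration: the function suggest_labels, the 50% threshold, and the probability values are made up, and H2O Label Genie itself only displays the probabilities and leaves the decision to you.

```python
def suggest_labels(probabilities, threshold=50.0):
    """Return labels whose percentage probability is >= threshold, highest first.

    `probabilities` maps a label name to a percentage in [0, 100].
    The threshold is an arbitrary choice for this sketch.
    """
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [label for label, pct in ranked if pct >= threshold]


# Made-up probabilities for a single image; not real model output.
example = {"Car": 97.2, "Coffee": 2.8}
print(suggest_labels(example))  # ['Car']
```

Because the task can be multi-label, the function returns a list rather than a single winner, so an image can receive more than one suggested label, or none at all.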
Download annotated dataset
After annotating all the images, you can download the dataset in a format that H2O Hydrogen Torch supports. Let's download the annotated dataset.
- In the Annotate tab, click Export approved samples.
- In the Export approved samples list, select Download ZIP.
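After the download finishes, you may want to check what the ZIP file contains before importing it elsewhere. The following sketch uses only Python's standard zipfile module to count archive members by file extension without extracting them. The helper name summarize_zip and the file name in the usage comment are hypothetical, and the sketch makes no assumption about the archive's layout; for the actual layout, see Downloaded dataset formats: Image classification.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_zip(zip_path):
    """Count the files in a ZIP archive by extension, without extracting them."""
    with zipfile.ZipFile(zip_path) as archive:
        # Skip directory entries; keep only real files.
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
    # Lowercase the suffix so ".JPG" and ".jpg" are counted together.
    return Counter(PurePosixPath(name).suffix.lower() or "(none)" for name in names)


# Usage (hypothetical file name; use the name of the file you downloaded):
# print(summarize_zip("tutorial-1b.zip"))
```

A quick count like this helps confirm that the number of image files matches the number of samples you annotated.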
Summary
In this tutorial, we learned how to specify an annotation task rubric and annotate a dataset for an image classification annotation task. We also learned how to download the fully annotated dataset in a format that H2O Hydrogen Torch supports.
Next
To learn how to specify an annotation task rubric and annotate datasets for other annotation tasks in computer vision (CV), natural language processing (NLP), and audio, see Tutorials.