Version: v0.4.0

Zero-shot learning models

Overview

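H2O Label Genie lets you use zero-shot learning models to accelerate the labeling process. In particular, you can use a zero-shot learning model for the following supported annotation tasks:

  • Image classification
  • Object detection
  • Image instance segmentation
  • Text classification
  • Text summarization
  • Text-generative AI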

What are zero-shot learning models?

Labeled data is crucial for supervised learning problem types in computer vision (CV), natural language processing (NLP), and audio. High-quality labeled data usually requires extensive manual labeling, which can lead to high costs and delay production or execution.

One way to accelerate the labeling process is to utilize zero-shot learning models. These models let data scientists label unlabeled data with high accuracy and speed. Zero-shot learning models are pre-trained models that have been trained on a large and diverse set of classes. As a result, they can draw on this prior knowledge to label unlabeled data without additional task-specific training.

What are zero-shot learning model predictions?

The labels or suggested labels that a zero-shot learning model provides for a given sample are called zero-shot learning model predictions. For example, for an image or text classification annotation task, H2O Label Genie, with a zero-shot learning model activated, offers the percentage probability that an image or text belongs to a certain label (class). For an object detection annotation task, it populates an image with bounding boxes that capture the desired objects (for example, a car).

Annotation tasks + zero-shot learning models

Image classification

By default, H2O Label Genie utilizes the OpenCLIP zero-shot learning model for image classification annotation tasks. OpenCLIP is an open-source implementation of OpenAI's Contrastive Language-Image Pre-training (CLIP).
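To illustrate how a CLIP-style model can turn an image into per-label percentages, the following is a minimal sketch, not H2O Label Genie's implementation. It assumes the image and each text prompt have already been embedded into the same vector space. The function names, the toy 3-dimensional embeddings, the prompt wording, and the `logit_scale` value of 100 are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(
    image_embedding: list[float],
    prompt_embeddings: dict[str, list[float]],
    logit_scale: float = 100.0,  # assumed temperature; real models learn this value
) -> dict[str, float]:
    """Return a percentage per prompt from scaled cosine similarities."""
    prompts = list(prompt_embeddings)
    logits = [
        logit_scale * cosine_similarity(image_embedding, prompt_embeddings[p])
        for p in prompts
    ]
    return {p: 100.0 * prob for p, prob in zip(prompts, softmax(logits))}


# Toy embeddings (hypothetical values, not real model output).
image = [0.9, 0.1, 0.2]
prompts = {
    "a photo of a car": [1.0, 0.0, 0.1],
    "a photo of a dog": [0.0, 1.0, 0.2],
}

predictions = zero_shot_classify(image, prompts)
best = max(predictions, key=predictions.get)
print(best, round(predictions[best], 1))
```

Because the labels are expressed as text prompts, new classes can be added by writing new prompts rather than by retraining the model.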

Object detection

By default, H2O Label Genie utilizes the Detic zero-shot learning model for object detection annotation tasks.

Image instance segmentation

By default, H2O Label Genie utilizes the Detic zero-shot learning model for image instance segmentation annotation tasks.

Text classification

By default, H2O Label Genie utilizes the bart-large-mnli zero-shot learning model for text classification annotation tasks.
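Models such as bart-large-mnli are trained for natural language inference (NLI), and a common way to use them for zero-shot classification is to turn each candidate label into a hypothesis and score how strongly the text entails it. The sketch below shows that idea under stated assumptions; it is not H2O Label Genie's implementation. The hypothesis template, the function names, and `toy_entailment_logit` (a keyword-overlap stand-in for a real NLI model) are hypothetical.

```python
import math
from typing import Callable


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_text_classify(
    text: str,
    candidate_labels: list[str],
    entailment_logit: Callable[[str, str], float],
    hypothesis_template: str = "This example is {}.",  # assumed template
) -> list[tuple[str, float]]:
    """Rank labels by the softmax of their entailment logits."""
    hypotheses = [hypothesis_template.format(label) for label in candidate_labels]
    logits = [entailment_logit(text, hypothesis) for hypothesis in hypotheses]
    probabilities = softmax(logits)
    return sorted(
        zip(candidate_labels, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )


def toy_entailment_logit(premise: str, hypothesis: str) -> float:
    """Stand-in scorer: counts hypothesis words that appear in the premise.

    A real NLI model would return a learned entailment logit instead.
    """
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().rstrip(".").split()
    return float(sum(word in premise_words for word in hypothesis_words))


ranked = zero_shot_text_classify(
    "The refund for my order has not arrived",
    ["refund", "shipping"],
    toy_entailment_logit,
)
for label, probability in ranked:
    print(f"{label}: {probability:.1%}")
```

Passing the scorer in as a function keeps the ranking logic separate from the model, so the toy scorer can be swapped for a call to an actual NLI model.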

Text summarization

H2O Label Genie allows you to utilize the following zero-shot learning models for text summarization annotation tasks:

note
  • OpenAI models: To utilize an OpenAI model, you need to define the OpenAI API settings linking to your OpenAI account. To learn more, see OpenAI API settings.
  • Select a particular model: To learn how to select a particular zero-shot learning model for a text summarization annotation task, see Select a zero-shot learning model.

Text-generative AI

H2O Label Genie allows you to utilize the following zero-shot learning models from the h2oGPT family for text-generative AI annotation tasks:

  • h2oGPT Llama2 7B
  • h2oGPT Llama2 13B
  • h2oGPT Llama2 70B
  • h2oGPT custom model
    note

    You can utilize one of your custom h2oGPT models for a text-generative AI annotation task. To learn more, see Custom model URL.

  • OpenAI model (GPT-3.5 and GPT-4)
    note

    To utilize an OpenAI model, you need to define the OpenAI API settings linking to your OpenAI account. To learn more, see OpenAI API settings.

note

Select a particular model: To learn how to select a particular zero-shot learning model for a text-generative AI annotation task, see Select a zero-shot learning model.

Select a zero-shot learning model

You can select the zero-shot learning model, which in this case is a large language model (LLM), only for a Text summarization or Text-generative AI annotation task.

caution

The instructions below assume you have already created a text-generative AI or text summarization annotation task. To learn how to create an annotation task, see Create an annotation task.

  1. In the H2O Label Genie navigation menu, click Annotation tasks.
  2. In the Annotation tasks table, click the name of the annotation task you want to define a zero-shot learning model for.
  3. Click the Rubric tab.
  4. Depending on the annotation task, consider the following instructions:
    1. In the Select model list, select a zero-shot learning model.
      note

      To learn about available models, see Text summarization.

    2. Click Edit.
    3. In the Max target length box, enter a value for the text summary token length.
    4. Click Apply.
