Zero-shot learning models
Overview
H2O Label Genie lets you use zero-shot learning models to accelerate the labeling process. Zero-shot learning models are available for the following annotation tasks:
- Image classification
- Object detection
- Image instance segmentation
- Text classification
- Text summarization
- Text-generative AI
What are zero-shot learning models?
Labeled data is crucial for supervised learning problem types in computer vision (CV), natural language processing (NLP), and audio. High-quality labeled data usually requires a lot of manual labeling that can lead to high costs and delay production or execution.
One way to accelerate the labeling process is to utilize zero-shot learning models. These models let data scientists label unlabeled data quickly and, in many cases, with high accuracy. Zero-shot learning models are pre-trained models that have been trained on a large and diverse set of classes. As a result, they can draw on this prior knowledge to label unlabeled data without being trained on your dataset.
What are zero-shot learning model predictions?
The labels or suggested labels that a zero-shot learning model provides for a given sample are called zero-shot learning model predictions. For example, for an image or text classification annotation task, H2O Label Genie, with a zero-shot learning model activated, offers a percentage probability of an image or text belonging to a certain label (class). For an object detection annotation task, it populates an image with bounding boxes that capture the desired objects (for example, a car).
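The two kinds of predictions described above can be pictured with a small Python sketch. The `BoxPrediction` structure, the helper names, and the 0.5 confidence threshold are hypothetical and chosen for illustration only; they are not H2O Label Genie's internal format or API.

```python
from dataclasses import dataclass


@dataclass
class BoxPrediction:
    """One suggested bounding box (hypothetical structure, pixel coordinates)."""
    label: str
    score: float  # model confidence in [0, 1]
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def format_class_suggestions(probabilities):
    """Render class probabilities as percentage strings, most likely first."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [f"{label}: {prob * 100:.1f}%" for label, prob in ranked]


def keep_boxes(boxes, wanted_label, min_score=0.5):
    """Keep boxes for the desired object whose confidence meets the threshold."""
    return [b for b in boxes if b.label == wanted_label and b.score >= min_score]


# Classification: one probability per candidate label (made-up values).
suggestions = format_class_suggestions({"cat": 0.07, "car": 0.91, "dog": 0.02})

# Object detection: several candidate boxes per image (made-up values).
boxes = [
    BoxPrediction("car", 0.88, 34, 50, 210, 160),
    BoxPrediction("car", 0.31, 300, 40, 360, 90),
    BoxPrediction("person", 0.95, 220, 30, 260, 150),
]
kept = keep_boxes(boxes, "car")
```

In this sketch, `suggestions` lists the labels from most to least likely, and `kept` contains only the high-confidence car box. An annotator would still review each suggestion before accepting it.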
Annotation tasks + zero-shot learning models
Image classification
By default, H2O Label Genie utilizes the OpenCLIP zero-shot learning model for image classification annotation tasks. OpenCLIP is an adaptation of OpenAI's Contrastive Language-Image Pre-training (CLIP).
- To learn more, see Open-clip
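As a rough illustration of how a CLIP-style model turns an image and a set of candidate labels into probabilities, the sketch below compares an image embedding with one text embedding per label and applies a softmax. The embedding vectors are made up, the function names are hypothetical, and the scale of 100 is an assumption that mirrors the logit scale commonly reported for CLIP models. It does not call OpenCLIP or H2O Label Genie.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_probabilities(image_embedding, label_embeddings, scale=100.0):
    """Map each label prompt to the probability that it matches the image."""
    labels = list(label_embeddings)
    similarities = [
        scale * cosine_similarity(image_embedding, label_embeddings[label])
        for label in labels
    ]
    return dict(zip(labels, softmax(similarities)))


# Made-up 3-dimensional embeddings; real models use hundreds of dimensions.
image_vec = [0.9, 0.1, 0.2]
label_vecs = {
    "a photo of a car": [0.8, 0.2, 0.1],
    "a photo of a dog": [0.1, 0.9, 0.3],
}
probs = zero_shot_probabilities(image_vec, label_vecs)
best = max(probs, key=probs.get)
```

Because the labels are expressed as text prompts, new classes can be added by supplying new prompts, with no retraining of the model.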
Object detection
By default, H2O Label Genie utilizes the Detic zero-shot learning model for object detection annotation tasks.
- To learn more, see Detecting Twenty-thousand Classes using Image-level Supervision
Image instance segmentation
By default, H2O Label Genie utilizes the Detic zero-shot learning model for image instance segmentation annotation tasks.
- To learn more, see Detecting Twenty-thousand Classes using Image-level Supervision
Text classification
By default, H2O Label Genie utilizes the bart-large-mnli zero-shot learning model for text classification annotation tasks.
- To learn more, see Bart-large-mnli
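Models such as bart-large-mnli are natural language inference (NLI) models, and a common way to use them for zero-shot classification is to turn each candidate label into a hypothesis sentence and compare how strongly the text entails each one. The sketch below shows only that framing. The hypothesis template follows the default commonly used with this model, while the entailment logits are made-up numbers and the function names are hypothetical. It does not run the model or reflect how H2O Label Genie calls it.

```python
import math


def build_hypotheses(labels, template="This example is {}."):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return {label: template.format(label) for label in labels}


def scores_from_entailment(entailment_logits):
    """Softmax the per-label entailment logits into label probabilities."""
    peak = max(entailment_logits.values())
    exps = {label: math.exp(v - peak) for label, v in entailment_logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


hypotheses = build_hypotheses(["travel", "cooking"])

# Hypothetical entailment logits an NLI model might produce when the text
# "One day I will see the world" is paired with each hypothesis above.
nli_scores = scores_from_entailment({"travel": 3.2, "cooking": -1.5})
top_label = max(nli_scores, key=nli_scores.get)
```

The label whose hypothesis is most strongly entailed becomes the top suggestion, and the remaining probabilities rank the alternatives.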
Text summarization
H2O Label Genie allows you to utilize the following zero-shot learning models for text summarization annotation tasks:
- Bart-large-cnn
  - To learn more, see Bart-large-cnn
- Distilbart-cnn-12-6
  - To learn more, see Distilbart-cnn-12-6
- Pegasus-large
  - To learn more, see Pegasus-large
- h2oGPT
  - To learn more, see h2oGPT Llama2 70B
- OpenAI GPT-3.5
  - To learn more, see GPT-3.5
- OpenAI GPT-4
  - To learn more, see GPT-4
- OpenAI models: To utilize an OpenAI model, you need to define the OpenAI API settings linking to your OpenAI account. To learn more, see OpenAI API settings
- Select a particular model: To learn how to select a particular zero-shot learning model for a text summarization annotation task, see Select a zero-shot learning model
Text-generative AI
H2O Label Genie allows you to utilize the following zero-shot learning models from the h2oGPT family for text-generative AI annotation tasks:
- h2oGPT Llama2 7B
- To learn more about the model, see h2oGPT Llama2 7B
- h2oGPT Llama2 13B
- To learn more about the model, see h2oGPT Llama2 13B
- h2oGPT Llama2 70B
- To learn more about the model, see h2oGPT Llama2 70B
- h2oGPT custom model
  Note: You can utilize one of your custom h2oGPT models for a text-generative AI annotation task. To learn more, see Custom model URL.
- OpenAI model (GPT-3.5 and GPT-4)
  Note: To utilize an OpenAI model, you need to define the OpenAI API settings linking to your OpenAI account. To learn more, see OpenAI API settings.
Select a particular model: To learn how to select a particular zero-shot learning model for a text-generative AI annotation task, see Select a zero-shot learning model.
Select a zero-shot learning model
You can select the zero-shot learning model, which is a large language model (LLM), only for text summarization and text-generative AI annotation tasks.
The following instructions assume you have already created a text-generative AI or text summarization annotation task. To learn how to create an annotation task, see Create an annotation task.
- In the H2O Label Genie navigation menu, click Annotation tasks.
- In the Annotation tasks table, click the name of the annotation task you want to define a zero-shot learning model for.
- Click the Rubric tab.
- Depending on the annotation task, consider the following instructions:
  Text summarization
  - In the Select model list, select a zero-shot learning model.
    Note: To learn about available models, see Text summarization.
  - Click Edit.
  - In the Max target length box, enter a value for the text summary token length.
  - Click Apply.
  Text-generative AI
  - In the Select model family list, select a large language model (LLM) family.
  - Click Edit.
  - In the Model name for large language model list, select (or enter) a model name for the annotation task.
  - In the Max response tokens box, enter the maximum number of tokens for a response.
  - In the Temperature box, enter a temperature value.
  - In the Repetition penalty box, enter the penalty value for tokens frequently reappearing in the text (response).
    Note: To learn about the LLM parameter settings, see Large language model (LLM) parameters.
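To give a feel for what the Temperature and Repetition penalty settings do, the sketch below applies both to a toy set of next-token scores (logits). It uses one common formulation, in which logits of already-generated tokens are divided by the penalty when positive and multiplied by it when negative, and all logits are then divided by the temperature. Whether the models behind H2O Label Genie use exactly this formulation is an assumption. The function names and values are hypothetical.

```python
import math


def apply_repetition_penalty(logits, generated_ids, penalty):
    """Lower the scores of tokens that already appear in the response."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        value = adjusted[token_id]
        adjusted[token_id] = value / penalty if value > 0 else value * penalty
    return adjusted


def next_token_distribution(logits, generated_ids, temperature=1.0,
                            repetition_penalty=1.0):
    """Return next-token probabilities after penalty and temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    penalized = apply_repetition_penalty(logits, generated_ids, repetition_penalty)
    scaled = [value / temperature for value in penalized]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # toy scores for a three-token vocabulary

plain = next_token_distribution(logits, generated_ids=[])
# Token 0 was already generated, so a penalty above 1 makes it less likely.
penalized = next_token_distribution(logits, generated_ids=[0], repetition_penalty=2.0)
# A temperature below 1 sharpens the distribution toward the top token.
sharp = next_token_distribution(logits, generated_ids=[], temperature=0.5)
```

In this toy setup, a higher repetition penalty shifts probability away from tokens that were already used, while a lower temperature makes responses more deterministic. The Max response tokens setting is not modeled here; it only caps how many such sampling steps are taken.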