Create a clustering task
Overview
H2O Label Genie enables you to explore an image or text dataset through a clustering task. A clustering task refers to finding and exploring groups in a dataset.
Note: To learn about supported clustering tasks, see Supported clustering tasks.
Instructions
To explore a dataset through a clustering task, follow these steps:
- On the H2O Label Genie navigation menu, click Data exploration.
- Click New clustering task.
- In the Task name box, enter a name for the clustering task.
- In the Task description box, enter a description for the clustering task.
- In the Select dataset list, select the image or text dataset you want to explore.
- If the data type of the selected dataset is text, in the Select text column list, select the column that contains the text data.
- In the Number of clusters box, enter the number of clusters to be used by the clustering algorithm.
- In the Type list, select a clustering algorithm for the clustering task.
Note: H2O Label Genie supports Gaussian mixture and K-means clustering for image and text datasets. Clustering is performed on data embeddings generated with the OpenCLIP model. OpenCLIP is an open-source implementation of OpenAI's Contrastive Language-Image Pre-training (CLIP). To learn more about OpenCLIP, see OpenCLIP.
- Click Start clustering.
Note:
- Several tabs appear when you view a clustering task or right after you create one. To learn more, see Clustering task tabs.
- Once clustering results are available, you can download the cluster labels from the Gallery and Map tabs. To learn more, see Download a clustering task's cluster labels.
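For readers who want an intuition for what the Number of clusters and Type settings control, the following is a minimal sketch of K-means clustering over embedding vectors. It is not H2O Label Genie's implementation: the function name `kmeans`, the variable `embeddings`, and the tiny hand-written vectors are all hypothetical stand-ins for the embeddings the application generates with OpenCLIP.

```python
import math
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain K-means (Lloyd's algorithm): returns one cluster label per vector."""
    rng = random.Random(seed)
    # Start from k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Hypothetical toy "embeddings": two well-separated groups in 3 dimensions.
embeddings = [
    [0.9, 0.1, 0.0], [1.0, 0.0, 0.1], [0.8, 0.2, 0.1],
    [0.0, 0.9, 1.0], [0.1, 1.0, 0.9], [0.2, 0.8, 1.0],
]

# k plays the role of the "Number of clusters" setting.
labels = kmeans(embeddings, k=2)
print(labels)
```

In this sketch, the first three vectors end up sharing one label and the last three share the other. The label numbers themselves are arbitrary and depend on the random starting centroids.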