Download an annotated dataset (approved samples)
Overview
At any point in an annotation task, you can download the samples that have already been annotated and approved. You do not need to annotate the entire imported dataset first. H2O Label Genie downloads the annotated dataset (approved samples) in a format that H2O Hydrogen Torch supports.
Instructions
To download an annotated dataset (approved samples), follow these steps:
- On the H2O Label Genie navigation menu, click Annotation tasks.
- In the annotation tasks table, double-click the row where the annotation task you want to download is located.
- Click the Export tab.
- In the Export approved samples list, select Download ZIP.
Note
H2O Label Genie downloads a ZIP file containing the annotated dataset in a format that matches the dataset's problem type (annotation task type) and that H2O Hydrogen Torch supports. To learn more, see Downloaded dataset formats.
- H2O Label Genie generates zero-shot predictions for specific annotation tasks that can be downloaded or exported to H2O Drive.
- To learn how to download a dataset's zero-shot predictions, see Download a dataset's zero-shot predictions.
- To learn how to export a dataset's zero-shot predictions to H2O Drive, see Export a dataset's zero-shot predictions to H2O Drive.
- You can export an annotated dataset (approved samples) to H2O Drive. To learn more, see Export an annotated dataset (approved samples) to H2O Drive.
Downloaded dataset formats
Text classification
A downloaded text classification dataset (with approved samples) uses the following format: a ZIP file (1) containing a CSV file (2):
folder_name.zip (1)
└── csv_name.csv (2)
- The available data connectors in H2O Hydrogen Torch require the data for a text classification experiment to be in either a single CSV file or a ZIP file for a successful import (upload).
- A CSV file containing a text column and a label column
- text: The text column contains the text input
- label: The label column contains the labels attributed to the texts specified in the text column
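As a quick check before importing the download elsewhere, the layout described above (a ZIP file holding one CSV file with text and label columns) can be read with Python's standard library. This is a minimal sketch: the function name `read_labeled_csv` is hypothetical, and the UTF-8 encoding and the choice of the first CSV file in the archive are assumptions, not documented H2O Label Genie behavior.

```python
import csv
import io
import zipfile


def read_labeled_csv(zip_path, text_col="text", label_col="label"):
    """Return (text, label) pairs from the first CSV file found in the ZIP."""
    with zipfile.ZipFile(zip_path) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no CSV file found in the archive")
        with archive.open(csv_names[0]) as raw:
            # Assumption: the CSV file is UTF-8 encoded.
            reader = csv.DictReader(io.TextIOWrapper(raw, encoding="utf-8", newline=""))
            missing = {text_col, label_col} - set(reader.fieldnames or [])
            if missing:
                raise ValueError(f"missing columns: {sorted(missing)}")
            return [(row[text_col], row[label_col]) for row in reader]
```

Calling `read_labeled_csv("folder_name.zip")` returns a list of pairs, or raises `ValueError` if the archive does not match the expected layout.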
Text regression
A downloaded text regression dataset (with approved samples) uses the following format: a ZIP file (1) containing a CSV file (2):
folder_name.zip (1)
└── csv_name.csv (2)
- The available data connectors in H2O Hydrogen Torch require the data for a text regression experiment to be in either a single CSV file or a ZIP file for a successful import (upload).
- A CSV file containing a text column and a label column
- text: The text column contains the text input
- label: The label column contains the labels attributed to the texts specified in the text column
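For a regression task, it can help to confirm that every label parses as a number before training. The sketch below works on CSV text that has already been read from the download. The function name `parse_regression_rows` is hypothetical, and the assumption that regression labels are stored as plain numeric strings is not stated in the format description above.

```python
import csv
import io


def parse_regression_rows(csv_text):
    """Parse CSV text into (text, float_label) pairs; raise on non-numeric labels."""
    rows = []
    for row_no, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        try:
            # Assumption: labels are plain numeric strings such as "2.5".
            rows.append((row["text"], float(row["label"])))
        except (TypeError, ValueError) as err:
            raise ValueError(
                f"data row {row_no}: label {row['label']!r} is not numeric"
            ) from err
    return rows
```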
Text-entity recognition
A downloaded text-entity recognition dataset (with approved samples) uses the following format: a ZIP file (1) containing a .pq file (2):
folder_name.zip (1)
└── pq_name.pq (2)
- The available data connectors in H2O Hydrogen Torch require the data for a text-entity recognition experiment to be in either a single CSV file or a ZIP file for a successful import (upload).
- A .pq file containing a text column and a label column
- text: The text column contains the text input
- label: The label column contains the labels attributed to the text-entities specified in the text column
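The .pq file can be pulled out of the archive with the standard library and then opened with a dataframe library. The sketch below rests on two assumptions that the description above does not confirm: that the .pq file is an Apache Parquet file, and that pandas with a Parquet engine (pyarrow or fastparquet) is installed. The function names `extract_pq_file` and `load_pq` are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_pq_file(zip_path, out_dir):
    """Extract the first .pq member of the ZIP into out_dir and return its path."""
    with zipfile.ZipFile(zip_path) as archive:
        pq_names = [n for n in archive.namelist() if n.lower().endswith(".pq")]
        if not pq_names:
            raise ValueError("no .pq file found in the archive")
        return Path(archive.extract(pq_names[0], path=out_dir))


def load_pq(pq_path):
    """Assumption: the .pq file is Parquet; requires pandas and a Parquet engine."""
    import pandas as pd  # imported lazily so extraction works without pandas

    return pd.read_parquet(pq_path)
```

A typical call would be `load_pq(extract_pq_file("folder_name.zip", "extracted"))`, after which the text and label columns can be inspected on the returned dataframe.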
Text summarization
A downloaded text summarization dataset (with approved samples) uses the following format: a ZIP file (1) containing a CSV file (2):
folder_name.zip (1)
└── csv_name.csv (2)
- The available data connectors in H2O Hydrogen Torch require the data for a text summarization experiment to be in either a single CSV file or a ZIP file for a successful import (upload).
- A CSV file containing a text column and a label column
- text: The text column contains the text input
- label: The label column contains the summaries attributed to the texts specified in the text column
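Because the label column holds a summary for each text, one simple sanity check on a downloaded summarization dataset is to compare summary length with text length. The sketch below assumes the rows have already been loaded as (text, summary) pairs. The function name `summary_length_ratios` and the whitespace-based word count are illustrative choices, not part of H2O Label Genie.

```python
def summary_length_ratios(rows):
    """For (text, summary) pairs, return summary word count / text word count."""
    ratios = []
    for text, summary in rows:
        text_words = len(text.split())
        # Empty texts get a ratio of 0.0 to avoid dividing by zero.
        ratios.append(len(summary.split()) / text_words if text_words else 0.0)
    return ratios
```

Ratios close to or above 1.0 may point to rows where the approved summary is not shorter than its source text and is worth a second look.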