Prediction settings: Text classification
To score (predict) new data using a built model through the H2O Hydrogen Torch UI, you must specify certain settings, referred to as prediction settings. These comprise dataset, prediction, and environment settings similar to those used when creating an experiment. The prediction settings for a text classification model are described below.
General settings
Experiment
Defines the model (experiment) H2O Hydrogen Torch uses to score new data.
Prediction name
Defines the name of the prediction.
Dataset settings
Dataset
Specifies the dataset to use for scoring.
Test dataframe
Defines the file(s) containing the test dataframe that H2O Hydrogen Torch will use for scoring.
- Image regression | Image classification | Image metric learning | Text regression | Text classification | Text sequence to sequence | Text span prediction | Text token classification | Text metric learning | Audio regression | Audio classification
- Defines a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
Note: The test dataframe should have the same format as the train dataframe but does not require label columns.
- Image object detection | Image semantic segmentation | Image instance segmentation
- Defines a .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
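As an illustration of the format requirement above, the following minimal sketch derives a test file from training data by keeping the same columns and dropping the label column. It uses only the Python standard library; the column names `text` and `label`, the sample rows, and the helper `strip_label_columns` are hypothetical and are not part of H2O Hydrogen Torch.

```python
import csv
import io


def strip_label_columns(train_csv: str, label_columns: list) -> str:
    """Return CSV text with the input's columns minus the label columns."""
    reader = csv.DictReader(io.StringIO(train_csv))
    # Keep the original column order, skipping any label column.
    kept = [name for name in reader.fieldnames if name not in label_columns]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # extrasaction="ignore" drops the label values from each row.
        writer.writerow(row)
    return out.getvalue()


# Hypothetical training data: "text" is the input column, "label" the target.
train_csv = "text,label\ngreat product,positive\narrived broken,negative\n"
test_csv = strip_label_columns(train_csv, ["label"])
print(test_csv)
```

In practice the resulting text would be saved as a .csv file and selected as the test dataframe; leaving the label column in place should also be acceptable, since labels are not required rather than forbidden.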
Text column
Defines the column name with the input text that H2O Hydrogen Torch will use during scoring.
Prediction settings
Metric
Specifies the metric used to evaluate the model's accuracy.
Usually, the evaluation metric should reflect the quantitative way of assessing the model's value for the corresponding use case.
Probability threshold
- Image instance segmentation | Image semantic segmentation
- Defines the probability threshold; a predicted pixel will be treated as positive if its probability is larger than the probability threshold.
- Image object detection
- Defines the confidence threshold for predicted bounding boxes. Predicted bounding boxes with confidence larger than the defined probability threshold are added to the validation and test .csv files in the downloaded model predictions .zip file.
- Audio classification | Image classification | Text classification
- Defines a threshold for threshold-dependent classification metrics (e.g., F1). For multi-class classification, argmax is used.
Note: The defined threshold is used as the default threshold when displaying all other threshold-dependent metrics in the validation plots.
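To make the role of the classification threshold concrete, here is a minimal sketch of how a threshold-dependent metric such as F1 changes with the threshold, and how argmax selects a class in the multi-class case. This is an illustration under assumptions, not H2O Hydrogen Torch's implementation: the function names and sample values are hypothetical, and the strict "larger than" comparison is assumed by analogy with the other problem types.

```python
def f1_at_threshold(y_true, y_prob, threshold):
    """Binary F1 where a probability above the threshold counts as positive."""
    # Assumption: strict comparison (probability > threshold).
    y_pred = [1 if p > threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined here as 0.0 when there are no positives.
    return 2 * tp / denominator if denominator else 0.0


def argmax_class(class_probs):
    """Multi-class case: pick the index of the highest probability."""
    return max(range(len(class_probs)), key=lambda i: class_probs[i])


# Hypothetical labels and predicted probabilities for a binary problem.
y_true = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.6, 0.4, 0.8, 0.1]

print(f1_at_threshold(y_true, y_prob, 0.5))  # 2 TP, 1 FP, 1 FN
print(f1_at_threshold(y_true, y_prob, 0.3))  # 3 TP, 1 FP, 0 FN
print(argmax_class([0.2, 0.5, 0.3]))         # index of the largest value
```

Lowering the threshold from 0.5 to 0.3 in this toy data recovers one missed positive without adding false positives, so F1 rises; on real data the trade-off between precision and recall usually goes both ways.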
Environment settings
GPUs
Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).