Version: v1.2.0

Prediction settings: Text classification

To score (predict) new data with a built model through the H2O Hydrogen Torch UI, you must specify certain settings, referred to as prediction settings. These comprise dataset, prediction, and environment settings similar to those used when creating an experiment. The prediction settings for a text classification model are described below.

General settings

Experiment

Defines the model (experiment) H2O Hydrogen Torch uses to score new data.

Prediction name

Defines the name of the prediction.

Dataset settings

Dataset

Specifies the dataset to use for scoring.

Test dataframe

Defines the file(s) containing the test dataframe that H2O Hydrogen Torch will use for scoring.

note
  • Image regression | Image classification | Image metric learning | Text regression | Text classification | Text sequence to sequence | Text span prediction | Text token classification | Text metric learning | Audio regression | Audio classification
    • Defines a .csv or .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.
      note

      The test dataframe should have the same format as the train dataframe but does not require label columns.

  • Image object detection | Image semantic segmentation | Image instance segmentation
    • Defines a .pq file containing the test dataframe that H2O Hydrogen Torch will use for scoring.

Text column

Defines the name of the column containing the input text that H2O Hydrogen Torch will use during scoring.
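As an illustration of the test dataframe format for text classification, the following minimal sketch builds a .csv file body that keeps the text column of the training data but drops the label column. The column names `text` and `label` and the example rows are hypothetical; substitute the names used in your own train dataframe.

```python
import csv
import io

# Hypothetical training rows: a text column plus a label column.
train_rows = [
    {"text": "The battery lasts all day.", "label": "positive"},
    {"text": "The screen cracked within a week.", "label": "negative"},
]

# Hypothetical test rows: same text column, no label column.
test_rows = [
    {"text": "Shipping was fast and the fit is perfect."},
    {"text": "Stopped working after the first update."},
]


def to_csv(rows):
    """Serialize a list of dicts to CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


test_csv = to_csv(test_rows)
test_header = test_csv.splitlines()[0].split(",")
print(test_header)
```

To produce an actual file, the same writer can target `open("test.csv", "w", newline="")` instead of the in-memory buffer. The value entered for Text column would then be `text`.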

Prediction settings

Metric

Specifies the evaluation metric used to assess the model's performance.

note

In most cases, the evaluation metric should quantify what makes the model valuable for the corresponding use case.
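To see why the choice of metric matters, consider the following minimal sketch. It compares accuracy and F1 on a small, imbalanced binary dataset. The labels, the predictions, and the helper functions are hypothetical and are not part of H2O Hydrogen Torch.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1); returns 0.0 when there are no true positives."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced data: nine negatives and one positive.
y_true = [0] * 9 + [1]
# A model that always predicts the negative class.
y_pred = [0] * 10

print(accuracy(y_true, y_pred))  # 0.9
print(f1_score(y_true, y_pred))  # 0.0
```

Here accuracy looks strong even though the model never finds the positive class, while F1 exposes the failure. If detecting the rare class is the goal of the use case, F1 is the more informative of the two.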

Probability threshold

  • Image instance segmentation | Image semantic segmentation
    • Defines the probability threshold; a predicted pixel will be treated as positive if its probability is larger than the probability threshold.
  • Image object detection
    • Defines the probability threshold used to select predicted bounding boxes. Predicted bounding boxes with confidence above the defined probability threshold are added to the validation and test .csv files in the downloaded model predictions .zip file.
  • Audio classification | Image classification | Text classification
    • Defines the threshold for threshold-dependent classification metrics (e.g., F1). For multi-class classification, argmax is used.
      note

      The defined threshold is used as a default threshold when displaying all other threshold-dependent metrics in the validation plots.
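The difference between thresholding and argmax for classification can be sketched as follows. This is an illustration only: the function names and probabilities are hypothetical, and the strict "larger than" comparison is an assumption carried over from the segmentation description above, not confirmed behavior for classification.

```python
def predict_with_threshold(probabilities, threshold=0.5):
    """Mark each probability as positive (1) if it is strictly larger than the threshold."""
    return [int(p > threshold) for p in probabilities]


def predict_with_argmax(class_probabilities):
    """Return the index of the most probable class; no threshold is involved."""
    return max(range(len(class_probabilities)), key=class_probabilities.__getitem__)


# Hypothetical positive-class probabilities for three samples.
binary_probabilities = [0.2, 0.45, 0.7]
print(predict_with_threshold(binary_probabilities, threshold=0.5))  # [0, 0, 1]
print(predict_with_threshold(binary_probabilities, threshold=0.4))  # [0, 1, 1]

# Hypothetical class probabilities for one multi-class sample.
print(predict_with_argmax([0.1, 0.6, 0.3]))  # 1
```

Lowering the threshold from 0.5 to 0.4 flips the second sample to positive, which is why threshold-dependent metrics such as F1 change with this setting, while the multi-class prediction is unaffected by it.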

Environment settings

GPUs

Specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1).

