Version: Next

Prediction settings: Multi-modal causal language modeling

To score (predict) new data through the H2O Hydrogen Torch UI with a built model, you need to specify certain settings, referred to as prediction settings. They consist of dataset, prediction, and environment settings similar to those used when creating an experiment. The prediction settings for a multi-modal causal language modeling model are described below.

General settings

Experiment

This setting defines the model (experiment) H2O Hydrogen Torch utilizes to score new data.

Prediction name

This setting defines the name of the prediction.

Dataset settings

Dataset

This setting specifies the dataset to score.

Test dataframe

This setting defines the file containing the test dataset that H2O Hydrogen Torch scores.

note
  • Image regression | 3D image regression | Image classification | 3D image classification | Image metric learning | Text regression | Text classification | Text sequence to sequence | Text span prediction | Text token classification | Text metric learning | Audio regression | Audio classification | Graph node classification | Graph node regression
    • Defines a CSV or Parquet file containing the test dataset that H2O Hydrogen Torch utilizes for scoring.
    • Note: The test dataset should have the same format as the train dataset but does not require label columns.

  • Image object detection | Image semantic segmentation | 3D image semantic segmentation | Image instance segmentation
    • Defines a Parquet file containing the test dataset that H2O Hydrogen Torch utilizes for scoring.
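The note about label columns can be illustrated with a minimal sketch that derives a test file from a train file by dropping the label column. The column names (`image`, `prompt`, `answer`) and the helper `drop_label_columns` are hypothetical and only serve the illustration; they are not part of H2O Hydrogen Torch.

```python
import csv
import io


def drop_label_columns(train_csv: str, label_columns: list[str]) -> str:
    """Return CSV text with the same columns as train_csv, minus the label columns."""
    reader = csv.DictReader(io.StringIO(train_csv))
    # Keep the original column order so the test file matches the train format.
    kept = [name for name in reader.fieldnames if name not in label_columns]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # label values are ignored via extrasaction
    return out.getvalue()


# Hypothetical train data: two input columns and one label column ("answer").
train_csv = "image,prompt,answer\ncat.jpg,What animal is this?,A cat\n"
test_csv = drop_label_columns(train_csv, ["answer"])
print(test_csv)
```

The resulting text keeps the input columns in their original order, which is what "same format as the train dataset" is taken to mean here.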

Prediction settings

Metric

This setting defines the evaluation metric that H2O Hydrogen Torch uses to evaluate the model's accuracy on the generated predictions.

Max new tokens

This setting defines the maximum number of new tokens that can be generated in the output text.
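As a rough illustration of what a cap on new tokens means, the toy loop below stops generating once the limit is reached or an end-of-sequence token appears. The function `generate`, the `next_token_fn` callback, and the `eos_token` parameter are assumptions made for this sketch and do not reflect H2O Hydrogen Torch internals.

```python
from typing import Callable, Optional


def generate(
    prompt_tokens: list[int],
    next_token_fn: Callable[[list[int]], int],
    max_new_tokens: int,
    eos_token: Optional[int] = None,
) -> list[int]:
    """Generate at most max_new_tokens tokens after the prompt."""
    output = list(prompt_tokens)
    for _ in range(max_new_tokens):
        token = next_token_fn(output)
        output.append(token)
        if token == eos_token:
            break  # the model finished before hitting the cap
    # Only the newly generated tokens count toward the limit.
    return output[len(prompt_tokens):]


# Toy "model": the next token is simply the current sequence length.
new_tokens = generate([10, 11, 12], lambda tokens: len(tokens), max_new_tokens=4)
stopped_early = generate(
    [10, 11, 12], lambda tokens: len(tokens), max_new_tokens=4, eos_token=5
)
```

The prompt tokens are excluded from the count, so a longer prompt does not reduce the number of tokens that can be generated.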

Do sample

This setting determines whether to sample from the next-token distribution instead of choosing the token with the highest probability. If turned On, the next token in a predicted sequence is sampled according to the token probabilities. If turned Off, the token with the highest probability is always chosen.
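The difference between the two modes can be sketched with a toy next-token distribution. The helper `pick_next_token` and the probabilities are hypothetical; the sketch only illustrates greedy selection versus sampling, not how H2O Hydrogen Torch implements either.

```python
import random


def pick_next_token(probs: dict[str, float], do_sample: bool, rng: random.Random) -> str:
    """Choose the next token greedily or by sampling from the distribution."""
    tokens = list(probs)
    if do_sample:
        # Sampling: each token is drawn in proportion to its probability.
        return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]
    # Greedy: always take the single most probable token.
    return max(tokens, key=probs.get)


probs = {"cat": 0.6, "dog": 0.3, "bird": 0.1}  # toy next-token distribution
rng = random.Random(0)  # seeded so repeated runs give the same draws

greedy = pick_next_token(probs, do_sample=False, rng=rng)
sampled = [pick_next_token(probs, do_sample=True, rng=rng) for _ in range(1000)]
```

With sampling turned off the result is deterministic, while with sampling turned on repeated calls can return different tokens, which tends to produce more varied output text.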

Environment settings

GPUs

This setting specifies the list of GPUs H2O Hydrogen Torch can use for scoring. Each GPU is listed by a name that refers to its system ID (numbering starts from 1). If no GPUs are selected, H2O Hydrogen Torch uses CPUs for model scoring.

