Prediction settings: Speech recognition
Overview
To score (predict) new data through the H2O Hydrogen Torch UI with a built model, you need to specify certain settings, referred to as prediction settings. These comprise dataset, prediction, and environment settings similar to those used when creating an experiment. The prediction settings for a speech recognition model are described below.
General settings
Experiment
This setting defines the model (experiment) H2O Hydrogen Torch utilizes to score new data.
Prediction name
This setting defines the name of the prediction.
Dataset settings
Dataset
This setting specifies the dataset to score.
Test dataframe
This setting defines the file containing the test dataset that H2O Hydrogen Torch scores.
- Image regression | 3D image regression | Image classification | 3D image classification | Image metric learning | Text regression | Text classification | Text sequence to sequence | Text span prediction | Text token classification | Text metric learning | Audio regression | Audio classification | Graph node classification | Graph node regression
- Defines a CSV or Parquet file containing the test dataset that H2O Hydrogen Torch utilizes for scoring.
Note: The test dataset should have the same format as the train dataset but does not require label columns.
- Image object detection | Image semantic segmentation | 3D image semantic segmentation | Image instance segmentation
- Defines a Parquet file containing the test dataset that H2O Hydrogen Torch utilizes for scoring.
Data folder test
Defines the folder location of the assets (for example, images or audio files) H2O Hydrogen Torch utilizes for scoring. H2O Hydrogen Torch loads assets from this folder during scoring.
Audio column
Specifies the dataframe column storing the names of the audio files that H2O Hydrogen Torch loads from the Data folder test during scoring.
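The relationship between the Test dataframe, Data folder test, and Audio column settings can be illustrated with a minimal sketch that builds a test CSV. The column name `audio` and the file names are hypothetical, and the sketch assumes that a speech recognition test file, like the train-format files described above, needs one row per audio file and no label column.

```python
import csv
import io


def build_test_csv(audio_files, audio_column="audio"):
    """Return CSV text with one row per audio file name.

    The column name "audio" is a hypothetical choice; whatever name is used
    here would be the value selected for the Audio column setting.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([audio_column])  # header row: only the audio column, no labels
    for name in audio_files:
        writer.writerow([name])  # file name relative to the Data folder test
    return buffer.getvalue()


csv_text = build_test_csv(["clip_001.wav", "clip_002.wav"])
print(csv_text)
```

In this sketch, the files `clip_001.wav` and `clip_002.wav` would be expected to exist in the folder given as Data folder test.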
Prediction settings
Normalize text
Determines whether to normalize the label and prediction transcripts when scoring the experiment. This setting does not change the text a model utilizes for training. Before converting the text to lowercase, H2O Hydrogen Torch removes leading and trailing whitespace.
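The two steps named above (whitespace removal, then lowercasing) can be sketched as follows. This is only an illustration of those two steps; the function name is hypothetical, and the actual normalization in H2O Hydrogen Torch may include further steps not described here.

```python
def normalize_transcript(text: str) -> str:
    """Strip leading/trailing whitespace, then convert to lowercase."""
    return text.strip().lower()


label = "  Hello World \n"
prediction = "hello world"

# After normalization, differences in case and surrounding whitespace
# no longer count as mismatches between label and prediction.
print(normalize_transcript(label) == normalize_transcript(prediction))
```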
Metric
This setting defines the evaluation metric that H2O Hydrogen Torch uses to evaluate the model's accuracy on generated predictions.
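As an illustration of what such a metric can look like, the sketch below computes word error rate (WER), a metric commonly used for speech recognition. That WER is among the metrics offered here is an assumption, not something this page states, and the function is a plain-Python sketch rather than the product's implementation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the cat sat", "the bat sat"))
```

Lower values indicate better transcripts; a value of 0.0 means the prediction matches the label word for word.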
Batch size inference
This setting defines the batch size of examples to utilize for inference.
Selecting 0 sets Batch size inference to the same value as the Batch size setting used during training.
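The fallback rule for a value of 0 can be written as a short sketch. The function and parameter names are hypothetical and only mirror the setting names on this page.

```python
def effective_inference_batch_size(batch_size_inference: int, training_batch_size: int) -> int:
    """Return the batch size used for scoring; 0 means reuse the training batch size."""
    if batch_size_inference < 0:
        raise ValueError("batch size must not be negative")
    if batch_size_inference == 0:
        return training_batch_size
    return batch_size_inference


print(effective_inference_batch_size(0, training_batch_size=8))
print(effective_inference_batch_size(32, training_batch_size=8))
```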
Duration in visualizations
Defines the maximum audio duration (in seconds) H2O Hydrogen Torch utilizes for audio rendered on the visualizations page.
Setting the duration too high for long audio datasets may cause the visualizations page to fail.
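One way to picture a duration cap is as a truncation of the audio samples before rendering. The sketch below assumes the cap is applied by cutting the waveform at the maximum duration; how H2O Hydrogen Torch actually applies the limit is not described on this page, and all names are hypothetical.

```python
def clip_for_visualization(samples, sample_rate, max_seconds):
    """Keep at most max_seconds of audio, given samples at sample_rate Hz."""
    if max_seconds < 0:
        raise ValueError("max_seconds must not be negative")
    max_samples = int(max_seconds * sample_rate)
    return samples[:max_samples]


# Three seconds of silence at 16 kHz, capped to two seconds for display.
waveform = [0.0] * 48000
clipped = clip_for_visualization(waveform, sample_rate=16000, max_seconds=2)
print(len(clipped))
```

A smaller cap keeps the amount of data rendered per example small, which is why very high values can be a problem for datasets with long recordings.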
Suppress default tokens
Determines whether to suppress (not generate) certain tokens in the text generation process (speech to text).
- For pretrained Whisper models, these default tokens typically include non-speech predictions (for example, "[RADIO]" or "[Laughter]") that are an artifact of the noisy pre-training data.
- Suppressing default tokens is not a simple deletion of text; it can alter the text generation (speech-to-text) process itself. For example, a Whisper model that suppresses its default tokens (certain non-word and punctuation tokens such as "[RADIO]") can produce different predictions.
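The difference between suppressing tokens during generation and deleting them from the finished text can be seen in a toy greedy decoder. The score table below is invented for illustration and is not taken from any real model; suppression is modeled, as an assumption, by setting the score of a suppressed token to negative infinity before the next token is chosen.

```python
import math

# Hypothetical next-token scores, keyed by the previously generated token.
NEXT_TOKEN_SCORES = {
    "<start>": {"hello": 1.0, "[Laughter]": 2.5, "world": 0.3},
    "[Laughter]": {"hello": 0.2, "[Laughter]": 0.1, "world": 2.0},
    "hello": {"hello": 0.1, "[Laughter]": 0.5, "world": 2.0},
    "world": {"hello": 0.3, "[Laughter]": 0.2, "world": 0.1},
}


def greedy_decode(steps, suppressed=frozenset()):
    """Pick the highest-scoring token at each step, never a suppressed one."""
    tokens = []
    previous = "<start>"
    for _ in range(steps):
        scores = {
            token: (-math.inf if token in suppressed else score)
            for token, score in NEXT_TOKEN_SCORES[previous].items()
        }
        previous = max(scores, key=scores.get)
        tokens.append(previous)
    return tokens


plain = greedy_decode(2)
deleted_afterwards = [t for t in plain if t != "[Laughter]"]
with_suppression = greedy_decode(2, suppressed={"[Laughter]"})

print(plain)               # ['[Laughter]', 'world']
print(deleted_afterwards)  # ['world']
print(with_suppression)    # ['hello', 'world']
```

Deleting the token after the fact leaves a gap, whereas suppression makes the decoder choose a different token at that step, and every later step then conditions on that different choice.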
Chunk time
Specifies the audio length (in seconds) H2O Hydrogen Torch accepts for the experiment. H2O Hydrogen Torch splits audio longer than the specified length into chunks, where the chunk length is based on the specified length.
- The text predictions are stitched back together in the final prediction.
- Most models should be able to run inference on 60-second audio samples (the default Chunk time) within 16 GB of VRAM.
- Text predictions may vary as this setting varies, because the underlying transformer architecture sees different amounts of context in a single chunk.
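The split-and-stitch behavior described above can be sketched as follows. The sketch assumes non-overlapping chunks of exactly the Chunk time length and a simple space-separated join of the per-chunk texts; the actual chunking and stitching logic in H2O Hydrogen Torch may differ, and the function names are hypothetical.

```python
def split_into_chunks(samples, sample_rate, chunk_seconds):
    """Split a waveform into consecutive chunks of at most chunk_seconds each."""
    chunk_length = int(chunk_seconds * sample_rate)
    if chunk_length <= 0:
        raise ValueError("chunk length must cover at least one sample")
    return [samples[i:i + chunk_length] for i in range(0, len(samples), chunk_length)]


def stitch_predictions(chunk_texts):
    """Join per-chunk transcripts into one prediction, skipping empty chunks."""
    return " ".join(text.strip() for text in chunk_texts if text.strip())


# A 150-second recording at 16 kHz with a 60-second chunk time
# yields two full chunks and one 30-second remainder.
waveform = [0.0] * (150 * 16000)
chunks = split_into_chunks(waveform, sample_rate=16000, chunk_seconds=60)
print([len(chunk) // 16000 for chunk in chunks])

print(stitch_predictions(["good morning ", "", " everyone"]))
```

Because each chunk is transcribed separately, a word near a chunk boundary is decoded with less surrounding context than it would have in a longer chunk, which is one reason predictions can change with this setting.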
Environment settings
GPUs
This setting specifies the list of GPUs H2O Hydrogen Torch can use for scoring. GPUs are listed by name, referring to their system ID (starting from 1). If no GPUs are selected, H2O Hydrogen Torch utilizes CPUs for model scoring.