Experiment settings: Image semantic segmentation
The settings for an image semantic segmentation experiment are listed and described below.
General settings
Dataset
Defines the dataset for the experiment.
Problem category
Defines the general problem category of the experiment, for example, image.
- The selected problem category (for example, image) determines the options in the Problem type setting.
- The From experiment option is also available; it lets you reuse the settings of another (built) experiment.
Experiment
Defines the experiment H2O Hydrogen Torch references to initialize the experiment settings. H2O Hydrogen Torch initializes the experiment settings with the values from the selected (built) experiment.
This setting is available only if From experiment is selected in the Problem category setting.
Problem type
Defines the problem type of the experiment, which also defines the settings H2O Hydrogen Torch displays for the experiment.
- The selected problem category (in the Problem category setting) determines the available problem types.
- The selected problem type and experience level determine the settings H2O Hydrogen Torch displays for the experiment.
Import config from YAML
Defines the .yml file that defines the experiment settings.
- H2O Hydrogen Torch supports .yml file import and export. You can download the config settings of finished experiments, make changes, and re-upload them when starting a new experiment in any instance of H2O Hydrogen Torch.
- To learn how to download the .yml file (configuration file) of a completed experiment, see Download an experiment's logs/config file.
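For illustration, a downloaded config could be edited with PyYAML as in the sketch below; the file name and keys are placeholders, not the exact H2O Hydrogen Torch schema.

```python
import yaml

# Load the downloaded experiment config (placeholder file name).
with open("experiment_cfg.yml") as f:
    cfg = yaml.safe_load(f)

cfg["training"]["epochs"] = 20  # hypothetical key: adjust a setting

# Save the edited config for re-upload when starting a new experiment.
with open("experiment_cfg_edited.yml", "w") as f:
    yaml.safe_dump(cfg, f)
```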
Use previous experiment weights
Defines whether to initialize the model weights with the weights from the experiment specified in the Experiment setting.
- This setting is available only if From experiment is selected in the Problem category setting.
- A model's weights can only be reused by an experiment (model) with the same problem type and backbone.
- This setting can be useful when you want to continue training from a built experiment.
Experiment name
Defines the name of the experiment.
Dataset settings
Train dataframe
Defines a .csv or .pq file containing a dataframe with training records that H2O Hydrogen Torch uses to train the model.
- The records are combined into mini-batches when training the model.
- If a validation dataframe is provided, a fold column is not needed in the train dataframe.
- You can import datasets for inference only. To do so, when defining the settings for an experiment, set the Train dataframe setting to None and the Test dataframe setting to the relevant dataframe. As a result, H2O Hydrogen Torch uses the dataset for predictions only, not for training.
Data folder
Defines the location of the folder containing assets (for example, images or audio clips) the model utilizes for training. H2O Hydrogen Torch loads assets from this folder during training.
Validation strategy
Specifies the validation strategy H2O Hydrogen Torch uses for the experiment.
To properly assess the performance of your trained models, it is common practice to evaluate them on separate holdout data that the model has not seen during training. H2O Hydrogen Torch lets you specify the validation strategy that best fits your needs.
Options (all supported problem types):
- K-fold cross validation
- Splits the data using the provided optional fold column in the train data or performs an automatic 5-fold cross-validation (see the sketch after this list for pre-computing a fold column).
- Grouped k-fold cross validation
- Lets you specify a group column based on which the data is split into folds.
- Custom holdout validation
- Specifies a separate holdout dataframe.
- Automatic holdout validation
- Lets you specify the size of an automatically generated holdout validation sample.
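A minimal sketch for pre-computing a fold column with scikit-learn, assuming a train dataframe with a default integer index (file and column names are placeholders):

```python
import pandas as pd
from sklearn.model_selection import KFold

df = pd.read_csv("train.csv")  # placeholder file name
df["fold"] = -1

kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (_, val_idx) in enumerate(kf.split(df)):
    df.loc[val_idx, "fold"] = fold  # assign each record its validation fold

df.to_csv("train_with_folds.csv", index=False)
```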
Validation dataframe
Defines a .csv or .pq file containing a dataframe with validation records that H2O Hydrogen Torch uses to evaluate the model during training.
- Setting a validation dataframe requires the Validation strategy to be set to Custom holdout validation. In this case, H2O Hydrogen Torch fully respects the choice of a separate validation dataframe and does not perform any internal cross-validation. In other words, the model is trained on the full provided train dataframe, and model performance is evaluated on the provided validation dataframe.
- The validation dataframe should have the same format as the train dataframe but does not require a fold column.
Selected folds
Defines the selected validation fold(s) in case of cross-validation; a separate model is trained for each value selected. Each model utilizes the corresponding part of the data as a holdout sample to assess performance while the model is fitted to the rest of the records from the training dataframe. As a result, folds estimate how the model performs in general when used to make predictions on data not used during model training.
- H2O Hydrogen Torch allows running experiments on a single selected fold for faster experimenting and multiple selected folds to gain more trust in the model's generalization and performance capabilities.
- The Selected folds setting is only available if Custom holdout validation is not selected as the Validation strategy.
Test dataframe
Defines a .csv or .pq file containing a dataframe with test records that H2O Hydrogen Torch uses to test the model.
- The test dataframe should have the same format as the train dataframe but does not require a label column.
- You can import datasets for inference only. To do so, when defining the settings for an experiment, set the Train dataframe setting to None and the Test dataframe setting to the relevant dataframe. As a result, H2O Hydrogen Torch uses the dataset for predictions only, not for training.
Data folder test
Defines the location of the folder containing assets (for example, images, texts, or audio clips) H2O Hydrogen Torch utilizes to test the model. H2O Hydrogen Torch loads the assets from this folder when testing the model.
This setting appears only when you specify a test dataframe in the Test dataframe setting.
Class name column
Defines the dataset column containing a list of class names that H2O Hydrogen Torch uses for each instance mask.
Rle mask column
Defines the dataset column containing a list of run-length encoded (RLE) masks that H2O Hydrogen Torch uses for each class name.
Image column
Defines the dataframe column storing the names of images that H2O Hydrogen Torch loads from the data folder and data folder test when training and testing the model.
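To make these three columns concrete, a hypothetical train dataframe and a minimal RLE decoder might look like the sketch below; the column names and the RLE convention (flattened-index start/length pairs) are assumptions, not the exact format H2O Hydrogen Torch expects.

```python
import numpy as np
import pandas as pd

# Hypothetical layout: one row per image, with lists of class names and masks.
df = pd.DataFrame({
    "image": ["img_0001.png", "img_0002.png"],
    "class_name": ["['car', 'road']", "['road']"],
    "rle_mask": ["['1 5 20 3', '40 10']", "['7 12']"],
})

def rle_decode(rle: str, shape: tuple) -> np.ndarray:
    """Decode a 'start length start length ...' string into a binary mask."""
    mask = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    nums = list(map(int, rle.split()))
    for start, length in zip(nums[0::2], nums[1::2]):
        mask[start:start + length] = 1
    return mask.reshape(shape)
```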
Data sample
Defines the percentage of the data to use for the experiment. The default percentage is 100% (1).
Changing the default value can significantly increase the training speed, but it can also substantially reduce accuracy. Using 100% (1) of the data for final models is highly recommended.
Image settings
Image width
Defines the width H2O Hydrogen Torch uses to rescale the images for training and predictions.
Depending on the original image size, a bigger width can generate a higher accuracy value.
Image height
Defines the height H2O Hydrogen Torch uses to rescale the images for training and predictions.
Depending on the original image size, a bigger height can generate a higher accuracy value.
Image channels
Defines the number of channels the train images contain.
- Typically, images have three input channels (red, green, and blue (RGB)), but grayscale images have only one. When you provide image data in a NumPy data format, any number of channels is allowed; for this reason, you can specify the number of channels.
- The defined number of channels also refers to the provided validation and test datasets.
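For example, an image with more than three channels can be supplied as a NumPy array; the shape and file name below are illustrative.

```python
import numpy as np

# A 5-channel image stored as an H x W x C float array.
img = np.random.rand(256, 256, 5).astype(np.float32)
np.save("image_0001.npy", img)
```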
Image normalization
Grid search hyperparameter
Defines the normalization method H2O Hydrogen Torch applies to the image data before training the model.
Usually, state-of-the-art image models normalize the training images by scaling values of each of the input channels to predefined means and standard deviations.
Options (Image regression | Image classification | Image object detection | Image semantic segmentation | Image instance segmentation | Image metric learning):
- Channel
- Calculates mean and standard deviation per channel in all the images in the batch and then applies per channel normalization: subtracts mean and divides by standard deviation.
- Image
- Calculates mean and standard deviation per image and then applies normalization.
- ImageNet
- Divides input images by 255 and normalizes with mean and standard deviation equal to (0.485, 0.456, 0.406) and (0.229, 0.224, 0.225) per channel, respectively.
- Inception
- Divides input images by 255 and normalizes with mean and standard deviation equal to 0.5.
- Min_Max
- Calculates minimum and maximum values in all the images in the batch and then applies min-max normalization: subtracts min and divides by the max and min difference.
- No
- No normalization is applied to the input images.
- Simple
- Divides input images by 255.
Options (3D image classification | 3D image regression | 3D image semantic segmentation):
- No
- No normalization is applied to the input images.
- Simple
- Divides input images by 255.
- Min_Max
- Calculates minimum and maximum values in all the images in the batch and then applies min-max normalization: subtracts min and divides by the max and min difference.
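For reference, the ImageNet option corresponds to the following torchvision preprocessing; this sketch mirrors the values listed above and is not H2O Hydrogen Torch's internal code.

```python
# ImageNet-style normalization: divide by 255, then normalize each channel
# with the listed means and standard deviations.
from torchvision import transforms

imagenet_normalize = transforms.Compose([
    transforms.ToTensor(),  # converts uint8 HWC images to float CHW in [0, 1]
    transforms.Normalize(mean=(0.485, 0.456, 0.406),
                         std=(0.229, 0.224, 0.225)),
])
```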
Augmentation settings
Augmentations strategy
Grid search hyperparameter
Defines the augmentation strategy to apply to the input images. Soft, Medium, and Hard values correspond to the strength of the augmentations to apply.
Options (Image regression | Image classification | Image object detection | Image semantic segmentation | Image instance segmentation | Image metric learning):
- Soft
- The Soft strategy applies image Resize and random HorizontalFlip during model training while applying image Resize during model inference.
- Medium
- The Medium strategy adds ShiftScaleRotate and CoarseDropout to the list of the train augmentations.
- Hard
- The Hard strategy applies RandomResizedCrop (instead of Resize) during model training while adding RandomBrightnessContrast to the list of train augmentations.
- Custom
- The Custom strategy lets you define your own augmentations in the additional settings that appear when Custom is selected.
Options (3D image classification | 3D image regression | 3D image semantic segmentation):
- Soft
- The Soft strategy applies image Resize and random HorizontalFlip during model training while applying image Resize during model inference.
- Medium
- The Medium strategy adds ShiftScaleRotate and CoarseDropout to the list of the train augmentations.
- Hard
- The Hard strategy applies RandomResizedCrop (instead of Resize) during model training while adding RandomBrightnessContrast to the list of train augmentations.
Augmentations are ways to modify train images while keeping the target values valid, such as flipping the image or adding noise. Distorting training images does not change the expected prediction of the model but enriches the training data. Augmentations help the model generalize better and improve its accuracy.
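For intuition, the strategies could be approximated with albumentations pipelines like the sketch below; the transform names come from the descriptions above, while the image size and probabilities are placeholder assumptions (parameter names follow albumentations 1.x).

```python
import albumentations as A

soft = A.Compose([
    A.Resize(height=256, width=256),
    A.HorizontalFlip(p=0.5),
])
medium = A.Compose([
    A.Resize(height=256, width=256),
    A.HorizontalFlip(p=0.5),
    A.ShiftScaleRotate(p=0.5),
    A.CoarseDropout(p=0.5),
])
hard = A.Compose([
    A.RandomResizedCrop(height=256, width=256),  # replaces Resize
    A.HorizontalFlip(p=0.5),
    A.ShiftScaleRotate(p=0.5),
    A.CoarseDropout(p=0.5),
    A.RandomBrightnessContrast(p=0.5),
])
```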
Mix image
Grid search hyperparameter
Defines the image mix augmentation to use during model training.
Options (Image regression | Image classification | Image object detection | Image semantic segmentation | Image instance segmentation):
- Mixup
- Mixup overlays (mixes) two images one on another based on a random ratio. To learn more about this mix augmentation approach, refer to the following article: mixup: BEYOND EMPIRICAL RISK MINIMIZATION.
Note: For an image object detection experiment using Mixup, H2O Hydrogen Torch uses the union of all the target boxes in the mixed images.
- Cutmix
- Cutmix replaces an image region with a patch from another image; the region size is based on a random ratio. To learn more about this mix augmentation approach, refer to the following article: CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features.
Note: For an image object detection experiment using Cutmix, H2O Hydrogen Torch uses the target boxes from the corresponding region of each image. Also, with Cutmix selected, H2O Hydrogen Torch cuts out and replaces only the corners of the images with a patch from another image.
- Disabled
- No augmentation is applied.
Options (3D image classification | 3D image regression | 3D image semantic segmentation):
- Mixup
- Mixup overlays (mixes) two images one on another based on a random ratio. To learn more about this mix augmentation approach, refer to the following article: mixup: BEYOND EMPIRICAL RISK MINIMIZATION.
- Disabled
- No augmentation is applied.
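As a reference for what Mixup computes, here is a minimal batch-level sketch following the mixup paper cited above; this is not H2O Hydrogen Torch's internal code.

```python
import torch

def mixup(images: torch.Tensor, targets: torch.Tensor, alpha: float = 1.0):
    """Mix a batch with a randomly permuted copy of itself using a Beta-sampled ratio."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(images.size(0))
    mixed_images = lam * images + (1 - lam) * images[perm]
    mixed_targets = lam * targets + (1 - lam) * targets[perm]  # e.g. soft masks
    return mixed_images, mixed_targets
```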
Architecture settings
Pretrained
Grid search hyperparameter
Defines whether the neural network should start with pretrained weights. When this setting is On, training starts from a model pretrained on a generic task. When turned Off, the neural network starts with random initial weights.
Backbone
Grid search hyperparameter
Defines the backbone neural network architecture to train the model.
- Image regression | Image classification | Image metric learning | Audio regression | Audio classification
- H2O Hydrogen Torch accepts backbone neural network architectures from the timm library (select or enter the architecture name).
- Image object detection
- H2O Hydrogen Torch provides several state-of-the-art backbone neural network architectures for model training. When you select Faster R-CNN or FCOS as the model type for the experiment, you can input any architecture name from the timm library. When you select EfficientDet as the model type, you can input any architecture name from the efficientdet-pytorch library.
- Image semantic segmentation | Image instance segmentation
- H2O Hydrogen Torch accepts backbone neural network architectures from the segmentation-models-pytorch library (select or enter the architecture name).
- 3D image regression | 3D image classification
- H2O Hydrogen Torch accepts backbone (encoder) neural network architectures from a subset (resnet and efficientnet) of the timm library (select or enter the architecture name).
- Text regression | Text classification | Text token classification | Text span prediction | Text sequence to sequence | Text metric learning
- H2O Hydrogen Torch accepts backbone neural network architectures from the Hugging Face library (select or enter the architecture name).
- Speech recognition
- H2O Hydrogen Torch supports Hugging Face Wav2Vec2 CTC models.
- All problem types
- Usually, it is good to use simpler architectures for quicker experiments and larger models when aiming for the highest accuracy.
- Speech recognition
- If possible, leverage backbones pretrained on data close to your use case (for example, noisy audio, casual speech, and so on).
Architecture
Grid search hyperparameter
Defines the architecture (decoder) to use for the experiment. H2O Hydrogen Torch uses semantic segmentation architectures with additional postprocessing to separate masks into individual instances.
Options (Image semantic segmentation | Image instance segmentation):
- DeeplabV3
- To learn about the DeeplabV3 architecture, see Rethinking Atrous Convolution for Semantic Image Segmentation.
- DeeplabV3+
- To learn about the DeeplabV3+ architecture, see Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation.
- FPN
- To learn about the FPN architecture, see A Unified Architecture for Instance and Semantic Segmentation.
- Linknet
- To learn about the Linknet architecture, see LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation.
- PAN
- To learn about the PAN architecture, see Pyramid Attention Network for Semantic Segmentation.
- PSPNet
- To learn about the PSPNet architecture, see Pyramid Scene Parsing Network.
- Unet
- To learn about the Unet architecture, see U-Net: Convolutional Networks for Biomedical Image Segmentation.
- Unet++
- To learn about the Unet++ architecture, see UNet++: A Nested U-Net Architecture for Medical Image Segmentation.
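As a sketch of how one of these decoder architectures pairs with a backbone from the segmentation-models-pytorch library mentioned in the Backbone setting; the encoder name, weights, and class count are placeholders.

```python
import segmentation_models_pytorch as smp

model = smp.Unet(                 # other listed decoders work similarly,
    encoder_name="resnet34",      # e.g. smp.FPN, smp.DeepLabV3Plus, ...
    encoder_weights="imagenet",   # corresponds to the Pretrained setting
    in_channels=3,                # corresponds to Image channels
    classes=1,                    # number of segmentation classes
)
```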
Training settings
Loss function
Grid search hyperparameter
Defines the loss function H2O Hydrogen Torch utilizes during model training. The loss function is a differentiable function measuring the prediction error. The model utilizes gradients of the loss function to update the model weights during training.
Options (Image regression | 3D image regression | Text regression | Audio regression):
- MAE
- H2O Hydrogen Torch utilizes the mean absolute error (L1 norm) as the loss function.
- MSE
- H2O Hydrogen Torch utilizes the mean squared error (squared L2 norm) as the loss function.
- RMSE
- H2O Hydrogen Torch utilizes the root mean squared error (L2 norm) as the loss function.
Options (Image classification | 3D image classification | Text classification | Audio classification):
- BCE
- H2O Hydrogen Torch uses binary cross entropy loss.
- Classification
- This default classification loss automatically chooses between BCE (multi-label) and CrossEntropy (multi-class) for classification.
- CrossEntropy
- H2O Hydrogen Torch utilizes multi-class cross entropy loss as a loss function.
- SigmoidFocal
- H2O Hydrogen Torch uses the sigmoid Focal loss (gamma=2.0) for classification, introduced in the following paper: Focal Loss for Dense Object Detection.
- SoftmaxFocal
- H2O Hydrogen Torch uses the softmax Focal loss (gamma=2.0) for classification, introduced in the following paper: Focal Loss for Dense Object Detection.
Options (Image semantic segmentation | 3D image semantic segmentation | Image instance segmentation):
- BCE
- H2O Hydrogen Torch uses binary cross entropy loss.
- BCEDice
- H2O Hydrogen Torch uses binary cross entropy loss and Dice loss with weights 2 and 1, respectively.
- BCELovasz
- H2O Hydrogen Torch uses binary cross entropy loss and Lovasz loss with equal weights.
- Dice
- H2O Hydrogen Torch uses Dice loss.
- Focal
- H2O Hydrogen Torch uses the Focal loss for semantic segmentation, introduced in the following paper: Focal Loss for Dense Object Detection.
- FocalDice
- H2O Hydrogen Torch uses Focal loss and Dice loss with weights 2 and 1, respectively.
- Jaccard
- H2O Hydrogen Torch uses Jaccard loss.
Options (Image metric learning | Text metric learning):
- ArcFace
- H2O Hydrogen Torch utilizes an Additive Angular Margin Loss for Deep Face Recognition (ArcFace).
- CrossEntropy
- H2O Hydrogen Torch utilizes multi-class cross entropy loss as a loss function.
Options (Text token classification | Text span prediction | Text sequence to sequence):
- CrossEntropy
- H2O Hydrogen Torch utilizes multi-class cross entropy loss as a loss function.
Options (Speech recognition):
- CTC Loss
- H2O Hydrogen Torch utilizes Connectionist Temporal Classification (CTC) loss as the loss function.
BCE weight
Grid search hyperparameter
The BCE weight is available for certain loss functions; its meaning depends on the selected Loss function (a sketch of these weighted combinations follows the Focal weight setting below).
- BCEDice
- If the selected Loss function is BCEDice, the BCE weight enters the loss as follows:
BCEDice loss = BCE loss * BCE weight + Dice loss * Dice weight
- BCELovasz
- If the selected Loss function is BCELovasz, the BCE weight enters the loss as follows:
BCELovasz loss = BCE loss * BCE weight + Lovasz loss * Lovasz weight
Dice weight
Grid search hyperparameter
The Dice weight is available for certain loss functions; its meaning depends on the selected Loss function.
- BCEDice
- If the selected Loss function is BCEDice, the Dice weight enters the loss as follows:
BCEDice loss = BCE loss * BCE weight + Dice loss * Dice weight
- FocalDice
- If the selected Loss function is FocalDice, the Dice weight enters the loss as follows:
FocalDice loss = Focal loss * Focal weight + Dice loss * Dice weight
Lovasz weight
Grid search hyperparameter
The Lovasz weight is available for certain loss functions; its meaning depends on the selected Loss function.
- BCELovasz
- If the selected Loss function is BCELovasz, the Lovasz weight enters the loss as follows:
BCELovasz loss = BCE loss * BCE weight + Lovasz loss * Lovasz weight
Focal weight
Grid search hyperparameter
The Focal weight is available for certain loss functions; its meaning depends on the selected Loss function.
- FocalDice
- If the selected Loss function is FocalDice, the Focal weight enters the loss as follows:
FocalDice loss = Focal loss * Focal weight + Dice loss * Dice weight
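To make the weighted combinations above concrete, here is a minimal sketch of a Dice loss and the BCEDice combination (weights 2 and 1, matching the BCEDice description); the other combinations follow the same pattern. This is illustrative only, not H2O Hydrogen Torch's internal implementation.

```python
import torch
import torch.nn.functional as F

def dice_loss(logits: torch.Tensor, targets: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """A simple batch-global Dice loss over binary masks."""
    probs = torch.sigmoid(logits)
    intersection = (probs * targets).sum()
    union = probs.sum() + targets.sum()
    return 1 - (2 * intersection + eps) / (union + eps)

def bce_dice_loss(logits, targets, bce_weight=2.0, dice_weight=1.0):
    # BCEDice loss = BCE loss * BCE weight + Dice loss * Dice weight
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    return bce_weight * bce + dice_weight * dice_loss(logits, targets)
```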
Optimizer
Grid search hyperparameter
Defines the algorithm or method (optimizer) to use for model training. The selected algorithm or method defines how the model should change the attributes of the neural network, such as weights and learning rate. Optimizers solve optimization problems and make more accurate updates to attributes to reduce learning losses.
Options (all supported problem types):
- Adadelta
- To learn about Adadelta, see ADADELTA: An Adaptive Learning Rate Method.
- Adam
- To learn about Adam, see Adam: A Method for Stochastic Optimization.
- AdamW
- To learn about AdamW, see Decoupled Weight Decay Regularization.
- RMSprop
- To learn about RMSprop, see Neural Networks for Machine Learning.
- SGD
- H2O Hydrogen Torch uses a stochastic gradient descent optimizer.
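For reference, all of the listed options map to standard PyTorch optimizers; in the sketch below, the model and learning rate are placeholders.

```python
import torch

optimizers = {
    "Adadelta": torch.optim.Adadelta,
    "Adam": torch.optim.Adam,
    "AdamW": torch.optim.AdamW,
    "RMSprop": torch.optim.RMSprop,
    "SGD": torch.optim.SGD,
}

model = torch.nn.Linear(16, 1)  # placeholder model
optimizer = optimizers["AdamW"](model.parameters(), lr=1e-4)
```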
Learning rate
Grid search hyperparameter
Defines the learning rate H2O Hydrogen Torch uses when training the model, specifically when updating the neural network's weights. The learning rate is the speed at which the model updates its weights after processing each mini-batch of data.
- Learning rate is an important setting to tune as it balances under- and overfitting.
- The number of epochs highly impacts the optimal value of the learning rate.
Differential learning rate layers
Defines the learning rate to apply to certain layers of a model. H2O Hydrogen Torch applies the regular learning rate to layers without a specified learning rate.
Options (Image regression | Image classification | Text regression | Text classification | Text token classification | Audio regression | Audio classification):
- Backbone
- H2O Hydrogen Torch applies a different learning rate to a body of the neural network architecture.
- Head
- H2O Hydrogen Torch applies a different learning rate to a head of the neural network architecture.
Options (Image object detection):
The options for an image object detection experiment differ based on the selected Model type setting.
If you select EfficientDet as the experiment's Model type, the following options are available:
- Backbone
- H2O Hydrogen Torch applies a different learning rate to a body of the EfficientDet architecture.
- FPN
- H2O Hydrogen Torch applies a different learning rate to a Feature Pyramid Network (FPN) block of the EfficientDet architecture.
- class_net
- H2O Hydrogen Torch applies a different learning rate to a classification head of the EfficientDet architecture.
- box_net
- H2O Hydrogen Torch applies a different learning rate to a box regression head of the EfficientDet architecture.
If you select Faster R-CNN as the experiment's Model type, the following options are available:
- Body
- H2O Hydrogen Torch applies a different learning rate to a body of the Faster R-CNN architecture.
- FPN
- H2O Hydrogen Torch applies a different learning rate to a Feature Pyramid Network (FPN) block in the Faster R-CNN architecture.
- RPN
- H2O Hydrogen Torch applies a different learning rate to a Region Proposal block of the Faster R-CNN architecture.
- ROI heads
- H2O Hydrogen Torch applies a different learning rate to the region of interest (ROI) proposal heads of the Faster R-CNN architecture.
If you select FCOS as the experiment's Model type, the following options are available:
- Body
- H2O Hydrogen Torch applies a different learning rate to a body of the FCOS architecture.
- FPN
- H2O Hydrogen Torch applies a different learning rate to a Feature Pyramid Network (FPN) block of the FCOS architecture.
- classification_head
- H2O Hydrogen Torch applies a different learning rate to the classification head of the FCOS architecture.
- regression_head
- H2O Hydrogen Torch applies a different learning rate to a box regression head of the FCOS architecture.
Options (Image semantic segmentation):
- Encoder
- H2O Hydrogen Torch applies a different learning rate to the encoder of the neural network architecture.
- Decoder
- H2O Hydrogen Torch applies a different learning rate to the decoder of the neural network architecture.
- Segmentation head
- H2O Hydrogen Torch applies a different learning rate to the head of the neural network architecture.
Options (3D image semantic segmentation | Text sequence to sequence):
- Encoder
- H2O Hydrogen Torch applies a different learning rate to the encoder of the neural network architecture.
- Decoder
- H2O Hydrogen Torch applies a different learning rate to the decoder of the neural network architecture.
Options (Image instance segmentation):
- Encoder
- H2O Hydrogen Torch applies a different learning rate to the encoder of the neural network architecture.
- Decoder
- H2O Hydrogen Torch applies a different learning rate to the decoder of the neural network architecture.
- Segmentation head
- H2O Hydrogen Torch applies a different learning rate to the head of the neural network architecture.
Options (Image metric learning | Text metric learning):
- Backbone
- H2O Hydrogen Torch applies a different learning rate to a body of the neural network architecture.
- Neck
- H2O Hydrogen Torch applies a different learning rate to a neck of the neural network architecture.
- Loss
- H2O Hydrogen Torch applies a different learning rate to an ArcFace block of the neural network architecture.
Options (Text regression):
- Backbone
- H2O Hydrogen Torch applies a different learning rate to a body of the neural network architecture.
Options (Text span prediction):
- qa_outputs
- H2O Hydrogen Torch applies a different learning rate to the span prediction (qa_outputs) head of the neural network architecture.
A common strategy is to apply a lower learning rate to the backbone of a model for better convergence and training stability.
Different layers are available for different problem types.
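In plain PyTorch, differential learning rates correspond to optimizer parameter groups. Here is a sketch for the image semantic segmentation options above, assuming an smp-style model as in the earlier architecture example; learning-rate values are placeholders.

```python
import torch
import segmentation_models_pytorch as smp

model = smp.Unet(encoder_name="resnet34", classes=1)
optimizer = torch.optim.AdamW([
    {"params": model.encoder.parameters(), "lr": 1e-5},            # Encoder: lower LR
    {"params": model.decoder.parameters(), "lr": 1e-4},            # Decoder
    {"params": model.segmentation_head.parameters(), "lr": 1e-4},  # Segmentation head
])
```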
Batch size
Grid search hyperparameter
Defines the number of training examples a mini-batch uses during an iteration of the training model to estimate the error gradient before updating the model weights. This setting defines the batch size per single GPU.
During model training, the training data is packed into mini-batches of a fixed size.
Automatically adjust batch size
If this setting is turned On, H2O Hydrogen Torch checks whether the specified Batch size fits into GPU memory. If a GPU out-of-memory (OOM) error occurs, H2O Hydrogen Torch repeatedly halves the Batch size until it fits into GPU memory or equals 1 (see the sketch below).
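A simplified sketch of this halving behaviour (illustrative only, not the tool's actual implementation):

```python
def find_fitting_batch_size(try_train_step, batch_size: int) -> int:
    """Halve the batch size on CUDA OOM errors until training fits or size is 1."""
    while True:
        try:
            try_train_step(batch_size)  # hypothetical single training step
            return batch_size
        except RuntimeError as err:
            if "out of memory" in str(err) and batch_size > 1:
                batch_size //= 2  # decrease by a factor of 2 and retry
            else:
                raise
```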
Drop last batch
H2O Hydrogen Torch drops the last incomplete batch during model training when this setting is turned On.
H2O Hydrogen Torch groups the train data into mini-batches of equal size during the training process, but the last batch can have fewer records than the others. Not dropping the last batch can lead to a less robust gradient estimate and a more volatile training step.
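The equivalent behaviour in plain PyTorch, with placeholder data:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(100, 3, 64, 64))  # placeholder dataset
loader = DataLoader(dataset, batch_size=32, shuffle=True, drop_last=True)
# 100 samples with batch size 32 -> 3 full batches; the last 4 samples are dropped
```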
Epochs
Grid search hyperparameter
Defines the number of epochs to train the model. In other words, it specifies the number of times the learning algorithm goes through the entire training dataset.
- The Epochs setting is an important setting to tune because it balances under- and overfitting.
- The learning rate highly impacts the optimal value of the epochs.
- For the following supported problem types, H2O Hydrogen Torch lets you deploy a pretrained model trained for zero epochs (H2O Hydrogen Torch does not train the model, and the pretrained model (experiment) can be deployed as-is):
- Speech recognition
- Text sequence to sequence
- Text span prediction
Schedule
Grid search hyperparameter
Defines the learning rate schedule H2O Hydrogen Torch utilizes during model training. Specifying a learning rate schedule prevents the learning rate from staying the same. Instead, a learning rate schedule causes the learning rate to change over iterations, typically decreasing the learning rate to achieve a better model performance and training convergence.
Options (all supported problem types):
- Constant
- H2O Hydrogen Torch applies a constant learning rate during the training process.
- Cosine
- H2O Hydrogen Torch applies a cosine learning rate that follows the values of the cosine function.
- Linear
- H2O Hydrogen Torch applies a linear learning rate that decreases the learning rate linearly.
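For reference, a cosine schedule in plain PyTorch looks like the sketch below (model, learning rate, and step count are placeholders); torch.optim.lr_scheduler.LinearLR provides a comparable linearly decreasing schedule for the Linear option.

```python
import torch

model = torch.nn.Linear(16, 1)  # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

total_steps = 1000  # placeholder
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_steps)

for _ in range(total_steps):
    optimizer.step()   # one training iteration (gradients omitted in this sketch)
    scheduler.step()   # move the learning rate along the cosine curve
```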