Image configuration
enable_tensorflow_image
Enable Image Transformer for processing of image data (String) (Expert Setting)
Default value 'auto'
Whether to use pretrained deep learning models to process image data as part of the feature engineering pipeline. A column of URIs to images (jpg, png, etc.) is converted to a numeric representation using ImageNet-pretrained deep learning models. If no GPUs are found, this must be set to 'on' to enable the Image Transformer.
tensorflow_image_pretrained_models
Supported ImageNet pretrained architectures for Image Transformer (List) (Expert Setting)
Default value ['xception']
Supported ImageNet pretrained architectures for Image Transformer. Non-default architectures require internet access to download pretrained models from H2O S3 buckets. To get all models, download http://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/pretrained/dai_image_models_1_11.zip and unzip it inside tensorflow_image_pretrained_models_dir.
tensorflow_image_vectorization_output_dimension
Dimensionality of feature space created by Image Transformer (List) (Expert Setting)
Default value [100]
Dimensionality of feature (embedding) space created by Image Transformer. If more than one is selected, multiple transformers can be active at the same time.
tensorflow_image_fine_tune
Enable fine-tuning of pretrained models used for Image Transformer (Boolean) (Expert Setting)
Default value False
Enable fine-tuning of the ImageNet pretrained models used for the Image Transformer. Enabling this will slow down training, but should increase accuracy.
tensorflow_image_fine_tuning_num_epochs
Number of epochs for fine-tuning used for Image Transformer (Number) (Expert Setting)
Default value 2
Number of epochs for fine-tuning of ImageNet pretrained models used for the Image Transformer.
tensorflow_image_augmentations
List of augmentations for fine-tuning used for Image Transformer (List) (Expert Setting)
Default value ['HorizontalFlip']
The list of possible image augmentations to apply while fine-tuning the ImageNet pretrained models used for the Image Transformer. Details about individual augmentations can be found at https://albumentations.ai/docs/.
tensorflow_image_batch_size
Batch size for Image Transformer. Automatic: -1 (Number) (Expert Setting)
Default value -1
Batch size for Image Transformer. Larger architectures and larger batch sizes will use more memory.
tensorflow_image_pretrained_models_dir
Path to pretrained Image models. It is used to load the pretrained models if there is no Internet access. (String) (Expert Setting)
Default value './pretrained/image/'
Path to pretrained Image models. To get all models, download http://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/pretrained/dai_image_models_1_11.zip, then extract it in a directory on the instance where Driverless AI is installed.
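The download-and-extract step described above can be sketched in Python using only the standard library. The function names download_archive and extract_models are hypothetical helpers for illustration, and the internal layout of the archive is not specified in this section, so the sketch simply extracts everything into the target directory.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL as given in this section.
MODELS_URL = (
    "http://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/"
    "pretrained/dai_image_models_1_11.zip"
)


def download_archive(url: str, zip_path: str) -> Path:
    """Download the archive to zip_path (requires network access)."""
    target = Path(zip_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, str(target))
    return target


def extract_models(zip_path: str, models_dir: str) -> list:
    """Unzip the archive into models_dir and return the extracted member names."""
    out = Path(models_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out)
        return archive.namelist()


# Typical use on a machine with internet access (not executed here):
#   zip_file = download_archive(MODELS_URL, "/tmp/dai_image_models_1_11.zip")
#   extract_models(str(zip_file), "./pretrained/image/")
```

If the Driverless AI instance itself has no internet access, the archive can be downloaded elsewhere, copied over, and only extract_models run on the instance.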
image_download_timeout
Image download timeout in seconds (Number) (Expert Setting)
Default value 60
Maximum number of seconds to wait for an image download when images are provided by URL.
string_col_as_image_max_missing_fraction
Max allowed fraction of missing values for image column (Float) (Expert Setting)
Default value 0.1
Maximum fraction of missing elements in a string column for it to be considered as possible image paths (URIs)
string_col_as_image_min_valid_types_fraction
Min. fraction of images that need to be of valid types for image column to be used (Float) (Expert Setting)
Default value 0.8
Fraction of (unique) image URIs that need to have valid endings (as defined by string_col_as_image_valid_types) for a string column to be considered as image data
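The two thresholds above (string_col_as_image_max_missing_fraction and string_col_as_image_min_valid_types_fraction) can be read together as a simple column check. The following Python sketch illustrates that reading; it is not the Driverless AI implementation. The function name looks_like_image_column and the extension list VALID_TYPES are assumptions made for illustration, since the actual list is defined by string_col_as_image_valid_types.

```python
from typing import Optional, Sequence

# Assumption: illustrative extensions only. The real list is defined by
# string_col_as_image_valid_types.
VALID_TYPES = (".jpg", ".jpeg", ".png")


def looks_like_image_column(
    values: Sequence[Optional[str]],
    max_missing_fraction: float = 0.1,
    min_valid_types_fraction: float = 0.8,
    valid_types: Sequence[str] = VALID_TYPES,
) -> bool:
    """Return True if a string column passes both image-column thresholds."""
    if not values:
        return False

    # Threshold 1: fraction of missing entries must not exceed the maximum.
    missing = sum(1 for v in values if v is None or v == "")
    if missing / len(values) > max_missing_fraction:
        return False

    # Threshold 2: enough of the unique URIs must end in a valid type.
    unique = {v for v in values if v}
    if not unique:
        return False
    endings = tuple(t.lower() for t in valid_types)
    valid = sum(1 for v in unique if v.lower().endswith(endings))
    return valid / len(unique) >= min_valid_types_fraction


column = ["a.jpg", "b.png", "c.jpg", "d.jpg", "notes.txt"]
print(looks_like_image_column(column))  # 4 of 5 unique values are valid
```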
tensorflow_image_use_gpu
Enable GPU(s) for faster transformations of Image Transformer. (Boolean) (Expert Setting)
Default value True
Whether to use GPU(s), if available, to transform images into embeddings with Image Transformer. Can lead to significant speedups.
params_image_auto_search_space
Search parameter overrides for image auto (Dict) (Expert Setting)
Default value {}
Nominally, the time dial controls the search space, with higher time trying more options, but any keys present in this dictionary will override the automatic choices.
e.g. params_image_auto_search_space="{'augmentation': ['safe'], 'crop_strategy': ['Resize'], 'optimizer': ['AdamW'], 'dropout': [0.1], 'epochs_per_stage': [5], 'warmup_epochs': [0], 'mixup': [0.0], 'cutmix': [0.0], 'global_pool': ['avg'], 'learning_rate': [3e-4]}"
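An override string like the one above can be assembled programmatically instead of being typed by hand. The Python sketch below is illustrative: render_search_space is a hypothetical helper, and the set of accepted keys is copied from the options listed in this section rather than from the product itself.

```python
import ast

# Option names taken from the list in this section.
KNOWN_KEYS = {
    "augmentation", "crop_strategy", "dropout", "global_pool", "mixup",
    "cutmix", "epochs_per_stage", "warmup_epochs", "optimizer", "learning_rate",
}


def render_search_space(overrides: dict) -> str:
    """Return a params_image_auto_search_space line for the given overrides."""
    for key, choices in overrides.items():
        if key not in KNOWN_KEYS:
            raise ValueError(f"unknown search-space key: {key!r}")
        if not isinstance(choices, list) or not choices:
            raise ValueError(f"{key!r} must map to a non-empty list")
    # repr() of a dict quotes plain strings with single quotes, so the value
    # can be wrapped in double quotes as in the example above.
    return f'params_image_auto_search_space="{overrides!r}"'


line = render_search_space({"optimizer": ["AdamW"], "dropout": [0.1, 0.3]})
print(line)

# Round-trip check: the quoted value parses back to the same dictionary.
value = line.split("=", 1)[1][1:-1]
print(ast.literal_eval(value) == {"optimizer": ["AdamW"], "dropout": [0.1, 0.3]})
```

The sketch assumes that option values contain no quote characters; values with embedded quotes would need explicit escaping.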
Options, e.g. used for time>=8
# Overfit Protection Options:
'augmentation': ["safe", "semi_safe", "hard"]
'crop_strategy': ["Resize", "RandomResizedCropSoft", "RandomResizedCropHard"]
'dropout': [0.1, 0.3, 0.5]
# Global Pool Options:
avgmax: sum of AVG and MAX poolings
catavgmax: concatenation of AVG and MAX poolings
See https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/layers/adaptive_avgmax_pool.py
'global_pool': ['avg', 'avgmax', 'catavgmax']
# Regression: No MixUp and CutMix:
'mixup': [0.0]
'cutmix': [0.0]
# Classification: Beta distribution coeff to generate weights for MixUp:
'mixup': [0.0, 0.4, 1.0, 3.0]
'cutmix': [0.0, 0.4, 1.0, 3.0]
# Optimization Options:
'epochs_per_stage': [5, 10, 15]
# from 40 to 135 epochs
'warmup_epochs': [0, 0.5, 1]
'optimizer': ["AdamW", "SGD"]
'learning_rate': [1e-3, 3e-4, 1e-4]
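To make the global_pool options listed above concrete, the following pure-Python sketch applies each of them to a tiny feature map laid out as channels x height x width. It follows the descriptions given in this section literally (avgmax as a sum, catavgmax as a concatenation). The function names are illustrative, and the linked library implementation may scale the combined result differently (for example by 0.5), so check the source for exact behavior.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # channels x height x width


def avg_pool(fmap: FeatureMap) -> List[float]:
    """Global average pooling: one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def max_pool(fmap: FeatureMap) -> List[float]:
    """Global max pooling: one maximum value per channel."""
    return [max(max(row) for row in ch) for ch in fmap]


def avgmax_pool(fmap: FeatureMap) -> List[float]:
    """Per-channel sum of the average and max poolings (as described here)."""
    return [a + m for a, m in zip(avg_pool(fmap), max_pool(fmap))]


def catavgmax_pool(fmap: FeatureMap) -> List[float]:
    """Concatenation of average and max poolings: output has 2x the channels."""
    return avg_pool(fmap) + max_pool(fmap)


fmap = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0
    [[0.0, 0.0], [0.0, 8.0]],  # channel 1
]
print(avg_pool(fmap))        # [2.5, 2.0]
print(avgmax_pool(fmap))     # [6.5, 10.0]
print(catavgmax_pool(fmap))  # [2.5, 2.0, 4.0, 8.0]
```

Note that catavgmax doubles the width of the pooled feature vector, whereas avg and avgmax keep one value per channel.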
image_auto_arch
Architectures for image auto (List) (Expert Setting)
Default value []
Nominally, the accuracy dial controls the architectures considered if this is left empty, but one can choose specific ones. The options in the list are ordered by complexity.
image_auto_min_shape
Minimum image size (Number) (Expert Setting)
Default value 64
Any images smaller are upscaled to the minimum. Default is 64, but can be as small as 32 given the pooling layers used.
image_auto_num_final_models
Number of models in final ensemble (Number) (Expert Setting)
Default value 0
0 means automatic based upon time dial of min(1, time//2).
image_auto_num_models
Number of models in search space (Number) (Expert Setting)
Default value 0
0 means automatic based upon time dial of max(4 * (time - 1), 2).
image_auto_num_stages
Number of stages for hyperparameter search (Number) (Expert Setting)
Default value 0
0 means automatic based upon time dial of time + 1 if time < 6 else time - 1.
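The automatic values for image_auto_num_final_models, image_auto_num_models, and image_auto_num_stages are all given as formulas of the time dial. The Python sketch below transcribes those formulas exactly as written in this section so they can be tabulated; the function names are hypothetical, and the sketch makes no claim about how the product evaluates them internally.

```python
def auto_num_final_models(time: int) -> int:
    """Formula quoted for image_auto_num_final_models when set to 0."""
    return min(1, time // 2)


def auto_num_models(time: int) -> int:
    """Formula quoted for image_auto_num_models when set to 0."""
    return max(4 * (time - 1), 2)


def auto_num_stages(time: int) -> int:
    """Formula quoted for image_auto_num_stages when set to 0."""
    return time + 1 if time < 6 else time - 1


print("time  final_models  models  stages")
for time in (1, 5, 6, 10):
    print(
        f"{time:>4}  {auto_num_final_models(time):>12}  "
        f"{auto_num_models(time):>6}  {auto_num_stages(time):>6}"
    )
```

Read literally, min(1, time//2) never exceeds 1, so a larger final ensemble would have to be requested explicitly through image_auto_num_final_models.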
image_auto_iterations
Number of iterations for successive halving (Number) (Expert Setting)
Default value 0
0 means automatic based upon time dial or number of models and stages set by image_auto_num_models and image_auto_num_stages.
image_auto_shape_factor
Image downscale ratio to use for training (Float) (Expert Setting)
Default value 0.0
0.0 means automatic based upon the current stage, where stage 0 uses half, stage 1 uses 3/4, and stage 2 uses the full image.
One can pass 1.0 to override and always use the full image. 0.5 would mean use half.
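The stage-dependent schedule described above can be expressed as a small lookup. This Python sketch is illustrative only: the function names are hypothetical, the treatment of stages beyond 2 (full image) is an assumption because this section only describes stages 0 to 2, and the rounding in scaled_size is likewise assumed.

```python
def effective_shape_factor(image_auto_shape_factor: float, stage: int) -> float:
    """Return the downscale ratio used for a given stage."""
    if image_auto_shape_factor != 0.0:
        # An explicit value overrides the automatic schedule for every stage.
        return image_auto_shape_factor
    # Automatic schedule from this section: stage 0 -> 1/2, stage 1 -> 3/4,
    # stage 2 -> full image. Assumption: later stages also use the full image.
    schedule = {0: 0.5, 1: 0.75}
    return schedule.get(stage, 1.0)


def scaled_size(width: int, height: int, factor: float) -> tuple:
    """Apply the ratio to both sides (rounding is an assumption)."""
    return max(1, round(width * factor)), max(1, round(height * factor))


for stage in range(3):
    factor = effective_shape_factor(0.0, stage)
    print(stage, factor, scaled_size(224, 224, factor))
```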
max_image_auto_ddp_cores
Maximum number of cores to use for image auto model parallel data management (Number) (Expert Setting)
Default value 10
Controls the maximum number of cores to use for image auto model parallel data management. Setting this to 0 disables mp. See https://pytorch-lightning.readthedocs.io/en/latest/guides/speed.html