Image Processing in Driverless AI

Driverless AI can build models directly on digital images. This section describes how to upload image datasets, the two approaches available for modeling them, and the scoring options for deployment.

Uploading Data for Image Processing

Driverless AI supports multiple methods for uploading image datasets:

  • Archive with images in directories for each class. Labels for each class are automatically created based on the directory hierarchy.

  • Archive with images and a CSV file that contains at least one column with relative image paths and a target column (best method for regression)

  • CSV file with local paths to the images on the disk

  • CSV file with remote URLs to the images
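The second upload method above (an archive with images plus a CSV of relative paths and a target column) can be prepared from a class-per-directory layout. The following is a minimal sketch using only the Python standard library. The column names image_path and label, the default file name labels.csv, and the set of image extensions are assumptions made for illustration; they are not names required by Driverless AI.

```python
import csv
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def build_image_csv(root, csv_name="labels.csv"):
    """Write a CSV of (relative image path, class label) rows for images
    stored as <root>/<class_name>/<image file>."""
    root = Path(root)
    rows = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image in sorted(class_dir.rglob("*")):
            if image.is_file() and image.suffix.lower() in IMAGE_SUFFIXES:
                # Store paths relative to the archive root, with forward slashes.
                rows.append((image.relative_to(root).as_posix(), class_dir.name))
    csv_path = root / csv_name
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["image_path", "label"])  # hypothetical column names
        writer.writerows(rows)
    return csv_path
```

Calling build_image_csv("my_images") writes my_images/labels.csv, after which the my_images directory can be archived and uploaded. For a regression target, the label column would hold numeric values taken from another source instead of directory names.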

Modeling Images

Driverless AI features two different approaches to modeling images.

Embeddings Transformer (Image Vectorizer)

The Image Vectorizer transformer utilizes pre-trained ImageNet models to convert a column with an image path or URI to an embeddings (vector) representation that is derived from the last global average pooling layer of the model. The resulting vector is then used for modeling in Driverless AI.
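To illustrate what a global average pooling layer produces, the sketch below averages a small feature map over its spatial positions, leaving one value per channel. It is a pure-Python toy with made-up numbers, not the Driverless AI implementation; the function name global_average_pool and the [height][width][channels] layout are assumptions chosen for readability.

```python
def global_average_pool(feature_map):
    """Average a feature map shaped [height][width][channels] over its
    spatial positions, returning one value per channel."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for cell in row:
            for c, value in enumerate(cell):
                totals[c] += value
    return [total / (height * width) for total in totals]


# Toy 2x2 feature map with 3 channels (made-up numbers, not real activations).
toy_map = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[5.0, 4.0, 2.0], [7.0, 0.0, 2.0]],
]
embedding = global_average_pool(toy_map)
print(embedding)  # [4.0, 1.0, 2.0]
```

In a real pre-trained network the final feature map has many more channels, so the resulting vector is correspondingly longer, but the pooling step works the same way.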

There are several options in the Expert Settings panel that allow you to configure the Image Vectorizer transformer. This panel is available from within the experiment page above the Scorer knob. Refer to Image Settings for more information on these options.

Notes:

  • This modeling approach supports classification and regression experiments.

  • This modeling approach supports the use of mixed data types (any number of image, text, numeric, or categorical columns).

Automatic Image Model

Automatic Image Model is an AutoML model that accepts only a single image column as its input feature, together with a label as the target. This model automatically selects hyperparameters such as learning rate, optimizer, batch size, and image input size. It also automates the training process by selecting the number of epochs, cropping strategy, augmentations, and learning rate scheduler.

Automatic Image Model starts the training process from pre-trained ImageNet models. The candidate architectures include (SE)-ResNe(X)ts, DenseNets, EfficientNets, and Inceptions, among others.

Unique insights that provide information and sample images for the current best individual model are available for Automatic Image Model. To view these insights, click on the Insights option while an experiment is running or after an experiment is complete. Refer to ImageAuto Model Insights for more information.

Each individual model score (together with the neural network architecture name) is available in the Iteration Data panel. The last point in the Iteration Data panel is always called ENSEMBLE_TTA. This indicates that the final model ensembles multiple individual models and applies Test Time Augmentations (TTA).
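The general idea behind combining an ensemble with Test Time Augmentations can be sketched as follows: each model scores several augmented views of the same image, and the predictions are averaged. This is only an illustration under stated assumptions. The source does not say which augmentations or ensemble weights Driverless AI uses, so the unweighted mean, the horizontal flip, and the toy models model_a and model_b below are all hypothetical.

```python
def identity(image):
    """Return the image unchanged."""
    return image


def horizontal_flip(image):
    """Mirror a 2D image (a list of rows) left to right."""
    return [list(reversed(row)) for row in image]


def predict_ensemble_tta(models, augmentations, image):
    """Average class probabilities over every (model, augmentation) pair."""
    predictions = [
        model(augment(image)) for model in models for augment in augmentations
    ]
    n_classes = len(predictions[0])
    return [
        sum(p[k] for p in predictions) / len(predictions) for k in range(n_classes)
    ]


def model_a(image):
    # Toy stand-in for a trained network: the "probability" of class 1
    # is the mean of the first column of pixel values.
    p = sum(row[0] for row in image) / len(image)
    return [1.0 - p, p]


def model_b(image):
    # Toy stand-in: the "probability" of class 1 is the mean of all pixels.
    p = sum(sum(row) for row in image) / sum(len(row) for row in image)
    return [1.0 - p, p]


image = [[0.0, 1.0], [0.0, 1.0]]
probabilities = predict_ensemble_tta(
    [model_a, model_b], [identity, horizontal_flip], image
)
print(probabilities)  # [0.5, 0.5]
```

Here model_a gives opposite answers for the original and flipped views, and averaging over both views and both models smooths that sensitivity out.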

Enabling Automatic Image Model

To enable Automatic Image Model, navigate to the Pipeline Building Recipe expert setting and select the image_model option:

[Figure: Enable Automatic Image Model]

After confirming your selection, click Save. The experiment preview section updates to include information about Automatic Image Model:

[Figure: Automatic Image Model Preview]

Notes:

  • This modeling approach only supports a single image column as an input.

  • This modeling approach does not support any transformers.

  • This modeling approach supports classification and regression experiments.

  • This modeling approach does not support the use of mixed data types because of its limitation on input features.

  • This modeling approach does not use Genetic Algorithm (GA).

  • The use of one or more GPUs is strongly recommended for this modeling approach.

  • If an internet connection is available, pre-trained ImageNet weights are downloaded automatically. If an internet connection is not available, download the weights from http://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/pretrained/autoimage_weights.zip and extract them into ./tmp or into the directory specified by tensorflow_image_pretrained_models_dir in the config.toml file.
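For an offline installation, the download-and-extract step from the last note can be scripted. The sketch below uses only the Python standard library. The function names download_weights and extract_weights and the variable WEIGHTS_URL are hypothetical helpers written for this example; only the URL and the target locations come from the note above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the note above.
WEIGHTS_URL = (
    "http://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/"
    "pretrained/autoimage_weights.zip"
)


def download_weights(url, zip_path):
    """Download the weights archive; run this on a machine with internet access."""
    urllib.request.urlretrieve(url, zip_path)
    return Path(zip_path)


def extract_weights(zip_path, target_dir):
    """Extract the archive into target_dir and return the archived file names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()


# Example usage (not executed here):
#   archive = download_weights(WEIGHTS_URL, "autoimage_weights.zip")
#   extract_weights(archive, "./tmp")
```

The archive can be downloaded on a connected machine, copied to the Driverless AI host, and extracted there into ./tmp or into the directory set by tensorflow_image_pretrained_models_dir.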

Deploying an Image Model

Python scoring and C++ MOJO scoring are both supported for the Image Vectorizer transformer. Currently, only Python scoring is supported for Automatic Image Model.