Google Cloud Storage Setup

This section provides instructions for configuring Driverless AI to work with Google Cloud Storage. This setup requires you to enable authentication. If you enable GCS or GBP connectors, those file systems will be available in the UI, but you will not be able to use those connectors without authentication.

In order to enable the GCS data connector with authentication, you must:

  1. Obtain a JSON authentication file from GCP.

  2. Mount the JSON file to the Docker instance.

  3. Specify the path to the /json_auth_file.json in the gcs_path_to_service_account_json configuration option.

Note: The account JSON includes authentications as provided by the system administrator. You can be provided a JSON file that contains both Google Cloud Storage and Google BigQuery authentications, just one or the other, or none at all.

Description of Configuration Attributes

  • gcs_path_to_service_account_json: Specifies the path to the /json_auth_file.json file.

  • gcs_init_path: Specifies the starting GCS path displayed in the UI of the GCS browser.

GCS with Authentication

This example enables the GCS data connector with authentication by passing the JSON authentication file. This assumes that the JSON file contains Google Cloud Storage authentications.

  1. Export the Driverless AI config.toml file or add it to ~/.bashrc. For example:

# DEB and RPM
export DRIVERLESS_AI_CONFIG_FILE="/etc/dai/config.toml"

# TAR SH
export DRIVERLESS_AI_CONFIG_FILE="/path/to/your/unpacked/dai/directory/config.toml"
  1. Specify the following configuration options in the config.toml file.

# File System Support
# upload : standard upload feature
# file : local file system/server file system
# hdfs : Hadoop file system, remember to configure the HDFS config folder path and keytab below
# dtap : Blue Data Tap file system, remember to configure the DTap section below
# s3 : Amazon S3, optionally configure secret and access key below
# gcs: Google Cloud Storage, remember to configure gcs_path_to_service_account_json below
# gbq: Google Big Query, remember to configure gcs_path_to_service_account_json below
# minio: Minio Cloud Storage, remember to configure secret and access key below
# snow: Snowflake Data Warehouse, remember to configure Snowflake credentials below (account name, username, password)
# kdb: KDB+ Time Series Database, remember to configure KDB credentials below (hostname and port, optionally: username, password, classpath, and jvm_args)
# azrbs: Azure Blob Storage, remember to configure Azure credentials below (account name, account key)
# jdbc: JDBC Connector, remember to configure JDBC below. (jdbc_app_configs)
# hive: Hive Connector, remember to configure Hive below. (hive_app_configs)
# recipe_url: load custom recipe from URL
# recipe_file: load custom recipe from local file system
enabled_file_systems = "file, gcs"

# GCS Connector credentials
# example (suggested) -- "/licenses/my_service_account_json.json"
gcs_path_to_service_account_json = "/service_account_json.json"
  1. Save the changes when you are done, then stop/restart Driverless AI.