Google Cloud Storage Setup

Driverless AI lets you explore Google Cloud Storage (GCS) data sources from within the Driverless AI application. This section describes how to configure Driverless AI to work with GCS. The setup requires authentication: if you enable the GCS or GBQ connectors without it, those file systems appear in the UI, but you cannot use them.

To enable the GCS data connector with authentication, you must:

  1. Obtain a JSON authentication file from GCP.

  2. Mount the JSON file to the Docker instance.

  3. Specify the path to the mounted JSON file (for example, /json_auth_file.json) in the gcs_path_to_service_account_json config option.
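
Before mounting the file, it can help to confirm that it looks like a service account key. The following is a minimal sketch, not part of Driverless AI: the function name check_sa_json is hypothetical, and the check relies only on the "type": "service_account" field that service account key files exported from GCP contain. It does not verify that the credentials are valid or that they grant access to GCS.

```shell
#!/bin/sh
# Hypothetical helper (not part of Driverless AI): a quick pre-flight check
# on the JSON authentication file before mounting it into the container.
check_sa_json() {
    key_file="$1"
    # The file must exist and be readable by the user who runs Docker.
    if [ ! -r "$key_file" ]; then
        echo "missing"
        return 1
    fi
    # Service account key files contain "type": "service_account";
    # tolerate optional whitespace around the colon.
    if grep -Eq '"type"[[:space:]]*:[[:space:]]*"service_account"' "$key_file"; then
        echo "ok"
        return 0
    fi
    echo "not-a-service-account-key"
    return 1
}
```

For example, check_sa_json "$(pwd)/service_account_json.json" prints ok when the file is readable and has the expected type field.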

Notes:

  • The service account JSON contains whatever authentications your system administrator provides. The file may cover both Google Cloud Storage and Google BigQuery, only one of them, or neither.

  • Depending on your installed Docker version, use either the docker run --runtime=nvidia (>= Docker 19.03) or nvidia-docker (< Docker 19.03) command when starting the Driverless AI Docker image. Use docker version to check which version of Docker you are using.
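
The version rule in the note above can be sketched as a small shell function. This is an illustration only: the function name docker_launch_cmd is hypothetical, and it assumes the version string starts with numeric major and minor components (for example, 19.03.5).

```shell
#!/bin/sh
# Hypothetical helper: choose the launch command from a Docker version string,
# following the 19.03 threshold stated in the note above.
docker_launch_cmd() {
    version="$1"
    major=${version%%.*}
    case "$version" in
        *.*) rest=${version#*.}; minor=${rest%%.*} ;;
        *)   minor=0 ;;          # no minor component given
    esac
    minor=${minor#0}             # "03" -> "3"
    minor=${minor:-0}            # "0"  -> "" -> "0"
    if [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "$minor" -ge 3 ]; }; then
        echo "docker run --runtime=nvidia"
    else
        echo "nvidia-docker run"
    fi
}
```

On a host with Docker installed, the version can be supplied with docker_launch_cmd "$(docker version --format '{{.Server.Version}}')".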

Description of Configuration Attributes

  • GCS Connector service account JSON (gcs_service_account_json): Specify your GCS Connector service account credentials in JSON. Note that this configuration option takes precedence over the gcs_path_to_service_account_json configuration option. This configuration option can be accessed from the Expert Settings panel in the Connectors > GBQ tab.

  • gcs_path_to_service_account_json: Specifies the path to the JSON authentication file (for example, /json_auth_file.json).

  • GCS Connector impersonated account (gbq_access_impersonated_account): Specifies which user to impersonate if the main configured service account allows it. This configuration option can be accessed from the Expert Settings panel in the Connectors > GBQ tab.

  • gcs_init_path: Specifies the starting GCS path displayed in the UI of the GCS browser.
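
When Driverless AI runs in Docker, the example in this section passes gcs_path_to_service_account_json as the environment variable DRIVERLESS_AI_GCS_PATH_TO_SERVICE_ACCOUNT_JSON. The sketch below captures that naming pattern (DRIVERLESS_AI_ prefix plus the uppercased option name). The function name dai_env_name is hypothetical, and applying the same pattern to the other options listed above is an assumption based on that one example.

```shell
#!/bin/sh
# Hypothetical helper: derive the Docker environment variable name from a
# config option name, following the pattern used in this section's example.
dai_env_name() {
    # Uppercase the option name and add the DRIVERLESS_AI_ prefix.
    printf 'DRIVERLESS_AI_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
}
```

For example, dai_env_name gcs_path_to_service_account_json prints DRIVERLESS_AI_GCS_PATH_TO_SERVICE_ACCOUNT_JSON, which is the variable set with -e in the docker command.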

Start GCS with Authentication

This example enables the GCS data connector with authentication by mounting the JSON authentication file into the container and passing its path in DRIVERLESS_AI_GCS_PATH_TO_SERVICE_ACCOUNT_JSON. It assumes that the JSON file contains Google Cloud Storage authentications.

 nvidia-docker run \
     --pid=host \
     --init \
     --rm \
     --shm-size=2g --cap-add=SYS_NICE --ulimit nofile=131071:131071 --ulimit nproc=16384:16384 \
     -e DRIVERLESS_AI_ENABLED_FILE_SYSTEMS="file,gcs" \
     -e DRIVERLESS_AI_GCS_PATH_TO_SERVICE_ACCOUNT_JSON="/service_account_json.json" \
     -u `id -u`:`id -g` \
     -p 12345:12345 \
     -v `pwd`/data:/data \
     -v `pwd`/log:/log \
     -v `pwd`/license:/license \
     -v `pwd`/tmp:/tmp \
     -v `pwd`/service_account_json.json:/service_account_json.json \
     h2oai/dai-ubi8-x86_64:1.11.1-cuda11.8.0.xx