Data Recipe URL Setup¶
Driverless AI lets you explore data recipe URL data sources from within the Driverless AI application. This section provides instructions for configuring Driverless AI to work with data recipe URLs. When enabled (default), you will be able to modify datasets that have been added to Driverless AI. (Refer to Modify by custom data recipe for more information.)
Notes:
This connector is enabled by default. These steps are provided in case this connector was previously disabled and you want to re-enable it.
Depending on your Docker install version, use either the
docker run --runtime=nvidia
(>= Docker 19.03) ornvidia-docker
(< Docker 19.03) command when starting the Driverless AI Docker image. Usedocker version
to check which version of Docker you are using.
Enable Data Recipe URL¶
This example enables the data recipe URL data connector.
nvidia-docker run \
--shm-size=2g --cap-add=SYS_NICE --ulimit nofile=131071:131071 --ulimit nproc=16384:16384 \
--add-host name.node:172.16.2.186 \
-e DRIVERLESS_AI_ENABLED_FILE_SYSTEMS="file, recipe_url" \
-p 12345:12345 \
-it --rm \
-v /tmp/dtmp/:/tmp \
-v /tmp/dlog/:/log \
-v /tmp/dlicense/:/license \
-v /tmp/ddata/:/data \
-u $(id -u):$(id -g) \
h2oai/dai-ubi8-x86_64:1.11.1-cuda11.8.0.xx
This example shows how to enable the Data Recipe URL data connector in the config.toml file, and then specify that file when starting Driverless AI in Docker. Note that recipe_url
is enabled in the config.toml file by default.
Configure the Driverless AI config.toml file. Set the following configuration options.
enabled_file_systems = "file, upload, recipe_url"
Mount the config.toml file into the Docker container.
nvidia-docker run \ --pid=host \ --rm \ --shm-size=2g --cap-add=SYS_NICE --ulimit nofile=131071:131071 --ulimit nproc=16384:16384 \ --add-host name.node:172.16.2.186 \ -e DRIVERLESS_AI_CONFIG_FILE=/path/in/docker/config.toml \ -p 12345:12345 \ -v /local/path/to/config.toml:/path/in/docker/config.toml \ -v /etc/passwd:/etc/passwd:ro \ -v /etc/group:/etc/group:ro \ -v /tmp/dtmp/:/tmp \ -v /tmp/dlog/:/log \ -v /tmp/dlicense/:/license \ -v /tmp/ddata/:/data \ -u $(id -u):$(id -g) \ h2oai/dai-ubi8-x86_64:1.11.1-cuda11.8.0.xx
This example enables the Data Recipe URL data connector. Note that recipe_url
is enabled by default.
Export the Driverless AI config.toml file or add it to ~/.bashrc. For example:
# DEB and RPM export DRIVERLESS_AI_CONFIG_FILE="/etc/dai/config.toml" # TAR SH export DRIVERLESS_AI_CONFIG_FILE="/path/to/your/unpacked/dai/directory/config.toml"
Specify the following configuration options in the config.toml file.
# File System Support # upload : standard upload feature # file : local file system/server file system # hdfs : Hadoop file system, remember to configure the HDFS config folder path and keytab below # dtap : Blue Data Tap file system, remember to configure the DTap section below # s3 : Amazon S3, optionally configure secret and access key below # gcs : Google Cloud Storage, remember to configure gcs_path_to_service_account_json below # gbq : Google Big Query, remember to configure gcs_path_to_service_account_json below # minio : Minio Cloud Storage, remember to configure secret and access key below # snow : Snowflake Data Warehouse, remember to configure Snowflake credentials below (account name, username, password) # kdb : KDB+ Time Series Database, remember to configure KDB credentials below (hostname and port, optionally: username, password, classpath, and jvm_args) # azrbs : Azure Blob Storage, remember to configure Azure credentials below (account name, account key) # jdbc: JDBC Connector, remember to configure JDBC below. (jdbc_app_configs) # hive: Hive Connector, remember to configure Hive below. (hive_app_configs) # recipe_url: load custom recipe from URL # recipe_file: load custom recipe from local file system enabled_file_systems = "file, recipe_url"
Save the changes when you are done, then stop/restart Driverless AI.