Driverless AI Security

Objective

The goal of this document is to describe the different aspects of Driverless AI security and to provide guidelines for securing the system by reducing its attack surface.

This section covers the following areas of the product:

  • User access
      • Authentication
      • Authorization
  • Configuration security (Also see Configuration Security)
  • Data security
      • Data import
      • Data export
      • Logs
      • User data isolation
  • Transfer security
  • Custom recipes security
  • Web UI security

Important things to know

Warning

Security in a default installation of Driverless AI is DISABLED! By default, a Driverless AI installation targets ease of use and does not enable all of the security features listed in this document. For production environments, we recommend following this document and performing a secure Driverless AI installation.


User Access

Authentication Methods

| Option | Default Value | Recommended Value | Description |
|---|---|---|---|
| authentication_method | "unvalidated" | Any supported authentication method (e.g., LDAP, PAM) except "unvalidated" and "none". | Defines the user authentication method. |
| authentication_default_timeout_hours | 72 | Consult your security requirements. | Number of hours after which a user has to log in again. |

Authorization Methods

At this point, Driverless AI does not perform any authorization.


Data Security

Data Import

| Option | Default Value | Recommended Value | Description |
|---|---|---|---|
| enabled_file_systems | "upload, file, hdfs, s3" | Configure only the data sources that are needed. | Controls the list of available/configured data sources. |
| max_file_upload_size | 104857600000B | Configure based on the expected file size and the size of the Driverless AI deployment. | Limits the maximum size of an uploaded file. |
| supported_file_types | see config.toml | Limit file types to the extensions used in the target environment (e.g., parquet). | Supported file formats listed in filesystem browsers. |
| show_all_filesystems | true | false | Show all available data sources in the Web UI, even those that are not configured. It is recommended to show only configured data sources. |
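The default max_file_upload_size of 104857600000 bytes works out to 100,000 MiB (roughly 97.7 GiB), which is larger than many deployments need. The following minimal sketch shows the arithmetic for choosing a smaller limit. The helper name to_bytes and the use of binary units are assumptions made for illustration, and whether the value is written with a unit suffix should be checked against your own config.toml.

```python
# Hypothetical helper: convert a human-friendly size into a byte count
# for max_file_upload_size (assumes binary units, 1 GiB = 1024**3 bytes).
UNITS = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}

def to_bytes(amount, unit):
    """Return the number of bytes in `amount` of `unit`."""
    return int(amount * UNITS[unit])

default_limit = 104857600000  # default value listed in the table above
print(default_limit / UNITS["MiB"])  # 100000.0
print(to_bytes(5, "GiB"))            # 5368709120
```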

Data Export

| Option | Default Value | Recommended Value | Description |
|---|---|---|---|
| enable_dataset_downloading | true | false (disable download of datasets) | Controls the ability to download any datasets (uploaded, predictions, MLI). Note: if dataset download is disabled, we strongly suggest disabling custom recipes as well, because they are another way data could be exported from the application. |
| enable_artifacts_upload | false | false | Replaces all downloads on the experiment page with “exports” and allows users to push to the artifact store configured with artifacts_store. (See notes below.) |
| artifacts_store | file_system | file_system | Stores a MOJO in the file system directory denoted by artifacts_file_system_directory. (See notes below.) |
| artifacts_file_system_directory | tmp | tmp | File system location where artifacts are copied when artifacts_store is set to file_system. (See notes below.) |

Notes about Artifacts:

  • Currently, file_system is the only option that can be specified for artifacts_store. Additional options will be available in future releases.

  • The location for artifacts_file_system_directory is expected to be a directory on your server.

  • When these artifacts are enabled/configured, the menu options on the Completed Experiment page will change. Specifically, all “Download” options (with the exception of Autoreport) will change to “Export.” Refer to Export Artifacts for more information.

Completed experiments menus

Logs

Driverless AI produces several logs:

  • audit logs

  • server logs

  • experiment logs

The administrator of the Driverless AI application (i.e., the person responsible for configuring and setting up the application) controls the content that is written to the logs.

| Option | Default Value | Recommended Value | Description |
|---|---|---|---|
| audit_log_retention_period | 5 (days) | 0 (disable audit log rotation) | Number of days to keep audit logs. The value 0 disables rotation. |
| do_not_log_list | see config.toml | | Contains a list of configuration options that are not recorded in logs. |
| log_level | 1 | see config.toml | Defines the verbosity of logging. |
| collect_server_logs_in_experiment_logs | false | false | Dump server logs with the experiment. Dangerous, since server logs can contain information about experiments of other users using Driverless AI. |
| h2o_recipes_log_level | None | | Log level for OSS H2O instances used by custom recipes. |
| debug_log | false | false | Enable debug logs. |
| write_recipes_to_experiment_logger | false | false | Dump the source code of custom recipes into logs. |

User Data Isolation

| Option | Default Value | Recommended Value | Description |
|---|---|---|---|
| data_directory | "./tmp" | Specify a proper name and location for the directory. | Directory where Driverless AI stores all computed experiments and datasets. |
| file_hide_data_directory | true | true | Hide data_directory in the file-system browser. It is recommended to hide it to protect data_directory from browsing and corruption. |
| file_path_filtering_enabled | false | true | Enable the path filter for the file-system browser (file data source). By default the filter is disabled, which means users can browse the entire application-local filesystem. |
| file_path_filter_include | [] | It is recommended to predefine a list of paths that users can access in the file browser. For example, ['/home', '/data']. | List of absolute path prefixes to which access is restricted in the file browser. |
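To make the idea of prefix-based path filtering concrete, the following minimal sketch checks whether a requested path falls under one of the allowed absolute prefixes. It is an illustration of the concept only and is not the filter Driverless AI implements. The function name is_path_allowed is hypothetical, POSIX-style paths are assumed, and symbolic links are not resolved.

```python
import os.path

def is_path_allowed(path, include_prefixes):
    """Return True if `path` lies under one of the allowed absolute prefixes."""
    # Normalize first so ".." segments cannot escape an allowed prefix.
    normalized = os.path.normpath(path)
    if not os.path.isabs(normalized):
        return False
    for prefix in include_prefixes:
        prefix = os.path.normpath(prefix)
        # commonpath compares whole components, so "/database" does not match "/data".
        if os.path.commonpath([normalized, prefix]) == prefix:
            return True
    return False

allowed = ["/home", "/data"]  # example value from the table above
print(is_path_allowed("/data/train.csv", allowed))      # True
print(is_path_allowed("/data/../etc/passwd", allowed))  # False
print(is_path_allowed("/database/x.csv", allowed))      # False
```

Comparing whole path components rather than raw string prefixes is a deliberate choice in this sketch: a plain startswith check would treat /database as being inside /data.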


Client-Server Communication Security

| Option | Default Value | Recommended Value | Description |
|---|---|---|---|
| enable_https | false | true | Enable HTTPS. |
| ssl_key_file | "/etc/dai/private_key.pem" | Correct private key. | Private key to set up HTTPS/SSL communication. |
| ssl_crt_file | "/etc/dai/cert.pem" | Correct public certificate. | Public certificate to set up HTTPS/SSL. |
| ssl_no_sslv2 | true | true | Prevents an SSLv2 connection. |
| ssl_no_sslv3 | true | true | Prevents an SSLv3 connection. |
| ssl_no_tlsv1 | true | true | Prevents a TLSv1 connection. |
| ssl_no_tlsv1_1 | true | true | Prevents a TLSv1.1 connection. |
| ssl_no_tlsv1_2 | false | false (disable TLSv1.2 only if TLSv1.3 is available) | Prevents a TLSv1.2 connection. |
| ssl_no_tlsv1_3 | false | false | Prevents a TLSv1.3 connection. |
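As an illustration of what the ssl_no_* options express, the sketch below derives the set of protocol versions that remain enabled and builds a server-side context with Python's standard ssl module that refuses anything older. The mapping from option names to ssl.TLSVersion members and the function names are assumptions made for this example; it does not describe how Driverless AI applies these options internally. SSLv2 and SSLv3 are left out of the sketch.

```python
import ssl

# Assumed correspondence between the ssl_no_* options and ssl.TLSVersion members.
FLAG_TO_VERSION = {
    "ssl_no_tlsv1": ssl.TLSVersion.TLSv1,
    "ssl_no_tlsv1_1": ssl.TLSVersion.TLSv1_1,
    "ssl_no_tlsv1_2": ssl.TLSVersion.TLSv1_2,
    "ssl_no_tlsv1_3": ssl.TLSVersion.TLSv1_3,
}

def allowed_tls_versions(config):
    """Return the protocol versions left enabled, lowest first."""
    return sorted(v for flag, v in FLAG_TO_VERSION.items() if not config[flag])

def server_context(config):
    """Build a server context that rejects versions below the lowest enabled one."""
    versions = allowed_tls_versions(config)
    if not versions:
        raise ValueError("every TLS version is disabled")
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # A minimum version cannot express gaps (e.g. 1.1 on, 1.2 off); this sketch ignores them.
    ctx.minimum_version = versions[0]
    return ctx

# Default values from the table above.
defaults = {"ssl_no_tlsv1": True, "ssl_no_tlsv1_1": True,
            "ssl_no_tlsv1_2": False, "ssl_no_tlsv1_3": False}
print([v.name for v in allowed_tls_versions(defaults)])  # ['TLSv1_2', 'TLSv1_3']
```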

Response Headers

The response headers passed between the Driverless AI server and its clients (browser, Python/R clients) are controlled via the following option:

| Option | Default Value | Recommended Value | Description |
|---|---|---|---|
| extra_http_headers | "{}" | See below | Configures the HTTP headers returned in server responses. |

Other Headers to Consider

| Header | Documentation |
|---|---|
| Public-Key-Pins | https://developer.mozilla.org/en-US/docs/Web/HTTP/Public_Key_Pinning |
| CORS-related headers | https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS |
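Because extra_http_headers holds its headers as a single JSON-encoded string, it is easy to leave one out by accident. The following minimal sketch reports which headers are missing from a given value. The list of expected names is taken from the baseline configuration at the end of this document, the function name missing_headers is hypothetical, and the sketch assumes the option value is a JSON object, as the default "{}" suggests.

```python
import json

# Header names used in the baseline configuration in this document.
EXPECTED_HEADERS = [
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "X-XSS-Protection",
]

def missing_headers(extra_http_headers):
    """Return expected header names absent from the JSON-encoded option value."""
    # HTTP header names are case-insensitive, so compare in lower case.
    configured = {name.lower() for name in json.loads(extra_http_headers)}
    return [h for h in EXPECTED_HEADERS if h.lower() not in configured]

value = '{"Strict-Transport-Security": "max-age=63072000", "X-Frame-Options": "deny"}'
print(missing_headers(value))
# ['Content-Security-Policy', 'X-Content-Type-Options', 'X-XSS-Protection']
```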


Web UI Security

Note

The Driverless AI UI is designed to be user-friendly, and by default all convenience features such as auto-complete are enabled. Disabling these features increases the security of the application but reduces its usability.

| Option | Default Value | Recommended Value | Description |
|---|---|---|---|
| allow_form_autocomplete | true | false | Controls auto-completion in Web UI elements (e.g., login inputs). |
| allow_localstorage | true | false | Controls use of web browser local storage; set to false to disable it. |
| show_all_filesystems | true | false | Show all available data sources in the Web UI, even those that are not configured. It is recommended to show only configured data sources. |
| verify_session_ip | false | true | Verifies the IP of each request against the IP that initialized the session. |
| allow_concurrent_sessions | true | false | Controls concurrent sessions (logins); set to false to disable them. |
| enable_xsrf_protection | true | true | Enable XSRF (cross-site request forgery) protection. |
| enable_secure_cookies | false | true | Enable the SECURE cookie flag. Note that HTTPS must be enabled. |


Custom Recipe Security

Note

By default, Driverless AI enables custom recipes as the main way for data-science teams to extend the application's capabilities. In enterprise environments, it is recommended to follow software engineering best practices when developing custom recipes (i.e., code reviews, testing, staged releases, etc.) and to bundle only a pre-defined and approved set of custom Driverless AI extensions.

| Option | Default Value | Recommended Value | Description |
|---|---|---|---|
| enable_custom_recipes | true | false | Enable custom Python recipes. |
| enable_custom_recipes_upload | true | false | Enable uploading of custom recipes. |
| enable_custom_recipes_from_url | true | false | Enable downloading of custom recipes from an external URL. |
| include_custom_recipes_by_default | false | false | Include custom recipes in default inclusion lists. (Warning: enables all custom recipes.) |
| custom_recipe_security_analysis_enabled | false | true | Enable static code analysis for custom recipes. |

Note

custom_recipe_security_analysis_enabled is disabled by default because enabling it can cause problems with old experiments that use recipes which cannot pass the security analysis. As a result, you may not be able to score new datasets with those experiments, and you will need to re-upload the recipes. Enabling this option is still recommended for security reasons.
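This document does not describe what the static analysis checks. Purely to illustrate what static analysis of a Python recipe can look like, the sketch below uses the standard ast module to flag imports of a few modules without executing the code. The denylist and the function name flagged_imports are hypothetical and are not the rules Driverless AI applies.

```python
import ast

# Hypothetical denylist chosen only for this illustration.
FLAGGED_MODULES = {"os", "subprocess", "socket"}

def flagged_imports(source):
    """Return the sorted root names of flagged modules imported by `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # "os.path" counts as "os"
            if root in FLAGGED_MODULES:
                found.add(root)
    return sorted(found)

recipe = "import numpy as np\nfrom subprocess import run\nimport os.path\n"
print(flagged_imports(recipe))  # ['os', 'subprocess']
```

A check like this only sees literal import statements; dynamic imports would pass unnoticed, which is one reason code review of recipes remains advisable.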


Configuration Security

Driverless AI provides the option to store sensitive or secure configuration information in an encrypted keystore as an alternative to keeping security settings as clear text in the config.toml file. Encrypting sensitive information is highly recommended. For more information, see Configuration Security.

Note: You can prevent specific configurations from being recorded in logs with the do_not_log_list config.toml setting.

Baseline Secure Configuration

The following Driverless AI configuration is an example of a secure configuration. Ensure that all necessary config options are specified. For more information, see Using the config.toml File.

#
# Authentication
#

# Configure auth method
authentication_method="PAM"

# Redirect user to login page after 24 hours
authentication_default_timeout_hours=24

#
# Data
#

# Configure available connectors
enabled_file_systems="hdfs"
show_all_filesystems=false

# Restrict downloads
enable_dataset_downloading=false

#
# Logs
#

# Keep audit log records for five days
audit_log_retention_period=5

# Disable collection of server logs
collect_server_logs_in_experiment_logs=false

#
# User data isolation
#

# Disable access to DAI data_directory from file browser
file_hide_data_directory=true

# (Optional) Enable usage of path filters
# file_path_filtering_enabled=true

# (Optional) Specify a list of absolute path prefixes to restrict access to in file browser
# file_path_filter_include="['/data']"

#
# Client-Server Communication
#

enable_https=true
ssl_key_file="<<FILL ME>>"
ssl_crt_file="<<FILL ME>>"

# (Optional) Disable support of TLSv1.2 on server side only if your environment supports TLSv1.3
# ssl_no_tlsv1_2=true

#
# Web UI security
#

allow_form_autocomplete=false
allow_localstorage=false
verify_session_ip=true
allow_concurrent_sessions=false
enable_xsrf_protection=true

extra_http_headers='{ "Strict-Transport-Security":"max-age=63072000","Content-Security-Policy":"default-src https: ; font-src \'self\'; script-src \'self\' \'unsafe-eval\' \'unsafe-inline\'; style-src \'self\' \'unsafe-inline\'; object-src \'none\'", "X-Frame-Options":"deny", "X-Content-Type-Options":"nosniff", "X-XSS-Protection":"1; mode=block" }'

#
# Custom Recipes
#

enable_custom_recipes=false
enable_custom_recipes_upload=false
enable_custom_recipes_from_url=false
include_custom_recipes_by_default=false
custom_recipe_security_analysis_enabled=true
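
A configuration like the one above can drift over time, so it can help to compare the effective settings against the recommendations periodically. The following minimal sketch audits an already-parsed configuration, represented as a plain dict, against a subset of the recommended values from the tables in this document. The function name audit and the dict representation are assumptions for illustration; loading config.toml into a dict is not shown.

```python
# Recommended values copied from the tables in this document (subset).
RECOMMENDED = {
    "show_all_filesystems": False,
    "enable_dataset_downloading": False,
    "collect_server_logs_in_experiment_logs": False,
    "file_hide_data_directory": True,
    "file_path_filtering_enabled": True,
    "enable_https": True,
    "allow_form_autocomplete": False,
    "allow_localstorage": False,
    "verify_session_ip": True,
    "allow_concurrent_sessions": False,
    "enable_xsrf_protection": True,
    "enable_secure_cookies": True,
    "enable_custom_recipes": False,
    "custom_recipe_security_analysis_enabled": True,
}
INSECURE_AUTH = {"unvalidated", "none"}

def audit(config):
    """Return human-readable findings for a parsed configuration dict."""
    findings = []
    if config.get("authentication_method", "unvalidated") in INSECURE_AUTH:
        findings.append("authentication_method must not be 'unvalidated' or 'none'")
    for key, wanted in RECOMMENDED.items():
        actual = config.get(key)  # a missing key is reported as None
        if actual != wanted:
            findings.append(f"{key} is {actual!r}, recommended {wanted!r}")
    if config.get("enable_secure_cookies") and not config.get("enable_https"):
        findings.append("enable_secure_cookies requires enable_https")
    return findings

secure = dict(RECOMMENDED, authentication_method="PAM")
print(audit(secure))  # []
```

Keys that are absent from the dict are reported rather than silently accepted, since an unset option falls back to the product default, which this document notes is not always the secure choice.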