Driverless AI Security

Objective

The goal of this document is to describe different aspects of Driverless AI security and to provide guidelines to secure the system by reducing its surface of vulnerability.

This section covers the following areas of the product:

User access
    Authentication
    Authorization
Configuration security (Also see Configuration Security)
Data security
    Data import
    Data export
    Logs
    User data isolation
Transfer security
Custom recipes security
Web UI security

Important things to know

Warning

Security in a default installation of Driverless AI is DISABLED! By default, a Driverless AI installation targets ease of use and does not enable all security features listed in this document. For production environments, we recommend following this document and performing a secure Driverless AI installation.

User Access

Authentication Methods

Option	Default Value	Recommended Value	Description
`authentication_method`	`"unvalidated"`	Any supported authentication method (e.g., LDAP, PAM) except `"unvalidated"` and `"none"`.	Defines the user authentication method.
`authentication_default_timeout_hours`	`72`	Consult your security requirements.	Number of hours after which a user has to log in again.

Authorization Methods

At this point, Driverless AI does not perform any authorization.

Data Security

Data Import

Option	Default Value	Recommended Value	Description
`enabled_file_systems`	`"upload, file, hdfs, s3"`	Configure only the data sources that are needed.	Controls the list of available/configured data sources.
`max_file_upload_size`	`104857600000B`	Configure based on expected file size and size of Driverless AI deployment.	Limits the maximum size of an uploaded file.
`supported_file_types`	see config.toml	It is recommended to limit file types to the extensions used in the target environment (e.g., `parquet`).	Supported file formats listed in filesystem browsers.
`show_all_filesystems`	`true`	`false`	Show all available data sources in the Web UI (even if they are not configured). It is recommended to show only configured data sources.
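
As a rough illustration of the table above, a restrictive data-import setup might look like the following config.toml fragment. The 10 GB upload limit and the choice of connectors are placeholder values for this sketch. Writing the limit as a plain integer number of bytes and the file types as a comma-separated string are assumptions; check the config.toml shipped with your installation for the exact value formats.

```toml
# Hypothetical data-import settings; adjust to your environment.

# Expose only the connectors that are actually needed.
enabled_file_systems = "upload, hdfs"

# Hide connectors that are not configured.
show_all_filesystems = false

# Upload limit in bytes (10 GB here, an arbitrary example value).
max_file_upload_size = 10737418240

# Assumed format: comma-separated list of file extensions.
supported_file_types = "parquet"
```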

Data Export

Option	Default Value	Recommended Value	Description
`enable_dataset_downloading`	`true`	`false` (disable download of datasets)	Control the ability to download any datasets (uploaded, predictions, MLI). Note: if dataset download is disabled, we strongly suggest disabling custom recipes as well, because they are another way data could be exported from the application.
`enable_artifacts_upload`	`false`	`false`	Replace all downloads on the experiment page with “exports”, and allow users to push to the artifact store configured with `artifacts_store`. (See notes below.)
`artifacts_store`	`file_system`	`file_system`	Stores a MOJO on a file system directory denoted by `artifacts_file_system_directory`. (See notes below.)
`artifacts_file_system_directory`	`tmp`	`tmp`	File system location where artifacts will be copied in case `artifacts_store` is set to `file_system`. (See notes below.)

Notes about Artifacts:

Currently, file_system is the only option that can be specified for artifacts_store. Additional options will be available in future releases.
The location for artifacts_file_system_directory is expected to be a directory on your server.
When these artifacts are enabled/configured, the menu options on the Completed Experiment page will change. Specifically, all “Download” options (with the exception of Autoreport) will change to “Export.” Refer to Export Artifacts for more information.
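
If exporting artifacts is required, the three options above fit together roughly as follows. This is a sketch only: the table recommends leaving `enable_artifacts_upload` disabled, and the directory path is a made-up example rather than a product default.

```toml
# Sketch: replace experiment downloads with exports to a server-side directory.
enable_artifacts_upload = true

# file_system is currently the only supported artifact store.
artifacts_store = "file_system"

# Hypothetical directory on the Driverless AI server.
artifacts_file_system_directory = "/opt/dai/artifacts"
```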

Logs

Driverless AI produces several logs:

audit logs
server logs
experiment logs

The administrator of the Driverless AI application (i.e., the person responsible for configuring and setting up the application) controls what content is written to the logs.

Option	Default Value	Recommended Value	Description
`audit_log_retention_period`	`5` (days)	`0` (disable audit log rotation)	Number of days to keep audit logs. The value `0` disables rotation.
`do_not_log_list`	see config.toml	—	Contains a list of configuration options that are not recorded in logs.
`log_level`	`1`	see config.toml	Defines the verbosity of logging.
`collect_server_logs_in_experiment_logs`	`false`	`false`	Dump server logs with experiment. Dangerous since server logs can contain information about experiments of other users using Driverless AI.
`h2o_recipes_log_level`	`None`	—	Log level for OSS H2O instances used by custom recipes.
`debug_log`	`false`	`false`	Enable debug logs.
`write_recipes_to_experiment_logger`	`false`	`false`	Dump custom recipe source code into logs.
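
Taken together, the recommended values in the table above correspond to a fragment along these lines. It is a sketch of the recommendations only; `log_level` and `do_not_log_list` are left out because their recommended values depend on the installation's config.toml.

```toml
# Keep audit logs indefinitely (0 disables rotation).
audit_log_retention_period = 0

# Do not attach server logs to experiment logs, since they can
# contain information about other users' experiments.
collect_server_logs_in_experiment_logs = false

# Keep debug logging off.
debug_log = false

# Do not write custom recipe source code into logs.
write_recipes_to_experiment_logger = false
```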

User Data Isolation

Option	Default Value	Recommended Value	Description
`data_directory`	`"./tmp"`	Specify a proper name and location for the directory.	Directory where Driverless AI stores all computed experiments and datasets.
`file_hide_data_directory`	`true`	`true`	Hide `data_directory` in the `file`-system browser. It is recommended to hide it to protect `data_directory` from browsing and corruption.
`file_path_filtering_enabled`	`false`	`true`	Enable the path filter for the `file`-system browser (`file` data source). By default the filter is disabled, which means users can browse the entire application-local filesystem.
`file_path_filter_include`	`[]`	It is recommended to predefine a list of paths that users can access in the `file`-browser. For example, `['/home', '/data']`.	List of absolute path prefixes to which access is restricted in the `file`-browser.
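
A possible fragment that applies the recommendations above is sketched here. The two path prefixes are the example values from the table, and writing the list as a quoted string is an assumption based on how the option appears in the baseline configuration in this document.

```toml
# Keep the Driverless AI data directory out of the file browser.
file_hide_data_directory = true

# Restrict the file browser to an explicit list of path prefixes.
file_path_filtering_enabled = true

# Example prefixes from the table; replace with the paths users may browse.
file_path_filter_include = "['/home', '/data']"
```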

Client-Server Communication Security

Option	Default Value	Recommended Value	Description
`enable_https`	`false`	`true`	Enable HTTPS.
`ssl_key_file`	`"/etc/dai/private_key.pem"`	Correct private key.	Private key to set up HTTPS/SSL communication.
`ssl_crt_file`	`"/etc/dai/cert.pem"`	Correct public certificate.	Public certificate to set up HTTPS/SSL.
`ssl_no_sslv2`	`true`	`true`	Prevents an SSLv2 connection.
`ssl_no_sslv3`	`true`	`true`	Prevents an SSLv3 connection.
`ssl_no_tlsv1`	`true`	`true`	Prevents a TLSv1 connection.
`ssl_no_tlsv1_1`	`true`	`true`	Prevents a TLSv1.1 connection.
`ssl_no_tlsv1_2`	`false`	`false` (disable TLSv1.2 only if TLSv1.3 is available)	Prevents a TLSv1.2 connection.
`ssl_no_tlsv1_3`	`false`	`false`	Prevents a TLSv1.3 connection.
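
Written out as a config.toml fragment, the table above corresponds roughly to the following sketch. The key and certificate paths are the defaults listed in the table and must point to a valid key pair in a real deployment.

```toml
enable_https = true

# Default locations from the table; replace with your own key and certificate.
ssl_key_file = "/etc/dai/private_key.pem"
ssl_crt_file = "/etc/dai/cert.pem"

# Refuse the legacy protocol versions.
ssl_no_sslv2 = true
ssl_no_sslv3 = true
ssl_no_tlsv1 = true
ssl_no_tlsv1_1 = true

# Keep TLSv1.2 and TLSv1.3 available. Set ssl_no_tlsv1_2 to true
# only if every client in your environment supports TLSv1.3.
ssl_no_tlsv1_2 = false
ssl_no_tlsv1_3 = false
```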

Response Headers

The response headers that are passed between the Driverless AI server and its clients (browser, Python/R clients) are controlled via the following option:

Option	Default Value	Recommended Value	Description
`extra_http_headers`	`"{}"`	See below	Configure the HTTP headers returned in server responses.

Recommended Response Headers

Header	Description	Example value	Link
`Strict-Transport-Security`	The header lets a web site tell browsers that it should only be accessed using HTTPS, instead of using HTTP. The `max-age` specifies time, in seconds, that the browser should remember that a site is only to be accessed using HTTPS.	`max-age=63072000`	https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Strict-Transport-Security
`Content-Security-Policy`	Content Security Policy (CSP) is an added layer of security that helps to detect and mitigate certain types of attacks, including Cross-Site Scripting (XSS) and data injection attacks. Controls the origins from which the page can load resources.	`default-src https: ; font-src 'self'; script-src 'self' 'unsafe-eval' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; object-src 'none'` Note: Driverless AI still requires `unsafe-eval` and `unsafe-inline` to be configured, which potentially makes the server vulnerable to XSS attacks.	https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP https://infosec.mozilla.org/guidelines/web_security#Examples_5
`X-Frame-Options`	Controls whether a browser may render the page inside a frame. The value here overrides the default, which is `SAMEORIGIN`.	`deny`	https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Frame-Options
`X-Content-Type-Options`	Prevents the browser from trying to guess (sniff) a content type for a resource that differs from the declared content type.	`nosniff`	https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Content-Type-Options
`X-XSS-Protection`	The HTTP X-XSS-Protection response header is a feature of Internet Explorer, Chrome and Safari that stops pages from loading when they detect reflected cross-site scripting (XSS) attacks. When the value is set to `1` and a cross-site scripting attack is detected, the browser sanitizes the page (removes the unsafe parts); with `mode=block`, the browser prevents rendering of the page instead.	`1; mode=block`	https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-XSS-Protection
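
The headers are passed to `extra_http_headers` as a single JSON object. As a minimal sketch, the fragment below sets only the three recommended headers whose values contain no quote characters, which keeps the TOML quoting simple. Leaving out `Content-Security-Policy` and `X-XSS-Protection` is a simplification for this example, not a recommendation.

```toml
# Sketch: a subset of the recommended response headers.
# The value is a JSON object inside a single-quoted TOML string.
# Strict-Transport-Security only takes effect when the server is reached over HTTPS.
extra_http_headers = '{"Strict-Transport-Security": "max-age=63072000", "X-Frame-Options": "deny", "X-Content-Type-Options": "nosniff"}'
```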

Other Headers to Consider

Header	Documentation
`Public-Key-Pins`	https://developer.mozilla.org/en-US/docs/Web/HTTP/Public_Key_Pinning
CORS-related headers	https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS

Web UI Security

Note

The Driverless AI UI is designed to be user-friendly, and by default all convenience features such as auto-complete are enabled. Disabling these features increases the security of the application, but reduces its usability.

Option	Default Value	Recommended Value	Description
`allow_form_autocomplete`	`true`	`false`	Control auto-completion in Web UI elements (e.g., login inputs).
`allow_localstorage`	`true`	`false`	Control use of web browser local storage.
`show_all_filesystems`	`true`	`false`	Show all available data sources in the Web UI (even if they are not configured). It is recommended to show only configured data sources.
`verify_session_ip`	`false`	`true`	Verifies the IP of each request against the IP that initialized the session.
`allow_concurrent_sessions`	`true`	`false`	Control whether concurrent sessions (logins) are allowed.
`enable_xsrf_protection`	`true`	`true`	Enable XSRF (cross-site request forgery) protection.
`enable_secure_cookies`	`false`	`true`	Enable SECURE cookie flag. Note that HTTPS must be enabled.
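
The recommended values above, written as a config.toml fragment, would look roughly like this. It is a sketch of the table only; note that `enable_secure_cookies` depends on HTTPS being enabled elsewhere in the same file.

```toml
# Turn off browser conveniences that can leak credentials or session data.
allow_form_autocomplete = false
allow_localstorage = false

# Tie each session to the IP that created it and allow one login at a time.
verify_session_ip = true
allow_concurrent_sessions = false

# Keep cross-site request forgery protection on.
enable_xsrf_protection = true

# Set the SECURE flag on cookies. Requires enable_https = true.
enable_secure_cookies = true
```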

Custom Recipe Security

Note

By default, Driverless AI enables custom recipes as the main way for data science teams to extend the application's capabilities. In enterprise environments, it is recommended to follow software engineering best practices when developing custom recipes (i.e., code reviews, testing, staged releases, etc.) and to bundle only a pre-defined and approved set of custom Driverless AI extensions.

Option	Default Value	Recommended Value	Description
`enable_custom_recipes`	`true`	`false`	Enable custom Python recipes.
`enable_custom_recipes_upload`	`true`	`false`	Enable uploading of custom recipes.
`enable_custom_recipes_from_url`	`true`	`false`	Enable downloading of custom recipes from external URL.
`include_custom_recipes_by_default`	`false`	`false`	Include custom recipes in default inclusion lists. (warning: enables all custom recipes)
`custom_recipe_security_analysis_enabled`	`false`	`true`	Enable static code analysis for custom recipes.

Note

custom_recipe_security_analysis_enabled is disabled by default because enabling it can cause problems with old experiments that use recipes which cannot pass the security analysis. As a result, you may not be able to score new datasets with those experiments, and you will need to re-upload the recipes. Enabling this option is still recommended for security reasons.

Configuration Security

Driverless AI provides the option to store sensitive or secure configuration information in an encrypted keystore as an alternative to keeping security settings as clear text in the config.toml file. Encrypting sensitive information is highly recommended. For more information, see Configuration Security.

Note: You can prevent specific configurations from being recorded in logs with the do_not_log_list config.toml setting.

Baseline Secure Configuration

The following Driverless AI configuration is an example of a secure configuration. Ensure that all necessary config options are specified. For more information, see Using the config.toml File.

#
# Authentication
#

# Configure auth method
authentication_method="PAM"

# Redirect user to login page after 24 hours
authentication_default_timeout_hours=24

#
# Data
#

# Configure available connectors
enabled_file_systems="hdfs"
show_all_filesystems=false

# Restrict downloads
enable_dataset_downloading=false

#
# Logs
#

# Keep audit log records for five days (set to 0 to disable rotation)
audit_log_retention_period=5

# Disable collection of server logs
collect_server_logs_in_experiment_logs=false

#
# User data isolation
#

# Disable access to DAI data_directory from file browser
file_hide_data_directory=true

# (Optional) Enable usage of path filters
# file_path_filtering_enabled=true

# (Optional) Specify a list of absolute path prefixes to restrict access to in file browser
# file_path_filter_include = "['/data']"

#
# Client-Server Communication
#

enable_https=true
ssl_key_file="<<FILL ME>>"
ssl_crt_file="<<FILL ME>>"

# (Optional) Disable support of TLSv1.2 on server side only if your environment supports TLSv1.3
# ssl_no_tlsv1_2=true

#
# Web UI security
#

allow_form_autocomplete=false
allow_localstorage=false
verify_session_ip=true
allow_concurrent_sessions=false
enable_xsrf_protection=true

extra_http_headers='{ "Strict-Transport-Security":"max-age=63072000","Content-Security-Policy":"default-src https: ; font-src \'self\'; script-src \'self\' \'unsafe-eval\' \'unsafe-inline\'; style-src \'self\' \'unsafe-inline\'; object-src \'none\'", "X-Frame-Options":"deny", "X-Content-Type-Options":"nosniff", "X-XSS-Protection":"1; mode=block" }'

#
# Custom Recipes
#

enable_custom_recipes=false
enable_custom_recipes_upload=false
enable_custom_recipes_from_url=false
include_custom_recipes_by_default=false
custom_recipe_security_analysis_enabled=true