Deployment configuration
deployment_aws_access_key_id
deployment_aws_access_key_id (String)
Default value ''
Default AWS access key ID to be used for scorer deployments.
deployment_aws_secret_access_key
deployment_aws_secret_access_key (String)
Default value ''
Default AWS secret access key to be used for scorer deployments.
deployment_aws_bucket_name
deployment_aws_bucket_name (String)
Default value ''
AWS S3 bucket to be used for scorer deployments.
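For example, a minimal config.toml sketch that sets all three AWS deployment settings together (the key ID, secret, and bucket name below are hypothetical placeholders, not real credentials):

    # Hypothetical placeholder credentials -- substitute your own.
    deployment_aws_access_key_id = "AKIAEXAMPLEKEYID"
    deployment_aws_secret_access_key = "exampleSecretAccessKey"
    deployment_aws_bucket_name = "my-dai-scorer-deployments"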
triton_benchmark_runtime
triton_benchmark_runtime (Number)
Default value 5
Approximate upper limit on the time Triton takes to compute latency and throughput performance numbers when performing 'Benchmark' operations for a deployment. Higher values result in more accurate performance numbers.
triton_quick_test_runtime
triton_quick_test_runtime (Number)
Default value 2
Approximate upper limit on the time Triton takes to compute latency and throughput performance numbers after loading the deployment, per model. Higher values result in more accurate performance numbers.
deploy_wizard_num_per_page
deploy_wizard_num_per_page (Number)
Default value 10
Number of Triton deployments to show per page of the Deploy Wizard.
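As an illustration, the three settings above could be tuned together in config.toml; the values below are assumptions chosen to trade longer benchmark runs for more accurate numbers:

    # Higher runtimes yield more accurate latency/throughput numbers.
    triton_benchmark_runtime = 10
    triton_quick_test_runtime = 3
    # Show more Triton deployments per Deploy Wizard page.
    deploy_wizard_num_per_page = 25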
triton_host_local
Hostname of built-in Triton inference server. (String) (Expert Setting)
Default value ''
Hostname (or IP address) of the built-in Triton inference service, to be used when auto_deploy_triton_scoring_pipeline and make_triton_scoring_pipeline are not disabled. Only needed if enable_triton_server_local is disabled. Required to be set on some systems, such as AWS, for network packets to reach the server.
triton_server_params_local
Built-in Triton server command line arguments. (Dict) (Expert Setting)
Default value {'model-control-mode': 'explicit', 'http-port': 8000, 'grpc-port': 8001, 'metrics-port': 8002, 'rate-limit': 'execution_count'}
Triton server command line arguments, passed as --key=value.
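A sketch of configuring the built-in server in config.toml, assuming a reachable private IP and alternate ports (both values are illustrative; the dict is shown in the same Python-dict notation as the default above, and its exact serialization in config.toml may differ):

    # Hostname or IP so network packets can reach the built-in server (e.g., on AWS).
    triton_host_local = "10.0.0.5"
    # Keep explicit model control but move the service to alternate ports.
    triton_server_params_local = "{'model-control-mode': 'explicit', 'http-port': 9000, 'grpc-port': 9001, 'metrics-port': 9002, 'rate-limit': 'execution_count'}"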
triton_model_repository_dir_local
Path to Triton model repository. (String) (Expert Setting)
Default value 'triton-model-repository'
Path to the model repository (relative to data_directory) for the local Triton inference server built into Driverless AI. All Triton deployments for all users are stored in this directory.
triton_server_core_chunk_size_local
Number of cores to use for each model. (Number) (Expert Setting)
Default value 4
Number of cores to specify as a resource for each model, so that the C++ MOJO can use its own multi-threaded parallel row batching to save memory and increase performance. A value of 1 is the most portable across Triton servers and is the most efficient use of resources for small batch sizes (e.g., 1), while 4 is a reasonable default assuming requests are batched.
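For instance, a hedged config.toml sketch that keeps the default repository location but reserves more cores per model for batched workloads (the core count is an illustrative assumption):

    # Relative to data_directory; holds all users' Triton deployments.
    triton_model_repository_dir_local = "triton-model-repository"
    # 1 is most portable; larger values help when requests are batched.
    triton_server_core_chunk_size_local = 8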
triton_host_remote
Hostname of remote Triton inference server. (String) (Expert Setting)
Default value ''
Hostname (or IP address) of remote Triton inference service (outside of DAI), to be used when auto_deploy_triton_scoring_pipeline and make_triton_scoring_pipeline are not disabled. If set, check triton_model_repository_dir_remote and triton_server_params_remote as well.
triton_model_repository_dir_remote
triton_model_repository_dir_remote (String) (Expert Setting)
Default value ''
Path to the model repository directory for a remote Triton inference server outside of Driverless AI. All Triton deployments for all users are stored in this directory. Requires write access to this directory from Driverless AI (shared file system). This setting is optional; if not provided, each model deployment is uploaded over the gRPC protocol.
triton_server_params_remote
Remote Triton server parameters, used to connect via tritonclient. (Dict) (Expert Setting)
Default value {'http-port': 8000, 'grpc-port': 8001, 'metrics-port': 8002}
Parameters used to connect to the remote Triton server; only used if triton_host_remote and triton_model_repository_dir_remote are set.
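Putting the three remote-server settings together, a minimal config.toml sketch might look as follows (hostname, repository path, and ports are illustrative assumptions; the dict notation mirrors the default shown above, and its exact config.toml serialization may differ):

    # Remote Triton server outside of DAI.
    triton_host_remote = "triton.example.com"
    # Optional shared-filesystem repository; omit to upload each model over gRPC.
    triton_model_repository_dir_remote = "/mnt/shared/triton-model-repository"
    triton_server_params_remote = "{'http-port': 8000, 'grpc-port': 8001, 'metrics-port': 8002}"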