dai-tutorial.Rmd
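This vignette describes how to use the dai package to connect to and control the Driverless AI platform from R. It covers the main predictive data-science workflow: importing and splitting data, training a model, scoring new data, downloading scoring pipelines, and managing the datasets and models on the server.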
Before we can start working with the Driverless AI platform, we have to import the package and initialize the connection:
library(dai)
dai.connect(uri = 'http://localhost:12345', username = 'h2oai', password = 'h2oai')
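Hard-coded credentials are fine for a local tutorial, but you may prefer to keep them out of the script. The following is a minimal sketch, not part of the package: the environment variable names DAI_URI, DAI_USERNAME, and DAI_PASSWORD and the server_available switch are hypothetical, and the fallbacks are simply the values used in this vignette.

```r
# Hypothetical environment variable names; fall back to the tutorial values.
dai_uri      <- Sys.getenv("DAI_URI", unset = "http://localhost:12345")
dai_username <- Sys.getenv("DAI_USERNAME", unset = "h2oai")
dai_password <- Sys.getenv("DAI_PASSWORD", unset = "h2oai")

# Hypothetical switch: set to TRUE only when a Driverless AI server is reachable.
server_available <- FALSE
if (server_available) {
  dai::dai.connect(uri = dai_uri, username = dai_username, password = dai_password)
}
```

With the switch left at FALSE, the block only resolves the three settings, so it can be run without a server.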
After the connection has been established, you can create a new dataset:
creditcard <- dai.create_dataset('/home/vaclav/Projects/h2oai/tests/smalldata/kaggle/CreditCard/creditcard_train_cat.csv', progress = FALSE)
Any function in the package that displays a progress bar accepts progress = FALSE to switch it off. Progress bars can also be disabled altogether by setting the option dai.progress:
options('dai.progress' = FALSE)
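If you only want to silence progress bars for part of a script, base R lets you restore the previous setting afterwards. This sketch relies only on the behavior of options(), which returns the old values when it sets new ones. The variable names are illustrative.

```r
# Remember the current value so we can check the restore step later.
before <- getOption("dai.progress")

# Switch progress bars off; options() returns the previous settings.
old_opts <- options(dai.progress = FALSE)
during <- getOption("dai.progress")

# ... calls to dai functions would go here ...

# Restore whatever was set before (including "not set at all").
options(old_opts)
after <- getOption("dai.progress")
```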
The function dai.create_dataset loads data located on the machine that hosts Driverless AI. To upload data located on your workstation, use dai.upload_dataset instead. If the data is already loaded into an R data.frame, you can convert it into a DAIFrame directly:
iris_dai <- as.DAIFrame(iris)
print(iris_dai)
#> DAIFrame '342cde55-9441-11ea-8849-ac1f6b46eb80': 150 obs. of 5 variables
#> File path: ./tmp/342cde55-9441-11ea-8849-ac1f6b46eb80/iris2eda4cbf877c.csv.1589281837.0971727.bin
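For a file that lives on your workstation, the upload route might look like the sketch below. It writes the built-in iris data to a temporary CSV and, only if a server is available, passes that path to dai.upload_dataset. Two things here are assumptions: that the local file path is the first argument of dai.upload_dataset, and the server_available switch, which is a hypothetical guard so the block can run without a server.

```r
# Write a local data.frame to a temporary CSV on the workstation.
local_csv <- file.path(tempdir(), "iris_local.csv")
write.csv(iris, local_csv, row.names = FALSE)

# Hypothetical switch: set to TRUE only after dai.connect() has succeeded.
server_available <- FALSE
if (server_available) {
  # Assumption: the local file path is the first argument.
  iris_uploaded <- dai::dai.upload_dataset(local_csv)
  print(iris_uploaded)
}
```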
Once the dataset has been created, you can display basic information and summary statistics by calling the generics print and summary:
print(creditcard)
#> DAIFrame '342cde54-9441-11ea-8849-ac1f6b46eb80': 23999 obs. of 25 variables
#> File path: /home/vaclav/Projects/h2oai/tests/smalldata/kaggle/CreditCard/creditcard_train_cat.csv
summary(creditcard)
#> ID LIMIT_BAL AGE
#> Min. : 1 Min. : 10000 Min. : 21
#> Mean : 12000 Mean :165498.7157798 Mean : 35.3808492
#> St.dev.: 6928.0588912 St.dev.:129130.7430653 St.dev.: 9.2710457
#> Max. : 23999 Max. : 1000000 Max. : 79
#> Count : 23999 Count : 23999 Count : 23999
#> Unique : 23999 Unique : 79 Unique : 55
#> PAY_1 PAY_2 PAY_3
#> Min. : -2 Min. : -2 Min. : -2
#> Mean : -0.0031251 Mean : -0.1234635 Mean : -0.1547564
#> St.dev.: 1.1234487 St.dev.: 1.2005912 St.dev.: 1.204058
#> Max. : 8 Max. : 8 Max. : 8
#> Count : 23999 Count : 23999 Count : 23999
#> Unique : 11 Unique : 11 Unique : 11
#> PAY_4 PAY_5 PAY_6
#> Min. : -2 Min. : -2 Min. : -2
#> Mean : -0.2116755 Mean : -0.2528855 Mean : -0.2780116
#> St.dev.: 1.1665728 St.dev.: 1.1370067 St.dev.: 1.1581916
#> Max. : 8 Max. : 8 Max. : 8
#> Count : 23999 Count : 23999 Count : 23999
#> Unique : 11 Unique : 10 Unique : 10
#> BILL_AMT1 BILL_AMT2 BILL_AMT3
#> Min. : -165580 Min. : -69777 Min. : -157264
#> Mean :50598.9286637 Mean :48648.0474186 Mean :46368.9035376
#> St.dev.:72650.1978093 St.dev.:70365.3956427 St.dev.:68194.7195203
#> Max. : 964511 Max. : 983931 Max. : 1664089
#> Count : 23999 Count : 23999 Count : 23999
#> Unique : 18717 Unique : 18367 Unique : 18131
#> BILL_AMT4 BILL_AMT5 BILL_AMT6
#> Min. : -170000 Min. : -81334 Min. : -339603
#> Mean : 42369.872828 Mean :40002.3330972 Mean :38565.2666361
#> St.dev.:63071.4551671 St.dev.:60345.7282797 St.dev.:59156.5011435
#> Max. : 891586 Max. : 927171 Max. : 961664
#> Count : 23999 Count : 23999 Count : 23999
#> Unique : 17719 Unique : 17284 Unique : 16906
#> PAY_AMT1 PAY_AMT2 PAY_AMT3
#> Min. : 0 Min. : 0 Min. : 0
#> Mean : 5543.0980458 Mean : 5815.528522 Mean : 4969.431393
#> St.dev.:15068.8627296 St.dev.:20797.4438849 St.dev.:16095.9292948
#> Max. : 505000 Max. : 1684259 Max. : 896040
#> Count : 23999 Count : 23999 Count : 23999
#> Unique : 6918 Unique : 6839 Unique : 6424
#> PAY_AMT4 PAY_AMT5 PAY_AMT6
#> Min. : 0 Min. : 0 Min. : 0
#> Mean : 4743.6568607 Mean : 4783.6436935 Mean : 5189.5736072
#> St.dev.: 14883.554872 St.dev.:15270.7039035 St.dev.:17630.7185745
#> Max. : 497000 Max. : 417990 Max. : 528666
#> Count : 23999 Count : 23999 Count : 23999
#> Unique : 6028 Unique : 5984 Unique : 5988
#> DEFAULT_PAYMENT_NEXT_MONTH SEX EDUCATION
#> Min. : FALSE Count : 23999 Count : 23999
#> Mean : 0.2237177 Unique : 2 Unique : 4
#> St.dev.: 0.4167437 Top :female Top :university
#> Max. : TRUE Freq. : 8921 Freq. : 11360
#> Count : 23999
#> Unique : 2
#> MARRIAGE
#> Count : 23999
#> Unique : 4
#> Top :single
#> Freq. : 12876
#>
#>
A few other generics work as usual on a DAIFrame, for example dim, head, and format.
dim(creditcard)
#> [1] 23999 25
head(creditcard)
#> ID LIMIT_BAL SEX EDUCATION MARRIAGE AGE PAY_1 PAY_2 PAY_3 PAY_4
#> 1 1 20000 female university married 24 2 2 -1 -1
#> 2 2 120000 female university single 26 -1 2 0 0
#> 3 3 90000 female university single 34 0 0 0 0
#> 4 4 50000 female university married 37 0 0 0 0
#> 5 5 50000 male university married 57 -1 0 -1 0
#> 6 6 50000 male graduate single 37 0 0 0 0
#> PAY_5 PAY_6 BILL_AMT1 BILL_AMT2 BILL_AMT3 BILL_AMT4 BILL_AMT5 BILL_AMT6
#> 1 -2 -2 3913 3102 689 0 0 0
#> 2 0 2 2682 1725 2682 3272 3455 3261
#> 3 0 0 29239 14027 13559 14331 14948 15549
#> 4 0 0 46990 48233 49291 28314 28959 29547
#> 5 0 0 8617 5670 35835 20940 19146 19131
#> 6 0 0 64400 57069 57608 19394 19619 20024
#> PAY_AMT1 PAY_AMT2 PAY_AMT3 PAY_AMT4 PAY_AMT5 PAY_AMT6
#> 1 0 689 0 0 0 0
#> 2 0 1000 1000 1000 0 2000
#> 3 1518 1500 1000 1000 1000 5000
#> 4 2000 2019 1200 1100 1069 1000
#> 5 2000 36681 10000 9000 689 679
#> 6 2500 1815 657 1000 1000 800
#> DEFAULT_PAYMENT_NEXT_MONTH
#> 1 True
#> 2 True
#> 3 False
#> 4 False
#> 5 False
#> 6 False
A dataset can be split, for example into training and test sets, directly from R:
splits <- dai.split_dataset(creditcard,
                            output_name1 = 'train',
                            output_name2 = 'test',
                            ratio = .8,
                            seed = 25,
                            progress = FALSE)
In this case, splits is a list with two elements named ‘train’ and ‘test’: 80% of the rows go into train and 20% into test.
splits$train
#> DAIFrame '3594a255-9441-11ea-8849-ac1f6b46eb80': 19199 obs. of 25 variables
#> File path: ./tmp/3594a255-9441-11ea-8849-ac1f6b46eb80/train.1589281838.27705.bin
splits$test
#> DAIFrame '3594a256-9441-11ea-8849-ac1f6b46eb80': 4800 obs. of 25 variables
#> File path: ./tmp/3594a256-9441-11ea-8849-ac1f6b46eb80/test.1589281838.2886887.bin
By default, dai.split_dataset draws a simple random sample, but stratified and time-based splits are available as well. See the function’s documentation for more details.
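To make the 80/20 arithmetic concrete, here is a base R sketch of a simple random split over row indices. It is not how Driverless AI performs the split on the server. Rounding the training size down with floor() is an assumption that happens to reproduce the row counts printed above, and the seed here is unrelated to the server-side seed.

```r
set.seed(25)      # local seed only; not tied to the seed passed to the server
n_rows <- 23999   # rows in the creditcard frame
ratio  <- 0.8

# Draw 80% of the row indices without replacement for training.
train_idx <- sample(n_rows, size = floor(ratio * n_rows))
# Everything that was not drawn goes to the test set.
test_idx  <- setdiff(seq_len(n_rows), train_idx)

length(train_idx)  # 19199
length(test_idx)   # 4800
```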
One of the main strengths of Driverless AI is its fully automated feature engineering, combined with hyperparameter tuning, model selection, and ensembling. The function dai.train runs the experiment and returns a DAIModel instance that represents the resulting model.
model <- dai.train(training_frame = splits$train,
                   testing_frame = splits$test,
                   target_col = 'DEFAULT_PAYMENT_NEXT_MONTH',
                   is_classification = TRUE,
                   is_timeseries = FALSE,
                   accuracy = 1, time = 1, interpretability = 10,
                   seed = 25)
Driverless AI can suggest values for accuracy, time, and interpretability (see dai.suggest_model_params). If you do not specify values for accuracy, time, or interpretability, Driverless AI uses the recommended values.
As with DAIFrame, generic methods such as print, format, summary, and predict work with DAIModel:
print(model)
#> h2oai_experiment_36e1a3aa-9441-11ea-8849-ac1f6b46eb80/h2oai_experiment_summary_36e1a3aa-9441-11ea-8849-ac1f6b46eb80.zip
summary(model)$score
#> [1] 0.6939041
summary(model)$score_f_name
#> [1] "AUC"
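The score reported here is an AUC. As a reminder of what that number measures, the following base R sketch computes AUC from ranks (the Mann-Whitney formulation) on a tiny made-up example. The function name auc_from_ranks and the toy labels and scores are illustrative only and have nothing to do with how Driverless AI computes its score internally.

```r
# AUC as the probability that a random positive is ranked above a random negative.
auc_from_ranks <- function(labels, scores) {
  stopifnot(length(labels) == length(scores))
  pos   <- labels == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  r     <- rank(scores)  # average ranks, so ties are handled
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data: one positive is scored below one negative, so AUC is 3/4.
toy_auc <- auc_from_ranks(labels = c(0, 0, 1, 1),
                          scores = c(0.1, 0.4, 0.35, 0.8))
toy_auc  # 0.75
```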
New data can be scored in two different ways:
call predict directly on the model in the R session; or
download a scoring pipeline (Python or MOJO) and score the data outside of R.
By default, the generic predict returns an R data.frame with the results. With return_df = FALSE, it instead returns the name of the file on the Driverless AI server that contains the predictions, which may be useful when you predict on a large dataset.
predictions <- predict(model, newdata = splits$test)
head(predictions)
#> DEFAULT_PAYMENT_NEXT_MONTH.0 DEFAULT_PAYMENT_NEXT_MONTH.1
#> 1 0.8474462 0.1525538
#> 2 0.8474462 0.1525538
#> 3 0.8545116 0.1454884
#> 4 0.1951082 0.8048918
#> 5 0.8474462 0.1525538
#> 6 0.8545116 0.1454884
preds_path <- predict(model, newdata = splits$test, return_df = FALSE)
print(preds_path)
#> [1] "h2oai_experiment_36e1a3aa-9441-11ea-8849-ac1f6b46eb80/36e1a3aa-9441-11ea-8849-ac1f6b46eb80_preds_3913aa9a.csv"
You can later download the file to your workstation:
dai.download_file(file_path = preds_path, dest_path = file.path(tempdir(), 'predictions.csv'), progress = FALSE)
#> [1] "/tmp/RtmpCS2wom/predictions.csv"
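Once the predictions are in a data.frame, whether returned directly by predict or read from the downloaded CSV, turning class probabilities into labels is plain R. The sketch below uses a stand-in data.frame built from the six rows printed earlier. The 0.5 cut-off and the predicted_default column name are assumptions chosen for illustration.

```r
# Stand-in for the first rows of the predictions shown earlier.
predictions <- data.frame(
  DEFAULT_PAYMENT_NEXT_MONTH.0 = c(0.8474462, 0.8474462, 0.8545116,
                                   0.1951082, 0.8474462, 0.8545116),
  DEFAULT_PAYMENT_NEXT_MONTH.1 = c(0.1525538, 0.1525538, 0.1454884,
                                   0.8048918, 0.1525538, 0.1454884)
)

# Assumed cut-off: flag a row as a predicted default when P(class 1) >= 0.5.
threshold <- 0.5
predictions$predicted_default <-
  predictions$DEFAULT_PAYMENT_NEXT_MONTH.1 >= threshold

table(predictions$predicted_default)
```

In practice the threshold would be chosen to match the cost of false positives and false negatives rather than fixed at 0.5.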
To put your model into production in Python or Java, you can download the full Python scoring pipeline or the MOJO pipeline, respectively. For more information about how to use the pipelines, please see the documentation.
dai.download_mojo(model, path = tempdir(), force = TRUE)
#> [1] "/tmp/RtmpCS2wom/mojo.zip"
dai.download_python_pipeline(model, path = tempdir(), force = TRUE)
#> [1] "/tmp/RtmpCS2wom/scorer.zip"
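Both downloads are zip archives, so you can inspect them with base R before handing them over for deployment. This is a small sketch using utils::unzip with list = TRUE. The helper name list_archive is hypothetical, and it returns NULL when the archive has not been downloaded yet.

```r
# List the files inside a downloaded pipeline archive without extracting it.
list_archive <- function(zip_path) {
  if (!file.exists(zip_path)) {
    message("No archive at ", zip_path, "; download it first.")
    return(invisible(NULL))
  }
  utils::unzip(zip_path, list = TRUE)$Name
}

# Path matches the dai.download_mojo(model, path = tempdir()) call above.
mojo_files <- list_archive(file.path(tempdir(), "mojo.zip"))
```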
After some time, you may have multiple datasets and models on your Driverless AI server. The dai package offers a few utility functions to find, reuse, and remove existing datasets and models.
If you already have the dataset loaded into Driverless AI, you can get the DAIFrame object with either dai.get_frame (if you know the frame’s key) or dai.find_dataset (if you know the original path or at least a part of it):
dai.get_frame(creditcard$key)
#> DAIFrame '342cde54-9441-11ea-8849-ac1f6b46eb80': 23999 obs. of 25 variables
#> File path: /home/vaclav/Projects/h2oai/tests/smalldata/kaggle/CreditCard/creditcard_train_cat.csv
dai.find_dataset('creditcard')
#> DAIFrame '342cde54-9441-11ea-8849-ac1f6b46eb80': 23999 obs. of 25 variables
#> File path: /home/vaclav/Projects/h2oai/tests/smalldata/kaggle/CreditCard/creditcard_train_cat.csv
The latter directly returns the frame if there’s only one match. Otherwise it lets you select which frame to return from all the matching candidates.
Furthermore, you can get a list of datasets or models:
datasets <- dai.list_datasets()
head(datasets)
#> key name
#> 1 3594a256-9441-11ea-8849-ac1f6b46eb80 test
#> 2 3594a255-9441-11ea-8849-ac1f6b46eb80 train
#> 3 342cde55-9441-11ea-8849-ac1f6b46eb80 iris2eda4cbf877c.csv
#> 4 342cde54-9441-11ea-8849-ac1f6b46eb80 creditcard_train_cat.csv
#> file_path
#> 1 ./tmp/3594a256-9441-11ea-8849-ac1f6b46eb80/test.1589281838.2886887.bin
#> 2 ./tmp/3594a255-9441-11ea-8849-ac1f6b46eb80/train.1589281838.27705.bin
#> 3 ./tmp/342cde55-9441-11ea-8849-ac1f6b46eb80/iris2eda4cbf877c.csv.1589281837.0971727.bin
#> 4 /home/vaclav/Projects/h2oai/tests/smalldata/kaggle/CreditCard/creditcard_train_cat.csv
#> file_size data_source row_count column_count import_status import_error
#> 1 567584 split 4800 25 0
#> 2 2265952 split 19199 25 0
#> 3 7064 upload 150 5 0
#> 4 2832040 file 23999 25 0
#> aggregation_status aggregation_error aggregated_frame mapping_frame
#> 1 -1
#> 2 -1
#> 3 -1
#> 4 -1
#> uploaded
#> 1 TRUE
#> 2 TRUE
#> 3 TRUE
#> 4 FALSE
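The printed listing behaves like an ordinary table with named columns, so it can be filtered with base R. The sketch below assumes it is a data.frame and uses a stand-in with three of the columns, with values copied from the output above, to pick out the frames that were produced by a split.

```r
# Stand-in for a few columns of the dataset listing shown above.
datasets <- data.frame(
  name        = c("test", "train", "iris2eda4cbf877c.csv",
                  "creditcard_train_cat.csv"),
  data_source = c("split", "split", "upload", "file"),
  row_count   = c(4800, 19199, 150, 23999),
  stringsAsFactors = FALSE
)

# Keep only the frames that were created by splitting another dataset.
split_frames <- datasets[datasets$data_source == "split", ]
split_frames$name            # "test" "train"

# The two splits together account for every row of the original frame.
sum(split_frames$row_count)  # 23999
```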
models <- dai.list_models()
head(models)
#> key description
#> 1 36e1a3aa-9441-11ea-8849-ac1f6b46eb80 pucifepi
#> parameters.dataset.key parameters.dataset.display_name
#> 1 3594a255-9441-11ea-8849-ac1f6b46eb80 train
#> parameters.resumed_model.key parameters.resumed_model.display_name
#> 1
#> parameters.target_col parameters.weight_col parameters.fold_col
#> 1 DEFAULT_PAYMENT_NEXT_MONTH
#> parameters.orig_time_col parameters.time_col
#> 1 [OFF]
#> parameters.is_classification parameters.cols_to_drop
#> 1 TRUE NULL
#> parameters.validset.key parameters.validset.display_name
#> 1
#> parameters.testset.key parameters.testset.display_name
#> 1 3594a256-9441-11ea-8849-ac1f6b46eb80 test
#> parameters.enable_gpus parameters.seed parameters.accuracy
#> 1 TRUE 25 1
#> parameters.time parameters.interpretability parameters.score_f_name
#> 1 1 10 AUC
#> parameters.time_groups_columns
#> 1 NULL
#> parameters.unavailable_columns_at_prediction_time
#> 1 NULL
#> parameters.time_period_in_seconds parameters.num_prediction_periods
#> 1 NA NA
#> parameters.num_gap_periods parameters.is_timeseries
#> 1 NA FALSE
#> parameters.cols_imputation
#> 1 NULL
#> parameters.config_overrides
#> 1 allow_config_overrides_in_expert_page = true\nuser_config_directory = ""\nvis_server_ip = "127.0.0.1"\nvis_server_port = 12346\nprocsy_ip = "127.0.0.1"\nprocsy_port = 12347\nh2o_ip = "127.0.0.1"\nh2o_port = 12348\nenable_h2o_recipes = true\nh2o_recipes_url = "None"\nh2o_recipes_ip = "None"\nh2o_recipes_port = 50341\nh2o_recipes_name = "None"\nh2o_recipes_nthreads = 4\nh2o_recipes_log_level = "None"\nh2o_recipes_max_mem_size = "None"\nh2o_recipes_min_mem_size = "None"\nip = "127.0.0.1"\nport = 12345\nport_range = []\nmax_file_upload_size = 104857600000\nlog_level = 1\ncollect_server_logs_in_experiment_logs = false\nredis_ip = "127.0.0.1"\nredis_port = 6379\nenable_https = false\nssl_key_file = "/etc/dai/private_key.pem"\nssl_crt_file = "/etc/dai/cert.pem"\nssl_no_sslv2 = true\nssl_no_sslv3 = true\nssl_no_tlsv1 = true\nssl_no_tlsv1_1 = true\nssl_no_tlsv1_2 = false\nssl_no_tlsv1_3 = false\nssl_client_verify_mode = "CERT_NONE"\nssl_ca_file = ""\nssl_client_key_file = ""\nssl_client_crt_file = ""\ndata_directory = "./tmp"\ndata_upload_directory = "uploads"\nrecipes_temporary_data_directory = "recipe_tmp"\nenable_quick_benchmark = true\nenable_extended_benchmark = false\nextended_benchmark_scale_num_rows = 0.1\nenable_startup_checks = true\nusage_stats_opt_in = false\nauthentication_method = "unvalidated"\nauthentication_default_timeout_hours = 72\nauth_openid_provider_base_uri = ""\nauth_openid_configuration_uri = ""\nauth_openid_auth_uri = ""\nauth_openid_token_uri = ""\nauth_openid_userinfo_uri = ""\nauth_openid_logout_uri = ""\nauth_openid_redirect_uri = ""\nauth_openid_grant_type = ""\nauth_openid_response_type = ""\nauth_openid_scope = ""\nauth_openid_urlencode_quote_via = "quote"\nauth_openid_access_token_expiry_key = "expires_in"\nauth_openid_refresh_token_expiry_key = "refresh_expires_in"\nauth_openid_token_expiration_secs = 3600\nauth_openid_use_objectpath_match = false\nauth_openid_use_objectpath_expression = ""\nldap_server = ""\nldap_port = 
""\nldap_bind_dn = ""\nldap_tls_file = ""\nldap_use_ssl = false\nldap_search_base = ""\nldap_search_filter = ""\nldap_search_attributes = ""\nldap_user_name_attribute = ""\nldap_recipe = "0"\nldap_user_prefix = ""\nldap_search_user_id = ""\nldap_ou_dn = ""\nldap_dc = ""\nldap_base_dn = ""\nldap_base_filter = ""\nauth_tls_crl_file = ""\nauth_tls_subject_field = "CN"\nauth_tls_field_parse_regexp = "(?P<username>.*)"\nauth_tls_user_lookup = "REGEXP_ONLY"\nauth_tls_ldap_authorization_lookup_filter = ""\nauth_tls_ldap_authorization_search_base = ""\nsupported_file_types = [ "csv", "tsv", "txt", "dat", "tgz", "gz", "bz2", "zip", "xz", "xls", "xlsx", "jay", "feather", "bin", "arff", "parquet", "pkl", "orc",]\nrecipe_supported_file_types = [ "py", "pyc",]\nlist_files_without_extensions = false\nenabled_file_systems = [ "upload", "file", "hdfs", "s3", "recipe_file", "recipe_url",]\nmax_files_listed = 100\nmaster_minio_address = "<URL>:<PORT>"\nallow_localstorage = true\nallow_orig_cols_in_predictions = true\nmax_runtime_minutes = 1440\nmax_runtime_minutes_until_abort = 10080\nrecipe = "auto"\nkaggle_timeout = 120\nkaggle_keep_submission = false\nkaggle_competitions = ""\nfeature_engineering_effort = 5\ncheck_distribution_shift = "auto"\ncheck_distribution_shift_drop = "auto"\ndrop_features_distribution_shift_threshold_auc = 0.999\ncheck_leakage = "auto"\ndrop_features_leakage_threshold_auc = 0.999\nleakage_max_data_size = 10000000\nmake_python_scoring_pipeline = "auto"\nmake_mojo_scoring_pipeline = "auto"\nreduce_mojo_size = false\nbenchmark_mojo_latency = "auto"\nbenchmark_mojo_latency_auto_size_limit = 100\nmojo_building_timeout = 1800.0\nmojo_building_parallelism = -1\nmake_pipeline_visualization = "auto"\nmake_autoreport = true\nmax_cores = 0\nmax_cores_dai = -1\nstall_subprocess_submission_dai_fork_threshold_count = 0\nstall_subprocess_submission_mem_threshold_pct = 2\nmax_cores_by_physical = true\nmax_cores_limit = 100\nmax_fit_cores = 10\nmax_predict_cores = 
0\nmax_predict_cores_in_dai = 4\nbatch_cpu_tuning_max_workers = 0\ncpu_max_workers = 0\nnum_gpus_per_experiment = -1\nmin_num_cores_per_gpu = 2\nnum_gpus_per_model = 1\nnum_gpus_for_prediction = 0\nassumed_simultaneous_dt_forks_munging = 3\nassumed_simultaneous_dt_forks_stats_openblas = 3\nmax_max_dt_threads_munging = 4\nmax_max_dt_threads_stats_openblas = 4\nmax_max_dt_threads_readwrite = 4\nmin_dt_threads_munging = 1\nmin_dt_threads_final_munging = 1\nmax_dt_threads_munging = -1\nmax_dt_threads_readwrite = -1\nmax_dt_threads_stats_openblas = -1\nmax_dt_threads_do_timeseries_split_suggestion = 1\nmunging_report_period = 10\nworking_munging_report_period = 120\ntraining_report_period = 10\nworking_training_report_period = 60\ngpu_id_start = 0\nmax_workers = 1\nping_period = 60\nping_sleep_period = 1\ndisk_limit_gb = 5\nstall_disk_limit_gb = 1\nmemory_limit_gb = 5\nmin_num_rows = 100\nmin_rows_per_class = 5\nmin_rows_per_split = 5\ndata_precision = "float32"\ntransformer_precision = "float32"\nulimit_up_to_hard_limit = true\ndisable_core_files = false\nlimit_nofile = 65535\nlimit_nproc = 16384\nreproducibility_level = 1\nseed = 1234\nmissing_values = [ "", "?", "None", "nan", "NA", "N/A", "unknown", "inf", "-inf", "1.7976931348623157e+308", "-1.7976931348623157e+308",]\ntf_nan_impute_value = -5\nstatistical_threshold_data_size_small = 100000\nstatistical_threshold_data_size_large = 500000000\naux_threshold_data_size_large = 10000000\nperformance_threshold_data_size_small = 100000\nperformance_threshold_data_size_large = 100000000\nmax_cols = 10000\nmax_rows_col_stats = 1000000\norig_features_fs_report = false\nmax_rows_fs = 1000000\nmax_workers_fs = 0\nmax_orig_cols_selected = 10000\nmax_orig_numeric_cols_selected = 10000\nmax_orig_nonnumeric_cols_selected = 300\nmax_orig_cols_selected_simple_factor = 2\nfs_orig_cols_selected = 500\nfs_orig_numeric_cols_selected = 500\nfs_orig_nonnumeric_cols_selected = 200\nfs_orig_cols_selected_simple_factor = 
2\nmax_relative_cardinality = 0.95\nmax_absolute_cardinality = 1000000\nnum_as_cat = true\nmax_int_as_cat_uniques = 50\nnum_folds = 3\nallow_different_classes_across_fold_splits = true\nfull_cv_accuracy_switch = 8\nensemble_accuracy_switch = 5\nnum_ensemble_folds = 5\nfold_reps = 1\nmax_num_classes_hard_limit = 10000\nmax_num_classes = 200\nmax_num_classes_compute_roc = 200\nmax_num_classes_client_and_gui = 10\nroc_reduce_type = "rows"\nmin_roc_sample_size = 1\nnum_actuals_vs_predicted = 100\nfeature_brain_level = 0\nbrain_maximum_diff_score = 0.1\nmax_num_brain_indivs = 3\nfeature_brain_save_every_iteration = 0\nwhich_iteration_brain = -1\nrefit_same_best_individual = false\nbrain_rel_dir = "H2O.ai_brain"\nbrain_max_size_GB = 20\nbrain_add_features_for_new_columns = true\nforce_model_restart_to_defaults = true\nearly_stopping = true\nearly_stopping_per_individual = true\nmin_dai_iterations = 0\nnfeatures_max = -1\nngenes_max = -1\nlimit_features_by_interpretability = true\ntensorflow_max_epochs_nlp = 1\nenable_tensorflow_nlp_accuracy_switch = 5\nenable_tensorflow_textcnn = "auto"\nenable_tensorflow_textbigru = "auto"\nenable_tensorflow_charcnn = "auto"\ntensorflow_nlp_pretrained_embeddings_file_path = ""\ntensorflow_nlp_pretrained_embeddings_trainable = false\ntensorflow_nlp_have_gpus_in_production = false\ntext_fraction_for_text_dominated_problem = 0.3\ntext_transformer_fraction_for_text_dominated_problem = 0.3\nstring_col_as_text_threshold = 0.3\nstring_col_as_text_min_relative_cardinality = 0.1\nstring_col_as_text_min_absolute_cardinality = 100\nmonotonicity_constraints_interpretability_switch = 7\nmonotonicity_constraints_correlation_threshold = 0.1\ndefault_max_feature_interaction_depth = 8\nmax_feature_interaction_depth = -1\nfixed_feature_interaction_depth = 0\nfixed_feature_interaction_depth_numcat_cat = 0\nfixed_feature_interaction_depth_many_inputs_generates_not_useful = 0\ntune_parameters_accuracy_switch = 3\ntune_target_transform_accuracy_switch = 
3\ntarget_transformer = "auto"\ntournament_style = "auto"\ntournament_uniform_style_interpretability_switch = 6\ntournament_uniform_style_accuracy_switch = 6\ntournament_model_style_accuracy_switch = 6\ntournament_feature_style_accuracy_switch = 7\ntournament_fullstack_style_accuracy_switch = 8\ntournament_use_feature_penalized_score = true\nnum_individuals = 2\nfixed_num_individuals = 0\nfixed_fold_reps = 0\nsanitize_natural_sort_limit = 1000\nenable_target_encoding = "auto"\nenable_lexilabel_encoding = "off"\nenable_isolation_forest = "off"\nenable_one_hot_encoding = "auto"\nisolation_forest_nestimators = 200\nincluded_transformers = [ "CVCatNumEncodeTransformer", "CVTargetEncodeTransformer", "CatOriginalTransformer", "CatTransformer", "ClusterDistTransformer", "ClusterTETransformer", "DateOriginalTransformer", "DateTimeOriginalTransformer", "DatesTransformer", "EwmaLagsTransformer", "FrequentTransformer", "InteractionsTransformer", "IsHolidayTransformer", "IsolationForestAnomalyNumCatAllColsTransformer", "IsolationForestAnomalyNumCatTransformer", "IsolationForestAnomalyNumericTransformer", "LagsAggregatesTransformer", "LagsInteractionTransformer", "LagsTransformer", "LexiLabelEncoderTransformer", "NumCatTETransformer", "NumToCatTETransformer", "NumToCatWoEMonotonicTransformer", "NumToCatWoETransformer", "OneHotEncodingTransformer", "OriginalTransformer", "RawTransformer", "TextBiGRUTransformer", "TextCNNTransformer", "TextCharCNNTransformer", "TextLinModelTransformer", "TextTransformer", "TruncSVDNumTransformer", "WeightOfEvidenceTransformer",]\nexcluded_transformers = []\nincluded_genes = []\nexcluded_genes = []\nincluded_models = [ "CONSTANT", "DECISIONTREE", "FTRL", "GLM", "IMBALANCEDLIGHTGBM", "IMBALANCEDXGBOOSTGBM", "LIGHTGBM", "RULEFIT", "TENSORFLOW", "XGBOOSTDART", "XGBOOSTGBM",]\nexcluded_models = []\nincluded_scorers = [ "ACCURACY", "AUC", "AUCPR", "F05", "F1", "F2", "GINI", "LOGLOSS", "MACROAUC", "MAE", "MAPE", "MCC", "MER", "MSE", "R2", "RMSE", 
"RMSLE", "RMSPE", "SMAPE",]\nexcluded_scorers = []\nenable_xgboost_gbm = "auto"\nenable_xgboost_dart = "auto"\nxgboost_threshold_data_size_large = 100000000\nxgboost_gpu_threshold_data_size_large = 30000000\nenable_glm = "auto"\nenable_decision_tree = "auto"\nenable_lightgbm = "auto"\nenable_tensorflow = "auto"\nenable_ftrl = "auto"\nenable_rulefit = "auto"\nenable_lightgbm_boosting_types = [ "gbdt",]\nenable_lightgbm_cat_support = false\nenable_constant_model = "auto"\nshow_constant_model = false\ndrop_constant_model_final_ensemble = true\nparams_tune_grow_policy_simple_trees = true\nmax_nestimators = 3000\nn_estimators_list_no_early_stopping = [ 50, 100, 200, 300,]\nmin_learning_rate_final = 0.01\nmax_learning_rate_final = 0.05\nmax_nestimators_feature_evolution_factor = 0.2\nmin_learning_rate = 0.05\nmax_learning_rate = 0.5\nmax_epochs = 1\nmax_max_depth = 12\ndefault_max_bin = 256\ndefault_lightgbm_max_bin = 64\nmax_max_bin = 256\nmin_max_bin = 32\nscale_mem_for_max_bin = 10737418240\nfactor_rf = 1.25\ntensorflow_use_all_cores = true\ntensorflow_use_all_cores_even_if_reproducible_true = false\ntensorflow_cores = 0\nrulefit_max_num_rules = -1\nrulefit_max_tree_depth = 6\nrulefit_max_num_trees = 100\nrulefit_threshold_data_size_large = 100000000\none_hot_encoding_cardinality_threshold = 50\none_hot_encoding_cardinality_limiter = true\nfixed_ensemble_level = -1\ncross_validate_single_final_model = true\nparameter_tuning_num_models = -1\nvalidate_meta_learner = true\nvalidate_meta_learner_extra = false\nfixed_num_folds_evolution = -1\nfixed_num_folds = -1\nfixed_only_first_fold_model = "auto"\nnum_fold_ids_show = 10\nfold_scores_instability_warning_threshold = 0.25\nfeature_evolution_data_size = 100000000\nfinal_pipeline_data_size = 500000000\nmax_validation_to_training_size_ratio_for_final_ensemble = 2.0\nforce_stratified_splits_for_imbalanced_threshold_binary = 0.01\nimbalance_sampling_method = "off"\nimbalance_ratio_sampling_threshold = 
5\nheavy_imbalance_ratio_sampling_threshold = 25\nimbalance_sampling_number_of_bags = -1\nimbalance_sampling_max_number_of_bags = 10\nimbalance_sampling_max_number_of_bags_feature_evolution = 3\nimbalance_sampling_max_multiple_data_size = 1.0\nimbalance_sampling_rank_averaging = "auto"\nimbalance_sampling_target_minority_fraction = -1.0\nimbalance_ratio_notification_threshold = 2.0\nnbins_ftrl_list = [ 1000000, 10000000, 100000000,]\nftrl_max_interaction_terms_per_degree = 10000\nte_bin_list = [ 25, 10, 100, 250,]\nwoe_bin_list = [ 25, 10, 100, 250,]\nohe_bin_list = [ 10, 25, 50, 75, 100,]\ndrop_constant_columns = true\ndrop_id_columns = true\nno_drop_features = false\ncols_to_drop = ""\ncols_to_group_by = ""\nsample_cols_to_group_by = false\nagg_funcs_for_group_by = [ "mean", "sd", "min", "max", "count",]\nfolds_for_group_by = 5\nmutation_mode = "sample"\nshift_check_text = false\nuse_rf_for_shift_if_have_lgbm = true\nshift_key_features_varimp = 0.01\nshift_check_reduced_features = true\nshift_trees = 100\nshift_max_bin = 256\nshift_min_max_depth = 4\nshift_max_max_depth = 8\ndetect_features_distribution_shift_threshold_auc = 0.55\ndrop_features_distribution_shift_min_features = 1\nleakage_check_text = true\nleakage_key_features_varimp = 0.001\nleakage_key_features_varimp_if_no_early_stopping = 0.05\nleakage_check_reduced_features = true\nuse_rf_for_leakage_if_have_lgbm = true\nleakage_trees = 100\nleakage_max_bin = 256\nleakage_min_max_depth = 4\nleakage_max_max_depth = 8\ndetect_features_leakage_threshold_auc = 0.95\ndetect_features_per_feature_leakage_threshold_auc = 0.8\ndrop_features_leakage_min_features = 1\nleakage_train_test_split = 0.25\ndetailed_traces = false\ndebug_log = false\nlog_system_info_per_experiment = true\nmax_debug_description_length = 1000\nabs_tol_for_perfect_score = 0.0001\ndata_ingest_timeout = 86400.0\ntime_series_recipe = true\ntime_series_validation_fold_split_datetime_boundaries = ""\ntimeseries_split_suggestion_timeout = 
30.0\nuse_lags_if_not_time_series_recipe = false\nmin_ymd_timestamp = 19000101\nmax_ymd_timestamp = 21000101\nmax_rows_datetime_format_detection = 100000\nholiday_features = true\nholiday_country = "US"\nmax_time_series_properties_sample_size = 250000\nmax_lag_sizes = 30\nmin_lag_autocorrelation = 0.1\nmax_signal_lag_sizes = 100\nsample_lag_sizes = false\nmax_sampled_lag_sizes = 10\noverride_lag_sizes = []\nmin_lag_size = -1\nallow_time_column_as_feature = true\nallow_time_column_as_numeric_feature = false\ndatetime_funcs = [ "year", "quarter", "month", "week", "weekday", "day", "dayofyear", "hour", "minute", "second",]\nallow_tgc_as_features = false\nallowed_coltypes_for_tgc_as_features = [ "numeric", "categorical", "ohe_categorical", "datetime", "date", "text",]\nenable_time_unaware_transformers = "auto"\ntgc_only_use_all_groups = true\ntime_series_holdout_preds = true\ntime_series_validation_splits = -1\ntime_series_splits_max_overlap = 0.5\ntime_series_max_holdout_splits = -1\nsingle_model_vs_cv_score_reldiff = 0.05\nsingle_model_vs_cv_score_reldiff2 = 0.0\nmli_ts_fast_approx = false\nmli_ts_fast_approx_contribs = true\nmli_ts_holdout_contribs = true\ntime_series_min_interpretability = 5\nlags_dropout = "dependent"\nprob_lag_non_targets = 0.1\nrolling_test_method = "tta"\nprob_default_lags = 0.2\nprob_lagsinteraction = 0.2\nprob_lagsaggregates = 0.2\ntgc_via_ui_max_ncols = 10\ntgc_dup_tolerance = 0.01\nmli_use_mojo_pipeline = false\nmli_sample_above_for_scoring = 1000000\nmli_sample_above_for_training = 100000\nmli_sample_size = 100000\nmli_num_quantiles = 10\nmli_drf_num_trees = 100\nmli_fast_approx = true\nfast_approx_num_trees = 50\nfast_approx_do_one_fold_one_model = true\nmli_drf_max_depth = 20\nmli_sample_training = true\nklime_lambda = [ 1e-6, 1e-8,]\nklime_alpha = 0.0\nmli_max_numeric_enum_cardinality = 25\nmli_max_number_cluster_vars = 6\nuse_all_columns_klime_kmeans = false\nmli_strict_version_check = true\nmli_cloud_name = 
"H2O-MLI-DAI-1.8.7+local_181440d-32135"\nmli_ice_per_bin_strategy = false\nmli_dia_default_max_cardinality = 10\nenable_mli_sa = true\nmli_dia_sample_size = 100000\nmli_pd_sample_size = 100000\nmli_pd_numcat_num_chart = false\nmli_pd_numcat_threshold = 10\nmli_shapley_sample_size = 100000\nenable_mli_async_api = true\nenable_mli_sa_main_chart_aggregator = true\nmli_sa_sampling_limit = 500000\nmli_sa_main_chart_aggregator_limit = 1000\nmli_predict_safe = false\nenable_mli_predict_fast_approx = false\nmli_max_surrogate_retries = 5\nmli_mojo_batch_size = 50\nmli_nlp_tokenizer = "tfidf"\nmli_nlp_top_n = 20\nmli_nlp_sample_limit = 10000\nmli_nlp_workers = 4\nmli_nlp_min_df = 3\nmli_nlp_max_df = 0.9\nmli_nlp_min_ngram = 1\nmli_nlp_max_ngram = 1\nmli_nlp_min_token_mode = "top"\nmli_nlp_tokenizer_max_features = -1\nmli_nlp_loco_max_features = -1\nmli_nlp_surrogate_tokens = 100\nmli_batch_size = 10000\nmli_dt_nthreads = 4\ndump_varimp_every_scored_indiv = false\ndump_modelparams_every_scored_indiv = true\ndump_modelparams_every_scored_indiv_feature_count = 3\ndump_modelparams_separate_files = false\ndump_trans_timings = false\ndelete_preview_trans_timings = true\nautodoc_template = "report_template.docx"\nautodoc_output_type = "docx"\nautodoc_report_name = "report"\nautodoc_max_cm_size = 10\nautodoc_coef_table_num_models = 1\nautodoc_coef_table_num_folds = -1\nautodoc_coef_table_num_coef = 50\nautodoc_coef_table_num_classes = 9\nautodoc_coef_table_appendix_results_table = false\nautodoc_min_relative_importance = 0.003\nautodoc_num_features = 50\nautodoc_num_rows = 0\nautodoc_pd_max_runtime = 20\nautodoc_out_of_range = 3\nautodoc_data_summary_col_num = -1\nautodoc_prediction_stats = false\nautodoc_prediction_stats_n_quantiles = 20\nautodoc_population_stability_index = false\nautodoc_population_stability_index_n_quantiles = 10\nautodoc_include_permutation_feature_importance = false\nautodoc_feature_importance_scorer = ""\nautodoc_feature_importance_num_perm = 
1\nautodoc_response_rate = false\nautodoc_response_rate_n_quantiles = 10\nautodoc_global_klime_num_features = 10\nautodoc_global_klime_num_tables = 1\nautodoc_gini_plot = false\nautodoc_list_all_config_settings = false\nautoviz_max_num_columns = 50\nautoviz_max_aggregated_rows = 500\ncompute_correlation = false\nproduce_correlation_heatmap = false\nhigh_correlation_value_to_report = 0.95\npreview_cache_upon_server_exit = true\ncore_site_xml_path = ""\nhdfs_config_path = ""\nkey_tab_path = ""\nhdfs_keytab_path = ""\nfile_hide_data_directory = true\nfile_path_filtering_enabled = false\nfile_path_filter_include = []\nhdfs_auth_type = "noauth"\nhdfs_app_principal_user = ""\nhdfs_app_login_user = ""\nhdfs_app_jvm_args = ""\nhdfs_app_classpath = ""\nhdfs_app_supported_schemes = [ "hdfs://", "maprfs://", "swift://",]\nhdfs_max_files_listed = 100\ndtap_auth_type = "noauth"\ndtap_config_path = ""\ndtap_key_tab_path = ""\ndtap_keytab_path = ""\ndtap_app_principal_user = ""\ndtap_app_login_user = ""\ndtap_app_jvm_args = ""\ndtap_app_classpath = ""\naws_role_arn = ""\naws_default_region = ""\naws_s3_endpoint_url = ""\naws_use_ec2_role_credentials = false\ns3_init_path = "s3://"\nkdb_hostname = ""\nkdb_port = ""\nkdb_app_classpath = ""\nkdb_app_jvm_args = ""\nazure_connection_string = ""\njdbc_app_configs = "{}"\njdbc_app_jvm_args = "-Xmx4g"\njdbc_app_classpath = ""\nhive_app_configs = "{}"\nlisteners_experiment_start = ""\nlisteners_experiment_done = ""\nh2o_storage_address = ""\nh2o_storage_tls_enabled = true\nh2o_storage_tls_ca_path = ""\nh2o_storage_tls_cert_path = ""\nh2o_storage_tls_key_path = ""\nh2o_storage_internal_default_project_id = ""\ndeployment_aws_bucket_name = ""\nallow_form_autocomplete = true\nenable_projects = true\nenable_custom_recipes = false\nenable_custom_recipes_upload = true\ninclude_custom_recipes_by_default = false\napp_language = "en"\ndisablelogout = false\nshow_developer_settings = false\nenable_benchmark_each_experiment = 
false\ndatatable_verbose_log = false\ndatatable_show_progress = false\ndatatable_allow_interruption = false\ndatatable_fread_anonymize = true\ndatatable_strategy = "auto"\ngpu_locking_type = "global"\ngpu_lock_data_size = 150000000\ngpu_lock_safety_factor = 3\ngpu_lock_delay = 10.0\ngpu_lock_min_delay = 0.0\ntensorflow_allow_cpu_only = false\nfinal_vs_ga_progress_factor = 5\ndebug = false\ndebug_print = false\ndebug_print_level = 0\ndebug_print_server = false\nping_period_debug = 20\nhard_asserts = false\nhard_backtest_asserts = false\ncheck_invalid_config_toml_keys = true\nbrain_inconsistent_asserts = false\ncheck_1_vs_N = false\ncheck_pred_contribs_sum = false\nenable_funnel = true\nclean_funnel = true\nquiet_funnel = false\ndummy = 1\ndebug_ga = false\ndebug_indiv = false\nuniquify_indiv = true\ndebug_daimodel_level = 0\ndebug_pipeline_pickles = false\nbrain_food_full = false\nmemory_ref = 137438953472\nrows_ref = 2000000\nmemory_score_factor_ref = 0.2\nuse_dummy_pool = false\nuse_dummy_pool_onetask = false\nuse_dummy_pool_detect_types = false\nuse_dummy_pool_capture_schema = false\nuse_dummy_pool_fs = false\nuse_dummy_pool_dl2 = false\nuse_dummy_pool_munging = false\nuse_dummy_pool_training = false\nuse_dummy_pool_xgb_fit = false\nuse_dummy_pool_lgbm_fit = false\nuse_dummy_pool_predict = false\nuse_dummy_pool_score = false\nuse_dummy_pool_final_munging = false\nuse_dummy_pool_final_training = false\nuse_dummy_pool_final_predict = false\nuse_dummy_pool_mojo = false\nuse_global_isolation_pool = false\ndatatable_is_leaking = true\nstalled_time_nrows_ref = 1000000\nstalled_time_nrowscols_ref = 10000000\nstalled_time_ref = 240.0\nstalled_time_min = 120.0\nstalled_time_kill_ref = 440.0\nstalled_pool_cpu_percent_threshold_kill = 5.0\nstalled_pool_gpu_percent_threshold_kill = 5.0\ncpu_percent_check_period_stalled = 0.3\ncpu_percent_per_experiment_check_period_logging = 0.1\ncpu_gpu_stall_max_count = 10\ncpu_stall_max_factor_interval = 20\nnum_cpu_sockets = 
1\nautodl_stall_sigusr1 = true\nlightgbm_import_stall_trials = 3\nlightgbm_import_stall_timeout = 30\ntask_trials = 2\nreap_trials = 3\nstats_trials = 5\npool_mark = "_"\nacquire_mark = "+"\ntask_sync_mark = "*"\nping_mark = "_ping_"\npool_wait_mark = "|"\npool_submit_mark = "="\npool_busy_mark = "#"\ntask_sync_key_length = 6\nxgb_in_subprocess = true\nenable_preview_time_estimate = true\nenable_preview_time_estimate_rough = true\nshow_inapplicable_models_preview = false\nshow_inapplicable_transformers_preview = false\nxgb_memory_pickled_estimate = false\ndebug_h2o4gpu_level = 0\nenable_h2o4gpu_kmeans = false\nenable_h2o4gpu_truncatedsvd = false\nxgboost_direct_datatable = false\ninteraction_finder_max_rows_x_cols = 200000.0\ninteraction_finder_search_limit = 20\ninteraction_finder_corr_threshold = 0.95\ninteraction_finder_max_pairwise_interactions = 100\ninteraction_finder_gini_rel_improvement_threshold = 0.5\ninteraction_finder_return_limit = 5\nenable_bootstrap = true\nmin_bootstrap_samples = 1\nmax_bootstrap_samples = 100\nmin_bootstrap_sample_size_factor = 1.0\nmax_bootstrap_sample_size_factor = 10.0\nbootstrap_final_seed = -1\nserver_proctitle = "h2oai-DriverlessAI"\nserver_fork_proctitle = "h2oai-DAI-fork"\nserver_db_checkpoint_period = 10\nserver_wal_mode = true\nserver_wal_db_checkpoint_period = 360\nserver_timeout_flush_rpc = 10.0\nserver_timeout_flush_rpc_min = 10.0\nserver_timeout_flush_rpc_max = 120.0\nenable_gpu_usage_in_gui = true\nping_check_period_memory = 5\nping_check_period_usage = 5\nping_check_period_files = 60\nping_check_server_health = true\nping_check_period_server_health = 1.0\nping_check_period_server_exists = 10.0\nping_check_period_server_health_hard_assert_limit = 60.0\nping_check_server_health_debug_duration_always_print = false\nping_check_period_server_health_info_logger = true\nserver_health_sigusr1 = false\nserver_runtime_metrics = false\nping_check_period = 5\ngpu_small_data_size = 100000\nmax_rows_tuning = 
1000\nbenford_mad_threshold_int = 0.03\nbenford_mad_threshold_real = 0.1\nstop_early_rel_std = 0.1\nstop_early_abs_std = 0.001\nvarimp_threshold_at_interpretability_10 = 0.05\nlowest_nonzero_varimp = 1e-30\nfeatures_allowed_by_interpretability = "{1: 10000000, 2: 10000, 3: 1000, 4: 500, 5: 300, 6: 200, 7: 150, 8: 100, 9: 80, 10: 50, 11: 50, 12: 50, 13: 50}"\nnfeatures_max_threshold = 200\nrdelta_percent_score_penalty_per_feature_by_interpretability = "{1: 0.0, 2: 0.1, 3: 1.0, 4: 2.0, 5: 5.0, 6: 10.0, 7: 20.0, 8: 30.0, 9: 50.0, 10: 100.0, 11: 100.0, 12: 100.0, 13: 100.0}"\ndrop_low_meta_weights = true\nmeta_weight_allowed_by_interpretability = "{1: 1E-7, 2: 1E-5, 3: 1E-4, 4: 1E-3, 5: 1E-2, 6: 0.03, 7: 0.05, 8: 0.08, 9: 0.10, 10: 0.15, 11: 0.15, 12: 0.15, 13: 0.15}"\nfeature_cost_mean_interp_for_penalty = 5\nfeatures_cost_per_interp = 0.25\nvarimp_threshold_shift_report = 0.3\napply_featuregene_limits_after_tuning = true\nremove_scored_0gain_genes_in_postprocessing_above_interpretability = 13\nremove_scored_0gain_genes_in_postprocessing_above_interpretability_final_population = 2\nremove_scored_by_threshold_genes_in_postprocessing_above_interpretability_final_population = 7\nmerge_dup_raw_features = true\nfs_interpretabilty_switch = 6\nfs_prune_by_genes = false\nvarimp_fspermute_factor = 1.0\nngenes_min = -1\nnfeatures_min = -1\nfeatures_per_gene = 1\nnfeatures_max_factor = 1.0\nuse_unit_pool_protection = true\nalways_use_unit_pool_protection = false\ntrigger_subprocess_catch_rows_times_columns = 10000000\nstrict_gpu_non_overlap = false\nround_up_indivs_for_busy_gpus = true\nterminate_train_backend_tuning = true\nterminate_train_tuning = true\nterminate_train_feature_evolution = true\nreduce_by_genes = false\nkeep_fraction_default = 1.0\nreduce_randomly = false\nreduce_by_fraction = 0.1\nreduce_count_max = 10\npreserve_oom_reduced_features = true\npreserve_varimp_reduced_features = false\npreserve_varimp_reduced_features_for_rescored = 
true\npreserve_reduced_features_for_final_model = true\nmutation_rate = -1\nextra_mutation_level = -1\ntuning_share_varimp_accuracy_switch = 5\ntuning_share_varimp = "best"\nfresh_indiv_accuracy_switch = 7\nenable_tensorflow_import = true\ndetailed_tensorflow_import_error = false\nenable_lightgbm_import = true\ncheck_clinfo_timeout = 30\ncheck_java_timeout = 30\nprob_add_genes = 0.5\nprob_addbest_genes = 0.5\nprob_prune_genes = 0.5\nprob_perturb_xgb = 0.25\nprob_prune_by_features = 0.25\nperturb_xgb_depth_random = false\nperturb_xgb_depth_min = 3\nperturb_xgb_depth_max = 10\nexplore_more_unused_genes = true\nexplore_gene_anneal = true\nexplore_prob0 = 0.5\nexplore_anneal_factor = 0.9\nexplore_prob_lowest = 0.1\nexplore_grow_anneal = true\ngrow_prob0 = 0.8\ngrow_anneal_factor = 0.5\ngrow_prob_lowest = 0.05\ngrow_proboff = 0.5\nprob_tune_model_vs_features = 0.5\nmax_absolute_feature_expansion = 1000\nexplore_model_anneal = true\nexplore_model_prob0 = 0.5\nexplore_model_anneal_factor = 0.9\nexplore_model_prob_lowest = 0.1\nxgboost_interpretability_switch = 10\nxgboost_accuracy_switch = 1\nbooster_for_fs_permute = "auto"\nmodel_class_name_for_fs_permute = "auto"\ndefault_booster = "lightgbm"\ndefault_model_class_name = "LightGBMModel"\nthreshold_data_size_large_to_use_cpu_for_fs = 100000000\nlightgbm_interpretability_switch = 10\nlightgbm_accuracy_switch = 1\ndecision_tree_interpretability_switch = 7\ndecision_tree_accuracy_switch = 7\ntensorflow_interpretability_switch = 6\ntensorflow_accuracy_switch = 5\ntensorflow_num_classes_switch = 10\ntensorflow_num_classes_switch_but_keep_lightgbm = 15\ntextlin_num_classes_switch = 5\ntext_gene_dim_reduction_choices = [ 50,]\ntext_gene_max_ngram = [ 1, 2, 3,]\nglm_do_lambda_search = true\nglm_do_lambda_search_by_eval_metric = false\nglm_lambda_early_stopping_rounds = 4\ngbm_early_stopping_rounds_min = 1\ngbm_early_stopping_rounds_max = 10000000000\nglm_optimal_refit = true\nglm_interpretability_switch = 6\nglm_accuracy_switch = 
5\nfixup_nanpreds = true\nfixup_infX = true\nrulefit_interpretability_switch = 1\nrulefit_accuracy_switch = 8\nenable_cache_final_pipeline = false\ntop_pid = -1\nserver_pid = -1\ngolden_fatal = false\nnotification_url = "https://s3.amazonaws.com/ai.h2o.notifications/dai_notifications_prod.json"\nping_run_type = "fork"\nlock_logs = true\nmax_num_varimp_to_log = 10\nmax_num_varimp_shift_to_log = 10\nlock_logs_server = false\ndebug_feature_cache_files = false\ndebug_final_feature_cache_files = false\ngenerate_fresh_logger_every_log = true\nfinal_munging_memory_reduction_factor = 2\nmunging_memory_overhead_factor = 5\nmunging_memory_available = 0\nmunging_max_workers_verbose = false\nprob_segfault_ga_pipeline = 0.0\nprob_segfault_ga_model = 0.0\nprob_segfault_final_pipeline = 0.0\nprob_segfault_final_model = 0.0\nprob_segfault_fit_transform_ga = 0.0\nprob_segfault_transform_ga = 0.0\ndisallow_segfault_final = false\nprob_segfault_fit_transform_final = 0.0\nprob_segfault_transform_final = 0.0\nprob_segfault_lightgbm = 0.0\nprob_segfault_xgboost = 0.0\nprob_segfault_gblinear = 0.0\nprob_segfault_tensorflow = 0.0\nprob_segfault_rulefit = 0.0\nprob_stall_ga_pipeline = 0.0\nprob_stall_fitmodel = 0.0\nprob_stall_final_pipeline = 0.0\nprob_stall_kill_ga_pipeline = 0.0\nprob_stall_kill_fitmodel = 0.0\nprob_stall_kill_final_pipeline = 0.0\nprob_finish_early = 0.0\nprob_abort_early = 0.0\nstall_subprocess_submission_cpu_threshold_pct = 100\nstall_subprocess_submission_dai_fork_threshold_pct = -1.0\nstall_subprocess_submission_experiment_fork_threshold_pct = -1.0\nstall_by_memory = true\nstall_by_cores = true\nstall_by_disk = true\nrestrict_initpool_by_memory = true\nrestrict_initpool_by_cores = false\nbase_fork_count = 2\nexperiment_nice_level = 100\nrandom_control = false\ndill_for_model = false\ngzip_for_model = true\nbzip_for_model = false\ncompression_level_for_model = 5\ncheck_nvidia_smi_during_experiment = false\nsmall_pool_sleeptouse = 0.005\nlarge_pool_sleeptouse = 
0.01\nonetask_sleeptouse = 0.01\nonetask_fast_sleeptouse = 0.0001\npool_default_sleeptouse = 0.001\ntrace_detect_types = false\ntrace_fit_transform = false\ntrace_final_fit_transform = false\ntrace_final_transform = false\nterminate_experiment_if_server_lost = false\nshow_all_filesystems = false\nuse_uuids = true\nresumed_experiment_id = ""\nexperiment_id = "36e1a3aa-9441-11ea-8849-ac1f6b46eb80"\nexperiment_tmp_dir = "./tmp/h2oai_experiment_36e1a3aa-9441-11ea-8849-ac1f6b46eb80"\nreproducible = true\nfast_import = false\nserver_recipe_url = ""\nnum_rows_acceptance_test_custom_transformer = 200\nnum_rows_acceptance_test_custom_model = 100\nenable_custom_transformers = false\nenable_custom_models = false\nenable_custom_scorers = false\nenable_custom_datas = true\nrecipe_load_raise_on_first_error = false\nrecipe_load_raise_on_first_error_but_keep_accepted_recipes = false\nrecipe_load_raise_on_any_error = true\nversion_old_custom = false\nversion_old_custom_by_filename = false\nwrite_recipes_to_experiment_folder = true\nwrite_recipes_to_experiment_logger = false\ncontrib_relative_directory = "contrib"\ncontrib_env_relative_directory = "contrib/env"\npip_install_overall_retries = 2\npip_install_verbosity = 2\npip_install_timeout = 15\npip_install_retries = 5\npip_install_options = ""\nenable_basic_acceptance_tests = true\nenable_acceptance_tests = true\nacceptance_test_timeout = 20.0\nacceptance_tests_max_number_of_parameter_combinations = 20\ncontrib_reload_and_recheck_server_start = true\nrecipe_test_import_after_load_package = false\ndata_recipe_preview_num_rows = 20\ndebug_custom = false\nallow_any_call_to_skip_failures = true\nskip_transformer_failures = true\nskip_model_failures = true\ndetailed_skip_failure_messages_level = 1\nconfig_overrides = ""\nprotect_base_env = false\ncheck_cuda_context_testing = false\ntest_scoring_simple = true\nenable_dataset_downloading = true\naudit_log_retention_period = 5\nenable_advanced_features_experiment = 
false\nenable_artifacts_upload = false\nartifacts_store = "file_system"\nartifacts_file_system_directory = "tmp"\nartifacts_s3_bucket = ""\nartifacts_git_user = "git"\nartifacts_git_repo = ""\nartifacts_git_branch = "dev"\nartifacts_git_ssh_private_key_file_location = ""\nenable_imputation = false\nnum_models_for_resume_graph = 1000\nbase_url = "/"\n\n[h2o_recipes_kwargs]\n\n[params_lightgbm]\n\n[params_xgboost]\n\n[params_dart]\n\n[params_tensorflow]\n\n[params_gblinear]\n\n[params_decision_tree]\n\n[params_rulefit]\n\n[params_ftrl]\n\n[params_tune_lightgbm]\n\n[params_tune_xgboost]\n\n[params_tune_dart]\n\n[params_tune_tensorflow]\n\n[params_tune_gblinear]\n\n[params_tune_rulefit]\n\n[params_tune_ftrl]\n\n[recipe_dict]\n\n[extra_http_headers]\n
#> log_file_path
#> 1 h2oai_experiment_36e1a3aa-9441-11ea-8849-ac1f6b46eb80/h2oai_experiment_logs_36e1a3aa-9441-11ea-8849-ac1f6b46eb80.zip
#> pickle_path
#> 1 h2oai_experiment_36e1a3aa-9441-11ea-8849-ac1f6b46eb80/best_individual.pickle
#> summary_path
#> 1 h2oai_experiment_36e1a3aa-9441-11ea-8849-ac1f6b46eb80/h2oai_experiment_summary_36e1a3aa-9441-11ea-8849-ac1f6b46eb80.zip
#> train_predictions_path
#> 1 h2oai_experiment_36e1a3aa-9441-11ea-8849-ac1f6b46eb80/train_preds.csv
#> valid_predictions_path
#> 1
#> test_predictions_path
#> 1 h2oai_experiment_36e1a3aa-9441-11ea-8849-ac1f6b46eb80/test_preds.csv
#> progress status training_duration score_f_name score test_score
#> 1 1 0 62.9923 AUC 0.6939041 0.7039458
#> deprecated model_file_size diagnostic_keys
#> 1 FALSE 336770901 NULL
Similarly to dai.get_frame, you can obtain an instance of DAIModel by calling dai.get_model:
dai.get_model(models$key[1])
#> h2oai_experiment_36e1a3aa-9441-11ea-8849-ac1f6b46eb80/h2oai_experiment_summary_36e1a3aa-9441-11ea-8849-ac1f6b46eb80.zip
Finally, datasets and models can be removed with dai.rm:
dai.rm(model, creditcard, splits$train, splits$test, iris_dai)
#> Model 36e1a3aa-9441-11ea-8849-ac1f6b46eb80 removed
#> Dataset 342cde54-9441-11ea-8849-ac1f6b46eb80 removed
#> Dataset 3594a255-9441-11ea-8849-ac1f6b46eb80 removed
#> Dataset 3594a256-9441-11ea-8849-ac1f6b46eb80 removed
#> Dataset 342cde55-9441-11ea-8849-ac1f6b46eb80 removed
By default, the function dai.rm deletes the objects both from the server and from the R session. If you wish to remove them only from the server, set from_session=FALSE. Please note that only objects passed as plain variable names can be removed from the session. In the example above, splits$train and splits$test are deleted from the server but will not be removed from the R session, because they are actually function calls (recall that $ is a function).
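The distinction between a variable name and a function call can be checked in base R without a Driverless AI connection. The sketch below is only an analogy: the list `splits` is a stand-in for the object from the example above, and base `rm()` is used because it has a similar restriction. It does not show how dai.rm is implemented.

```r
# A plain list standing in for the `splits` object (hypothetical contents).
splits <- list(train = "training frame", test = "test frame")

# A bare variable name is a symbol ...
name_is_symbol <- is.name(quote(splits))

# ... whereas `splits$train` is a call to the function `$`.
dollar_is_call <- is.call(quote(splits$train))
same_value <- identical(splits$train, `$`(splits, "train"))

# Base rm() accepts only names or character strings, so passing
# `splits$train` raises an error instead of removing anything.
rm_result <- try(rm(splits$train), silent = TRUE)
rm_failed <- inherits(rm_result, "try-error")

# Binding the element to its own variable gives rm() a name to remove.
train <- splits$train
rm(train)
train_gone <- !exists("train", inherits = FALSE)
```

Following the same idea, one way to have an object removed from the session as well is to assign it to its own variable first and pass that variable instead of a `$` expression.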