Experiments ¶

Interact with experiments in the Driverless AI server.

create ¶

create(
    train_dataset: Dataset,
    target_column: str | None,
    task: str,
    force: bool = False,
    name: str = None,
    description: str = None,
    **kwargs: Any
) -> Experiment

Creates an experiment in the Driverless AI server.

Parameters:

train_dataset (Dataset) –

Dataset object.
target_column (str | None) –

Name of the target column in train_dataset (pass None if task is 'unsupervised').
task (str) –

One of 'regression', 'classification', or 'unsupervised'.
force (bool, default: False ) –

Create a new experiment even if an experiment with the same name already exists.
name (str, default: None ) –

The display name for the experiment.
description (str, default: None ) –

Description of the experiment (only available from Driverless AI version 2.0 onwards).

Other Parameters:

accuracy (int) –

Accuracy setting [1-10].
time (int) –

Time setting [1-10].
interpretability (int) –

Interpretability setting [1-10].
scorer (Union[str, ScorerRecipe]) –

Metric to optimize for.
models (Union[str, ModelRecipe]) –

Limit experiments to these models.
transformers (Union[str, TransformerRecipe]) –

Limit experiments to these transformers.
validation_dataset (Dataset) –

Dataset object.
test_dataset (Dataset) –

Dataset object.
weight_column (str) –

Name of the weight column in train_dataset.
fold_column (str) –

Name of the fold column in train_dataset.
time_column (str) –

Name of the column in train_dataset that contains the time ordering for time series problems.
time_groups_columns (List[str]) –

List of column names, contributing to time ordering.
unavailable_at_prediction_time_columns (List[str]) –

List of column names, which won't be present at prediction time.
drop_columns (List[str]) –

List of column names that need to be dropped.
enable_gpus (bool) –

Allow the usage of GPUs in the experiment.
reproducible (bool) –

Set the experiment to be reproducible.
time_period_in_seconds (int) –

The length of the time period in seconds, used in the timeseries problems.
num_prediction_periods (int) –

Timeseries forecast horizon in time period units.
num_gap_periods (int) –

The number of time periods after which the forecast starts.
config_overrides (str) –

Driverless AI config overrides in TOML string format.

Example: Create an experiment.

import driverlessai

client = driverlessai.Client(
    address='http://localhost:12345', username='py', password='py'
)
train_dataset = client.datasets.create(
    data="s3://h2o-public-test-data/smalldata/airlines/AirlinesTrain.csv",
    data_source="s3",
    name="Airlines-train-data",
    description="My airline training dataset",
)
# The test dataset gets its own name so that it does not clash with the training dataset.
test_dataset = client.datasets.create(
    data="s3://h2o-public-test-data/smalldata/airlines/AirlinesTest.csv",
    data_source="s3",
    name="Airlines-test-data",
    description="My airline test dataset",
)
# Keyword names follow the signature above: train_dataset and target_column.
experiment = client.experiments.create(
    train_dataset=train_dataset,
    target_column=train_dataset.columns[-1],
    test_dataset=test_dataset,
    task='classification',
    scorer='F1',
    accuracy=5,
    time=5,
    interpretability=5,
    name='demo_day_experiment'
)

Note

Any expert setting can also be passed as a kwarg. To search possible expert settings for your server version, use experiments.search_expert_settings(search_term).
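
The documented value ranges can be checked on the client side before an experiment is launched. The following is a minimal sketch, not part of the client: `build_create_kwargs` is a hypothetical helper, the default of 5 for each dial is an arbitrary choice, and expert settings are passed through without validation.

```python
from typing import Any, Dict, Optional

VALID_TASKS = ("regression", "classification", "unsupervised")


def build_create_kwargs(
    task: str,
    target_column: Optional[str],
    accuracy: int = 5,
    time: int = 5,
    interpretability: int = 5,
    **expert_settings: Any,
) -> Dict[str, Any]:
    """Check the documented ranges and return kwargs for experiments.create()."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    if task == "unsupervised" and target_column is not None:
        raise ValueError("target_column must be None when task is 'unsupervised'")
    dials = {"accuracy": accuracy, "time": time, "interpretability": interpretability}
    for dial, value in dials.items():
        # Each dial is documented as a setting in [1-10].
        if not 1 <= value <= 10:
            raise ValueError(f"{dial} must be in [1-10], got {value}")
    # Expert settings are forwarded untouched; this sketch does not validate them.
    return {"task": task, "target_column": target_column, **dials, **expert_settings}
```

The result could then be unpacked into the call, for example `client.experiments.create(train_dataset=train_dataset, **build_create_kwargs('classification', 'IsDepDelayed'))`, where the column name is only a placeholder.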

create_async ¶

create_async(
    train_dataset: Dataset,
    target_column: str | None,
    task: str,
    force: bool = False,
    name: str = None,
    description: str | None = None,
    **kwargs: Any
) -> Experiment

Launches the creation of an experiment in the Driverless AI server and returns an experiment object to track the experiment status.

Parameters:

train_dataset (Dataset) –

Dataset object.
target_column (str | None) –

The name of the target column in train_dataset (pass None if task is 'unsupervised').
task (str) –

One of 'regression', 'classification', or 'unsupervised'.
force (bool, default: False ) –

Create a new experiment even if an experiment with the same name already exists.
name (str, default: None ) –

The display name for the experiment.
description (str | None, default: None ) –

Description of the experiment (only available from Driverless AI version 2.0 onwards).

Other Parameters:

accuracy (int) –

Accuracy setting [1-10].
time (int) –

Time setting [1-10].
interpretability (int) –

Interpretability setting [1-10].
scorer (Union[str, ScorerRecipe]) –

Metric to optimize for.
models (Union[str, ModelRecipe]) –

Limit experiments to these models.
transformers (Union[str, TransformerRecipe]) –

Limit experiments to these transformers.
validation_dataset (Dataset) –

Dataset object.
test_dataset (Dataset) –

Dataset object.
weight_column (str) –

Name of the weight column in train_dataset.
fold_column (str) –

Name of the fold column in train_dataset.
time_column (str) –

Name of the column in train_dataset that contains the time ordering for time series problems.
time_groups_columns (List[str]) –

List of column names, contributing to time ordering.
unavailable_at_prediction_time_columns (List[str]) –

List of column names, which won't be present at prediction time.
drop_columns (List[str]) –

List of column names that need to be dropped.
enable_gpus (bool) –

Allow the usage of GPUs in the experiment.
reproducible (bool) –

Set the experiment to be reproducible.
time_period_in_seconds (int) –

The length of the time period in seconds, used in the timeseries problems.
num_prediction_periods (int) –

Timeseries forecast horizon in time period units.
num_gap_periods (int) –

The number of time periods after which the forecast starts.
config_overrides (str) –

Driverless AI config overrides in TOML string format.

Example: Create an async experiment.

import driverlessai

client = driverlessai.Client(
    address='http://localhost:12345', username='py', password='py'
)
train_dataset = client.datasets.create(
    data="s3://h2o-public-test-data/smalldata/airlines/AirlinesTrain.csv",
    data_source="s3",
    name="Airlines-train-data",
    description="My airline training dataset",
)
# The test dataset gets its own name so that it does not clash with the training dataset.
test_dataset = client.datasets.create(
    data="s3://h2o-public-test-data/smalldata/airlines/AirlinesTest.csv",
    data_source="s3",
    name="Airlines-test-data",
    description="My airline test dataset",
)
# Keyword names follow the signature above: train_dataset and target_column.
experiment = client.experiments.create_async(
    train_dataset=train_dataset,
    target_column=train_dataset.columns[-1],
    test_dataset=test_dataset,
    task='classification',
    scorer='F1',
    accuracy=5,
    time=5,
    interpretability=5,
    name='demo_day_experiment'
)

Note

Any expert setting can also be passed as a kwarg. To search possible expert settings for your server version, use experiments.search_expert_settings(search_term).
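
Because create_async returns immediately, several experiments can be launched before waiting on any of them. The sketch below illustrates that pattern under stated assumptions: `run_accuracy_sweep` and the `sweep-accuracy-N` naming scheme are hypothetical, and `experiments` is expected to be the `client.experiments` object.

```python
from typing import Any, Iterable, List, Optional


def run_accuracy_sweep(
    experiments: Any,
    train_dataset: Any,
    target_column: Optional[str],
    task: str,
    accuracies: Iterable[int],
) -> List[Any]:
    """Launch one experiment per accuracy setting, then wait for all of them."""
    launched = [
        experiments.create_async(
            train_dataset=train_dataset,
            target_column=target_column,
            task=task,
            accuracy=accuracy,
            name=f"sweep-accuracy-{accuracy}",  # hypothetical naming scheme
        )
        for accuracy in accuracies
    ]
    # result() blocks until training completes and returns the experiment itself.
    return [experiment.result(silent=True) for experiment in launched]
```

How many experiments actually train in parallel depends on the server's resources and configuration, which this sketch does not inspect.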

get ¶

get(key: str) -> Experiment

Returns an Experiment object corresponding to an experiment on the Driverless AI server. If the experiment only exists on H2O.ai Storage, it will be imported to the server first.

Parameters:

key (str) –

Driverless AI server's unique ID for the experiment.

get_by_name ¶

get_by_name(name: str) -> Experiment | None

Get the experiment specified by the name.

Parameters:

name (str) –

Name of the experiment.

Returns:

Experiment | None –

The experiment with the name if exists, otherwise None.

Beta API

A beta API that is subject to future changes.
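
Since get_by_name returns None when no experiment matches, it can back a simple "reuse or create" pattern. This is a minimal sketch: `get_or_create_experiment` is a hypothetical helper, and `experiments` is assumed to be the `client.experiments` object.

```python
from typing import Any


def get_or_create_experiment(experiments: Any, name: str, **create_kwargs: Any) -> Any:
    """Return the experiment called `name`, creating it only if it is missing."""
    existing = experiments.get_by_name(name)
    if existing is not None:
        return existing
    # create() blocks until the new experiment has finished training.
    return experiments.create(name=name, **create_kwargs)
```

The remaining keyword arguments are forwarded to create() unchanged, so they must include train_dataset, target_column, and task.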

gui ¶

gui() -> Hyperlink

Returns the complete URL to the experiment details page in the Driverless AI server.

import_dai_file ¶

import_dai_file(
    path: str, file_system: AbstractFileSystem | None = None
) -> Experiment

Imports a DAI file to the Driverless AI server and returns a corresponding Experiment object.

Parameters:

path (str) –

Path to the .dai file.
file_system (AbstractFileSystem | None, default: None ) –

The FSSPEC based file system to read the file from, instead of the local file system.

leaderboard ¶

leaderboard(
    train_dataset: Dataset,
    target_column: str | None,
    task: str,
    force: bool = False,
    name: str = None,
    **kwargs: Any
) -> Project

Launches an experiment leaderboard in the Driverless AI server and returns a project object to track experiment statuses.

Parameters:

train_dataset (Dataset) –

Dataset object.
target_column (str | None) –

The name of the target column in train_dataset (pass None if task is 'unsupervised').
task (str) –

One of 'regression', 'classification', or 'unsupervised'.
force (bool, default: False ) –

Create a new project even if a project with the same name already exists.
name (str, default: None ) –

The display name for the project.

Other Parameters:

accuracy (int) –

Accuracy setting [1-10].
time (int) –

Time setting [1-10].
interpretability (int) –

Interpretability setting [1-10].
scorer (Union[str, ScorerRecipe]) –

Metric to optimize for.
models (Union[str, ModelRecipe]) –

Limit experiments to these models.
transformers (Union[str, TransformerRecipe]) –

Limit experiments to these transformers.
validation_dataset (Dataset) –

Dataset object.
test_dataset (Dataset) –

Dataset object.
weight_column (str) –

Name of the weight column in train_dataset.
fold_column (str) –

Name of the fold column in train_dataset.
time_column (str) –

Name of the column in train_dataset that contains the time ordering for time series problems.
time_groups_columns (List[str]) –

List of column names, contributing to time ordering.
unavailable_at_prediction_time_columns (List[str]) –

List of column names, which won't be present at prediction time.
drop_columns (List[str]) –

List of column names that need to be dropped.
enable_gpus (bool) –

Allow the usage of GPUs in the experiment.
reproducible (bool) –

Set the experiment to be reproducible.
time_period_in_seconds (int) –

The length of the time period in seconds, used in the timeseries problems.
num_prediction_periods (int) –

Timeseries forecast horizon in time period units.
num_gap_periods (int) –

The number of time periods after which the forecast starts.
config_overrides (str) –

Driverless AI config overrides in TOML string format.

Returns:

Project –

A project object to track experiment statuses.

Note

Any expert setting can also be passed as a kwarg. To search possible expert settings for your server version, use experiments.search_expert_settings(search_term).

list ¶

list(start_index: int = 0, count: int = None) -> Sequence[Experiment]

Returns a list of the Experiment objects available to the user.

Parameters:

start_index (int, default: 0 ) –

The index number on the Driverless AI server of the first experiment in the list.
count (int, default: None ) –

The number of experiments to request from the Driverless AI server.

Returns:

Sequence[Experiment] –

Experiments.
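
The start_index and count parameters allow experiments to be fetched page by page instead of all at once. The sketch below shows one way to do that; `iter_experiments` is a hypothetical helper, and it assumes that a page shorter than `page_size` marks the end of the listing.

```python
from typing import Any, Iterator


def iter_experiments(experiments: Any, page_size: int = 100) -> Iterator[Any]:
    """Yield every experiment, requesting `page_size` items per call to list()."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    start_index = 0
    while True:
        page = experiments.list(start_index=start_index, count=page_size)
        yield from page
        # Assumption: a short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        start_index += page_size
```

For example, `for experiment in iter_experiments(client.experiments, page_size=50): print(experiment.name)` would walk the whole listing in batches of 50.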

preview ¶

preview(
    train_dataset: Dataset,
    target_column: str | None,
    task: str,
    force: bool | None = None,
    name: str | None = None,
    **kwargs: Any
) -> None

Prints a preview of an experiment for the given settings.

Parameters:

train_dataset (Dataset) –

Dataset object.
target_column (str | None) –

The name of the target column in train_dataset (pass None if task is 'unsupervised').
task (str) –

One of 'regression', 'classification', or 'unsupervised'.
force (bool | None, default: None ) –

Ignored (preview accepts the same arguments as create).
name (str | None, default: None ) –

Ignored (preview accepts the same arguments as create).

Other Parameters:

accuracy (int) –

Accuracy setting [1-10].
time (int) –

Time setting [1-10].
interpretability (int) –

Interpretability setting [1-10].
scorer (Union[str, ScorerRecipe]) –

Metric to optimize for.
models (Union[str, ModelRecipe]) –

Limit experiments to these models.
transformers (Union[str, TransformerRecipe]) –

Limit experiments to these transformers.
validation_dataset (Dataset) –

Dataset object.
test_dataset (Dataset) –

Dataset object.
weight_column (str) –

Name of the weight column in train_dataset.
fold_column (str) –

Name of the fold column in train_dataset.
time_column (str) –

Name of the column in train_dataset that contains the time ordering for time series problems.
time_groups_columns (List[str]) –

List of column names, contributing to time ordering.
unavailable_at_prediction_time_columns (List[str]) –

List of column names, which won't be present at prediction time.
drop_columns (List[str]) –

List of column names that need to be dropped.
enable_gpus (bool) –

Allow the usage of GPUs in the experiment.
reproducible (bool) –

Set the experiment to be reproducible.
time_period_in_seconds (int) –

The length of the time period in seconds, used in the timeseries problems.
num_prediction_periods (int) –

Timeseries forecast horizon in time period units.
num_gap_periods (int) –

The number of time periods after which the forecast starts.
config_overrides (str) –

Driverless AI config overrides in TOML string format.

Note

Any expert setting can also be passed as a kwarg. To search possible expert settings for your server version, use experiments.search_expert_settings(search_term).

search_expert_settings ¶

search_expert_settings(
    search_term: str, show_description: bool = False
) -> None

Search expert settings and print results. Useful when looking for kwargs to use when creating experiments.

Parameters:

search_term (str) –

Term to search for (case-insensitive).
show_description (bool, default: False ) –

Include description in results.

Experiment ¶

Interact with an experiment in the Driverless AI server.

artifacts `property` ¶

artifacts: ExperimentArtifacts

Artifacts that are created from a completed experiment.

Returns:

ExperimentArtifacts –

creation_timestamp `property` ¶

creation_timestamp: float

Creation timestamp in seconds since the epoch (POSIX timestamp).

Returns:

float –
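
A POSIX timestamp is easier to read once converted to a timezone-aware datetime. The following sketch uses only the Python standard library; `creation_time_utc` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def creation_time_utc(creation_timestamp: float) -> datetime:
    """Convert seconds since the epoch into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(creation_timestamp, tz=timezone.utc)
```

Calling `creation_time_utc(experiment.creation_timestamp).isoformat()` would then give an ISO 8601 string suitable for logs or reports.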

datasets `property` ¶

datasets: dict[str, Dataset | None]

Dictionary of train_dataset, validation_dataset, and test_dataset used for the experiment.

Example: Get train/valid/test datasets in the experiment.

datasets = experiment.datasets  # datasets is a property, so it is not called
train_dataset = datasets["train_dataset"]
validation_dataset = datasets["validation_dataset"]
test_dataset = datasets["test_dataset"]

Returns:

dict[str, Dataset | None] –
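
Because the dictionary values may be None, it can be useful to check which dataset roles were left unset. A minimal sketch, where `missing_dataset_roles` is a hypothetical helper:

```python
from typing import Any, Dict, List, Optional


def missing_dataset_roles(datasets: Dict[str, Optional[Any]]) -> List[str]:
    """Return the keys (for example 'test_dataset') whose value is None."""
    return [role for role, dataset in datasets.items() if dataset is None]
```

For an experiment trained without a validation dataset, `missing_dataset_roles(experiment.datasets)` would be expected to include 'validation_dataset'.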

description `property` ¶

description: str | None

Description of the experiment.

Driverless AI version requirement

Requires Driverless AI server 2.0 or higher.

Returns:

str | None –

is_deprecated `property` ¶

is_deprecated: bool

True if experiment was created by an old version of Driverless AI and is no longer fully compatible with the current server version.

Returns:

bool –

key `property` ¶

key: str

Universally unique key of the entity.

Returns:

str –

log `property` ¶

log: ExperimentLog

Interact with experiment logs.

Returns:

ExperimentLog –

metric_plots `property` ¶

metric_plots: ExperimentMetricPlots | None

Metric plots of this experiment.

Beta API

A beta API that is subject to future changes.

Returns:

ExperimentMetricPlots | None –

name `property` ¶

name: str

Name of the entity.

Returns:

str –

run_duration `property` ¶

run_duration: float | None

Run duration in seconds.

Returns:

float | None –
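
Since run_duration may be None, code that displays it needs a fallback. The sketch below formats the value with the standard library; `format_run_duration` and the "not available" placeholder are hypothetical choices.

```python
from datetime import timedelta
from typing import Optional


def format_run_duration(run_duration: Optional[float]) -> str:
    """Format a duration in seconds as H:MM:SS, or a placeholder when it is None."""
    if run_duration is None:
        return "not available"
    # Round to whole seconds so the output has no fractional part.
    return str(timedelta(seconds=round(run_duration)))
```

For example, a duration of 3725.4 seconds is rendered as `1:02:05`.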

settings `property` ¶

settings: dict[str, Any]

Experiment settings.

Returns:

dict[str, Any] –

size `property` ¶

size: int

Size in bytes of all experiment's files in the Driverless AI server.

Returns:

int –

summary `property` ¶

summary: str | None

An experiment summary that provides a brief overview of the experiment setup and results.

Returns:

str | None –

abort ¶

abort() -> None

Terminate experiment immediately and only generate logs.

autodoc ¶

autodoc() -> AutoDoc

Returns the autodoc generated for this experiment. If it has not been generated yet, creates a new autodoc and returns it.

compare_settings_with ¶

compare_settings_with(experiment_to_compare_with: Experiment) -> Table

Compares settings of the experiment with another experiment.

Parameters:

experiment_to_compare_with (Experiment) –

The experiment to compare the settings with.

Returns:

Table –

A comparison table highlighting any differences in settings between the experiment and another specified experiment.

compare_setup_with ¶

compare_setup_with(experiment_to_compare_with: Experiment) -> dict[str, Table]

Compares the setup of the experiment with another given experiment.

Parameters:

experiment_to_compare_with (Experiment) –

The experiment to compare the setups with.

delete ¶

delete() -> None

Permanently deletes the experiment from the Driverless AI server.

export_dai_file ¶

export_dai_file(
    dst_dir: str = ".",
    dst_file: str | None = None,
    file_system: AbstractFileSystem | None = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Export the experiment from Driverless AI server in DAI format.

Parameters:

dst_dir (str, default: '.' ) –

The path to the directory where the DAI file will be saved.
dst_file (str | None, default: None ) –

The name of the DAI file (overrides default file name).
file_system (AbstractFileSystem | None, default: None ) –

The FSSPEC based file system to download to, instead of the local file system.
overwrite (bool, default: False ) –

Overwrite the existing file.
timeout (float, default: 30 ) –

Connection timeout in seconds.
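
Combined with import_dai_file, this method can move an experiment between servers. The sketch below assumes that the string returned by export_dai_file is the path of the saved DAI file, which the signature suggests but this page does not state; `copy_experiment` is a hypothetical helper.

```python
from typing import Any


def copy_experiment(experiment: Any, target_experiments: Any, dst_dir: str = ".") -> Any:
    """Export `experiment` as a DAI file and import it through `target_experiments`."""
    # Assumption: the returned string is the path to the exported .dai file.
    dai_path = experiment.export_dai_file(dst_dir=dst_dir, overwrite=True)
    return target_experiments.import_dai_file(dai_path)
```

Here `target_experiments` would be the `experiments` attribute of a second client, for example `other_client.experiments`.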

export_triton_model ¶

export_triton_model(
    deploy_predictions: bool = True,
    deploy_shapley: bool = False,
    deploy_original_shapley: bool = False,
    enable_high_concurrency: bool = False,
) -> TritonModelArtifact

Exports the model of this experiment as a Triton model.

Parameters:

deploy_predictions (bool, default: True ) –

Whether to deploy model predictions.
deploy_shapley (bool, default: False ) –

Whether to deploy model Shapley values.
deploy_original_shapley (bool, default: False ) –

Whether to deploy model original Shapley values.
enable_high_concurrency (bool, default: False ) –

Whether to enable handling multiple requests at once.

Returns:

TritonModelArtifact –

A Triton model.

Beta API

A beta API that is subject to future changes.

finish ¶

finish() -> None

Finish experiment by jumping to final pipeline training and generating experiment artifacts.

fit_and_transform ¶

fit_and_transform(
    training_dataset: Dataset,
    validation_split_fraction: float = 0,
    seed: int = 1234,
    fold_column: str = None,
    test_dataset: Dataset = None,
    validation_dataset: Dataset = None,
) -> FitAndTransformation

Transform a dataset, then return a FitAndTransformation object.

Parameters:

training_dataset (Dataset) –

The dataset to be used for refitting the data transformation pipeline.
validation_split_fraction (float, default: 0 ) –

The fraction of data used for validation.
seed (int, default: 1234 ) –

A random seed to use to start a random generator.
fold_column (str, default: None ) –

The column to create a stratified validation split.
test_dataset (Dataset, default: None ) –

The dataset to be used for final testing.
validation_dataset (Dataset, default: None ) –

The dataset to be used for tuning the parameters of models.

fit_and_transform_async ¶

fit_and_transform_async(
    training_dataset: Dataset,
    validation_split_fraction: float = 0,
    seed: int = 1234,
    fold_column: str = None,
    test_dataset: Dataset = None,
    validation_dataset: Dataset = None,
) -> FitAndTransformationJob

Launch transform job on a dataset and return a FitAndTransformationJob object to track the status.

Parameters:

training_dataset (Dataset) –

The dataset to be used for refitting the data transformation pipeline.
validation_split_fraction (float, default: 0 ) –

The fraction of data used for validation.
seed (int, default: 1234 ) –

A random seed to use to start a random generator.
fold_column (str, default: None ) –

The column to create a stratified validation split.
test_dataset (Dataset, default: None ) –

The dataset to be used for final testing.
validation_dataset (Dataset, default: None ) –

The dataset to be used for tuning the parameters of models.

get_linked_projects ¶

get_linked_projects() -> list[Project]

Get all the projects that the current experiment belongs to.

Driverless AI version requirement

Requires Driverless AI server 1.10.5 or higher.

get_previous_predictions ¶

get_previous_predictions() -> list[Prediction]

Get all previous predictions of the current experiment.

Beta API

A beta API that is subject to future changes.

Driverless AI version requirement

Requires Driverless AI server 1.11.0 or higher.

gui ¶

gui() -> Hyperlink

Obtains the complete URL for the experiment's page in the Driverless AI server.

is_complete ¶

is_complete() -> bool

Whether the job has been completed successfully.

Returns:

bool –

True if the job has been completed successfully, otherwise False.

is_running ¶

is_running() -> bool

Whether the job has been scheduled or is running, finishing, or syncing.

Returns:

bool –

True if the job has not completed yet, otherwise False.
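
The is_running, status, and is_complete methods are enough to build a custom polling loop, for instance when progress should go to a logger instead of the console. This is a minimal sketch: `wait_until_done` is a hypothetical helper, and its `sleep` and `report` parameters exist only to make the loop easy to customize and test.

```python
import time
from typing import Any, Callable


def wait_until_done(
    experiment: Any,
    poll_seconds: float = 5.0,
    sleep: Callable[[float], Any] = time.sleep,
    report: Callable[[str], Any] = print,
) -> bool:
    """Poll until the experiment stops running; return True if it completed."""
    while experiment.is_running():
        # verbose=1 gives a short description with a progress percentage.
        report(experiment.status(verbose=1))
        sleep(poll_seconds)
    return experiment.is_complete()
```

A return value of False means the experiment stopped without completing successfully, for example after abort() was called.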

metrics ¶

metrics() -> dict[str, str | float]

Return dictionary of experiment scorer metrics and AUC metrics, if available.
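
The returned dictionary mixes string and float values, so numeric comparisons need a filtering step first. A minimal sketch, where `numeric_metrics` is a hypothetical helper and no particular key names are assumed:

```python
from typing import Dict, Union


def numeric_metrics(metrics: Dict[str, Union[str, float]]) -> Dict[str, float]:
    """Keep only the entries whose value is a number."""
    return {
        name: float(value)
        for name, value in metrics.items()
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }
```

It would typically be applied as `numeric_metrics(experiment.metrics())`.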

notifications ¶

notifications() -> list[dict[str, str]]

Return list of experiment notification dictionaries.

predict ¶

predict(
    dataset: Dataset | DataFrame,
    enable_mojo: bool = True,
    include_columns: list[str] | None = None,
    include_labels: bool | None = None,
    include_raw_outputs: bool | None = None,
    include_shap_values_for_original_features: bool | None = None,
    include_shap_values_for_transformed_features: bool | None = None,
    use_fast_approx_for_shap_values: bool | None = None,
) -> Prediction

Predict on a dataset, then return a Prediction object.

Parameters:

dataset (Dataset | DataFrame) –

A Dataset or a Pandas DataFrame that can be predicted.
enable_mojo (bool, default: True ) –

Use MOJO (if available) to make predictions.
include_columns (list[str] | None, default: None ) –

The list of columns from the dataset to append to the prediction CSV.
include_labels (bool | None, default: None ) –

Append labels in addition to probabilities for classification, ignored for regression.
include_raw_outputs (bool | None, default: None ) –

Append predictions as margins (in link space) to the prediction CSV.
include_shap_values_for_original_features (bool | None, default: None ) –

Append original feature contributions to the prediction CSV.
include_shap_values_for_transformed_features (bool | None, default: None ) –

Append transformed feature contributions to the prediction CSV.
use_fast_approx_for_shap_values (bool | None, default: None ) –

Speed up prediction contributions with approximation.

predict_async ¶

predict_async(
    dataset: Dataset | DataFrame,
    enable_mojo: bool = True,
    include_columns: list[str] | None = None,
    include_labels: bool | None = None,
    include_raw_outputs: bool | None = None,
    include_shap_values_for_original_features: bool | None = None,
    include_shap_values_for_transformed_features: bool | None = None,
    use_fast_approx_for_shap_values: bool | None = None,
) -> PredictionJobs

Launch prediction job on a dataset and return a PredictionJobs object to track the status.

Parameters:

dataset (Dataset | DataFrame) –

A Dataset or a Pandas DataFrame that can be predicted.
enable_mojo (bool, default: True ) –

Use MOJO (if available) to make predictions.
include_columns (list[str] | None, default: None ) –

The list of columns from the dataset to append to the prediction CSV.
include_labels (bool | None, default: None ) –

Append labels in addition to probabilities for classification, ignored for regression.
include_raw_outputs (bool | None, default: None ) –

Append predictions as margins (in link space) to the prediction CSV.
include_shap_values_for_original_features (bool | None, default: None ) –

Append original feature contributions to the prediction CSV.
include_shap_values_for_transformed_features (bool | None, default: None ) –

Append transformed feature contributions to the prediction CSV.
use_fast_approx_for_shap_values (bool | None, default: None ) –

Speed up prediction contributions with approximation.

redescribe ¶

redescribe(description: str) -> Experiment

Change experiment description.

Parameters:

description (str) –

New description.

Driverless AI version requirement

Requires Driverless AI server 2.0 or higher.

rename ¶

rename(name: str) -> Experiment

Change experiment display name.

Parameters:

name (str) –

New display name.

result ¶

result(silent: bool = False) -> Experiment

Wait for training to complete, then return self.

Parameters:

silent (bool, default: False ) –

If True, do not display status updates.

retrain ¶

retrain(
    use_smart_checkpoint: bool = False,
    final_pipeline_only: bool = False,
    final_models_only: bool = False,
    **kwargs: Any
) -> Experiment

Create a new experiment using the same datasets and settings. Through kwargs it's possible to pass new datasets or overwrite settings.

Parameters:

use_smart_checkpoint (bool, default: False ) –

Start the experiment from the last smart checkpoint.
final_pipeline_only (bool, default: False ) –

Trains the final pipeline using smart checkpoint if available, otherwise uses default hyperparameters.
final_models_only (bool, default: False ) –

Trains the final pipeline models (but not transformers) using smart checkpoint if available, otherwise uses default hyperparameters and transformers (overrides final_pipeline_only).
kwargs (Any, default: {} ) –

Datasets and experiment settings as defined in experiments.create().
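
Settings passed through kwargs override those of the original experiment. The sketch below raises the accuracy dial by one step on retrain. It assumes that the `settings` dictionary exposes the current value under an 'accuracy' key, which this page does not confirm; `retrain_with_higher_accuracy` is a hypothetical helper.

```python
from typing import Any


def retrain_with_higher_accuracy(experiment: Any, step: int = 1) -> Any:
    """Retrain with the accuracy dial raised by `step`, capped at 10."""
    # Assumption: experiment.settings contains an 'accuracy' entry.
    current = experiment.settings.get("accuracy")
    if current is None:
        raise ValueError("experiment.settings has no 'accuracy' entry")
    return experiment.retrain(accuracy=min(10, current + step))
```

The cap at 10 follows the documented [1-10] range of the accuracy setting.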

retrain_async ¶

retrain_async(
    use_smart_checkpoint: bool = False,
    final_pipeline_only: bool = False,
    final_models_only: bool = False,
    **kwargs: Any
) -> Experiment

Launch creation of a new experiment using the same datasets and settings. Through kwargs it's possible to pass new datasets or overwrite settings.

Parameters:

use_smart_checkpoint (bool, default: False ) –

Start the experiment from the last smart checkpoint.
final_pipeline_only (bool, default: False ) –

Trains the final pipeline using smart checkpoint if available, otherwise uses default hyperparameters.
final_models_only (bool, default: False ) –

Trains the final pipeline models (but not transformers) using smart checkpoint if available, otherwise uses default hyperparameters and transformers (overrides final_pipeline_only).
kwargs (Any, default: {} ) –

Datasets and experiment settings as defined in experiments.create().

status ¶

status(verbose: int = 0) -> str

Returns the status of the job.

Parameters:

verbose (int, default: 0 ) –
- 0: A short description.
- 1: A short description with a progress percentage.
- 2: A detailed description with a progress percentage.

Returns:

str –

Current status of the job.

to_dict ¶

to_dict() -> Dict | object

Dump experiment metadata to a Python dictionary.

Beta API

A beta API that is subject to future changes.

transform ¶

transform(
    dataset: Dataset,
    enable_mojo: bool = True,
    include_columns: list[str] | None = None,
    include_labels: bool | None = True,
) -> Transformation

Transform a dataset, then return a Transformation object.

Parameters:

dataset (Dataset) –

A Dataset that can be transformed.
enable_mojo (bool, default: True ) –

Use MOJO (if available) to make transformation.
include_columns (list[str] | None, default: None ) –

List of columns from the dataset to append to the prediction CSV.
include_labels (bool | None, default: True ) –

Append labels in addition to probabilities for classification, ignored for regression.

transform_async ¶

transform_async(
    dataset: Dataset,
    enable_mojo: bool = True,
    include_columns: list[str] | None = None,
    include_labels: bool | None = None,
) -> TransformationJob

Launch transform job on a dataset and return a TransformationJob object to track the status.

Parameters:

dataset (Dataset) –

A Dataset that can be transformed.
enable_mojo (bool, default: True ) –

Use MOJO (if available) to make transformation.
include_columns (list[str] | None, default: None ) –

List of columns from the dataset to append to the prediction CSV.
include_labels (bool | None, default: None ) –

Append labels in addition to probabilities for classification, ignored for regression.

Driverless AI version requirement

Requires Driverless AI server 1.10.4.1 or higher.

variable_importance ¶

variable_importance(
    iteration: int = None, model_index: int = None
) -> Table | None

Get variable importance of an iteration in a Table.

Parameters:

iteration (int, default: None ) –

Zero-based index of the iteration of the experiment.
model_index (int, default: None ) –

The zero-based index of model that was generated in a particular iteration.

ExperimentMetricPlots ¶

Interact with the metric plots of an experiment in the Driverless AI server.

actual_vs_predicted_chart `property` ¶

actual_vs_predicted_chart: dict[str, Any] | None

Actual vs predicted chart for the model.

Returns:

dict[str, Any] | None –

An actual vs predicted chart in Vega Lite (v3) format, or None if the model is a classification model.

gains_chart `property` ¶

gains_chart: dict[str, Any] | None

Cumulative gains chart for the model.

Returns:

dict[str, Any] | None –

A cumulative gains chart in Vega Lite (v3) format, or None if the model is not a classification model.

ks_chart `property` ¶

ks_chart: dict[str, Any] | None

Kolmogorov-Smirnov chart of the model.

Returns:

dict[str, Any] | None –

A Kolmogorov-Smirnov chart in Vega Lite (v3) format, or None if the model is not a classification model.

lift_chart `property` ¶

lift_chart: dict[str, Any] | None

Lift chart of the model.

Returns:

dict[str, Any] | None –

A lift chart in Vega Lite (v3) format, or None if the model is not a classification model.

prec_recall_curve `property` ¶

prec_recall_curve: dict[str, Any] | None

Precision-recall curve of the model.

Returns:

dict[str, Any] | None –

A precision-recall curve in Vega Lite (v3) format, or None if the model is not a classification model.

residual_plot `property` ¶

residual_plot: dict[str, Any] | None

Residual plot with LOESS curve of the model.

Returns:

dict[str, Any] | None –

A residual plot in Vega Lite (v3) format, or None if the model is a classification model.

roc_curve `property` ¶

roc_curve: dict[str, Any] | None

ROC curve of the model.

Returns:

dict[str, Any] | None –

A ROC curve in Vega Lite (v3) format, or None if the model is not a classification model.

confusion_matrix ¶

confusion_matrix(threshold: float = None) -> list[list[Any]] | None

Confusion matrix of the model.

Parameters:

threshold (float, default: None ) –

The threshold value.

Returns:

list[list[Any]] | None –

A confusion matrix as a 2D list, or None if the model is not a classification model.
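
Each chart property returns either a Vega Lite dictionary or None, so the charts that apply to a given model can be collected in one pass. The sketch below writes the available ones to JSON files; `save_available_charts` and the `<property>.json` file naming are hypothetical choices.

```python
import json
import os
from typing import Any, List, Optional

CHART_PROPERTIES = (
    "actual_vs_predicted_chart",
    "gains_chart",
    "ks_chart",
    "lift_chart",
    "prec_recall_curve",
    "residual_plot",
    "roc_curve",
)


def save_available_charts(metric_plots: Optional[Any], dst_dir: str) -> List[str]:
    """Write every chart that is not None to `dst_dir` and return the file paths."""
    if metric_plots is None:  # experiment.metric_plots itself may be None
        return []
    os.makedirs(dst_dir, exist_ok=True)
    saved = []
    for prop in CHART_PROPERTIES:
        chart = getattr(metric_plots, prop)
        if chart is None:
            continue  # this chart does not apply to the model type
        path = os.path.join(dst_dir, prop + ".json")
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(chart, handle)
        saved.append(path)
    return saved
```

The resulting files contain plain Vega Lite specifications and can be opened with any tool that renders that format.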

ExperimentArtifacts ¶

Interact with files created by an experiment in the Driverless AI server.

file_paths `property` ¶

file_paths: dict[str, str]

Paths to artifact files on the server.

Returns:

dict[str, str] –

create ¶

create(artifact: str) -> None

(Re)build certain artifacts, if possible.

(Re)buildable artifacts:

'autodoc'
'mojo_pipeline'
'python_pipeline'

Parameters:

artifact (str) –

The name of the artifact to (re)build.

download ¶

download(
    only: str | list[str] = None,
    dst_dir: str = ".",
    file_system: AbstractFileSystem | None = None,
    include_columns: list[str] | None = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> dict[str, str]

Download experiment artifacts from the Driverless AI server. Returns a dictionary of relative paths for the downloaded artifacts.

Parameters:

only (str | list[str], default: None ) –

The artifacts to download. Use experiment.artifacts.list() to see the artifacts available in the Driverless AI server.
dst_dir (str, default: '.' ) –

The path to the directory where the experiment artifacts will be saved.
file_system (AbstractFileSystem | None, default: None ) –

FSSPEC based file system to download to, instead of local file system.
include_columns (list[str] | None, default: None ) –

The list of dataset columns to append to prediction CSVs.
overwrite (bool, default: False ) –

Overwrite the existing file.
timeout (float, default: 30 ) –

Connection timeout in seconds.
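
Checking list() first avoids requesting artifacts that the experiment does not have. A minimal sketch, where `download_if_available` is a hypothetical helper and `artifacts` is assumed to be the `experiment.artifacts` object:

```python
from typing import Any, Dict, Iterable


def download_if_available(
    artifacts: Any, wanted: Iterable[str], dst_dir: str = "."
) -> Dict[str, str]:
    """Download only those `wanted` artifacts that exist on the server."""
    available = set(artifacts.list())
    present = [name for name in wanted if name in available]
    if not present:
        return {}
    return artifacts.download(only=present, dst_dir=dst_dir, overwrite=True)
```

For example, `download_if_available(experiment.artifacts, ['autodoc', 'mojo_pipeline'], dst_dir='out')` would skip the MOJO pipeline if it has not been built.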

export ¶

export(
    only: str | list[str] | None = None,
    include_columns: list[str] | None = None,
    **kwargs: Any
) -> dict[str, str]

Export experiment artifacts from the Driverless AI server. Returns a dictionary of relative paths for the exported artifacts.

Parameters:

only (str | list[str] | None, default: None ) –

The artifacts to export. Use experiment.artifacts.list() to see the artifacts available in the Driverless AI server.
include_columns (list[str] | None, default: None ) –

The list of dataset columns to append to prediction CSVs.

Note

Export location is configured in the Driverless AI server.

list ¶

list() -> list[str]

List of experiment artifacts that exist in the Driverless AI server.

ExperimentLog ¶

Interact with experiment logs.

file_name `property` ¶

file_name: str

Filename of the log file.

Returns:

str –

download ¶

download(
    archive: bool = True,
    dst_dir: str = ".",
    dst_file: str | None = None,
    file_system: AbstractFileSystem | None = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Download experiment logs from the Driverless AI server.

Parameters:

archive (bool, default: True ) –

Whether to download an archive, if available, that contains multiple log files and any stack traces that were created (recommended).
dst_dir (str, default: '.' ) –

The path to the directory where the logs will be saved.
dst_file (str | None, default: None ) –

The name of the log file (overrides default file name).
file_system (AbstractFileSystem | None, default: None ) –

FSSPEC based file system to download to, instead of the local file system.
overwrite (bool, default: False ) –

Overwrite the existing file.
timeout (float, default: 30 ) –

Connection timeout in seconds.

head ¶

head(num_lines: int = 50) -> str

Returns the first num_lines lines of the log file.

Parameters:

num_lines (int, default: 50 ) –

Number of lines to retrieve.

tail ¶

tail(num_lines: int = 50) -> str

Returns the last num_lines lines of the log file.

Parameters:

num_lines (int, default: 50 ) –

Number of lines to retrieve.
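
Because head and tail return the log text as a single string, a common pattern is to split it into lines and filter them. A minimal sketch follows. The helper tail_matches and the search string in the usage comment are hypothetical.

```python
from __future__ import annotations

from typing import Any


def tail_matches(log: Any, needle: str, num_lines: int = 50) -> list[str]:
    """Return the lines among the last `num_lines` of the log that contain `needle`."""
    text = log.tail(num_lines)
    return [line for line in text.splitlines() if needle in line]


# Hypothetical usage with an existing experiment object:
# for line in tail_matches(experiment.log, "ERROR", num_lines=200):
#     print(line)
```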

PredictionJobs ¶

Monitor the creation of predictions in the Driverless AI server.

included_dataset_columns `property` ¶

included_dataset_columns: list[str]

Columns from the dataset that are appended to predictions.

Returns:

list[str] –

includes_labels `property` ¶

includes_labels: bool

Determines whether classification labels are appended to predictions.

Returns:

bool –

includes_raw_outputs `property` ¶

includes_raw_outputs: bool

Whether predictions as margins (in link space) are appended to predictions.

Returns:

bool –

includes_shap_values_for_original_features `property` ¶

includes_shap_values_for_original_features: bool

Whether original feature contributions are appended to predictions.

Returns:

bool –

includes_shap_values_for_transformed_features `property` ¶

includes_shap_values_for_transformed_features: bool

Whether transformed feature contributions are appended to predictions.

Returns:

bool –

jobs `property` ¶

jobs: Sequence[ServerJob]

Monitoring jobs.

Returns:

Sequence[ServerJob] –

keys `property` ¶

keys: dict[str, str]

Dictionary of unique IDs for the related entities:

Dataset –

The unique ID of the dataset used to make predictions.
Experiment –

The unique ID of the experiment used to make predictions.
Prediction –

The unique ID of the predictions.

Returns:

dict[str, str] –

used_fast_approx_for_shap_values `property` ¶

used_fast_approx_for_shap_values: bool | None

Whether approximation was used to calculate prediction contributions.

Returns:

bool | None –

is_complete ¶

is_complete() -> bool

Whether all jobs have been completed successfully.

Returns:

bool –

True if all jobs have been completed successfully, otherwise False.

is_running ¶

is_running() -> bool

Whether one or more jobs have been scheduled or are running, finishing, or syncing.

Returns:

bool –

True if one or more jobs have not completed yet, otherwise False.

result ¶

result(silent: bool = False) -> Prediction

Waits for the job to complete.

Parameters:

silent (bool, default: False ) –

If True, do not display status updates.

Returns:

Prediction –

The Prediction job results.

status ¶

status(verbose: int = 0) -> list[str]

Returns the statuses of all jobs.

Parameters:

verbose (int, default: 0 ) –
- 0: A short description.
- 1: A short description with a progress percentage.
- 2: A detailed description with a progress percentage.

Returns:

list[str] –

Current statuses of all jobs.
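
While result() already blocks until the jobs finish, is_running(), status(), and is_complete() can be combined into a custom polling loop, for example to bound the waiting time. The following is a sketch only. The helper wait_for_jobs and its poll_seconds and max_polls parameters are hypothetical.

```python
from __future__ import annotations

import time
from typing import Any


def wait_for_jobs(jobs: Any, poll_seconds: float = 5.0, max_polls: int = 120) -> bool:
    """Poll until no job is running (or polls run out); return is_complete()."""
    for _ in range(max_polls):
        if not jobs.is_running():
            break
        # verbose=1 adds a progress percentage to each short description.
        print("; ".join(jobs.status(verbose=1)))
        time.sleep(poll_seconds)
    return jobs.is_complete()
```

If the polls run out while jobs are still running, the function returns False, because is_complete() is True only when all jobs have completed successfully.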

Prediction ¶

Interact with predictions from the Driverless AI server.

file_paths `property` ¶

file_paths: list[str]

Paths to the prediction CSV files on the server.

Returns:

list[str] –

included_dataset_columns `property` ¶

included_dataset_columns: list[str]

Columns from the dataset that are appended to predictions.

Returns:

list[str] –

includes_labels `property` ¶

includes_labels: bool

Determines whether classification labels are appended to predictions.

Returns:

bool –

includes_raw_outputs `property` ¶

includes_raw_outputs: bool

Determines whether predictions as margins (in link space) were appended to predictions.

Returns:

bool –

includes_shap_values_for_original_features `property` ¶

includes_shap_values_for_original_features: bool

Determines whether original feature contributions are appended to predictions.

Returns:

bool –

includes_shap_values_for_transformed_features `property` ¶

includes_shap_values_for_transformed_features: bool

Determines whether transformed feature contributions are appended to predictions.

Returns:

bool –

keys `property` ¶

keys: dict[str, str]

Dictionary of unique IDs for entities related to the prediction:

dataset: The unique ID of the dataset used to make predictions.
experiment: The unique ID of the experiment used to make predictions.
prediction: The unique ID of the predictions.

Returns:

dict[str, str] –

used_fast_approx_for_shap_values `property` ¶

used_fast_approx_for_shap_values: bool | None

Whether approximation was used to calculate prediction contributions.

Returns:

bool | None –

download ¶

download(
    dst_dir: str = ".",
    dst_file: str | None = None,
    file_system: AbstractFileSystem | None = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Downloads the predictions of the experiment in CSV format.

Parameters:

dst_dir (str, default: '.' ) –

The path to the directory where the CSV file will be saved.
dst_file (str | None, default: None ) –

The name of the CSV file (overrides default file name).
file_system (AbstractFileSystem | None, default: None ) –

FSSPEC based file system to download to, instead of the local file system.
overwrite (bool, default: False ) –

Overwrite the existing file.
timeout (float, default: 30 ) –

Connection timeout in seconds.
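
When pandas is not available, the downloaded CSV can be read with the standard library instead of calling to_pandas(). This sketch assumes that the string returned by download() is a path to the downloaded file that can be opened from the current working directory, which the reference above does not state explicitly. The helper download_rows is hypothetical.

```python
from __future__ import annotations

import csv
import os
from typing import Any


def download_rows(prediction: Any, dst_dir: str = ".") -> list[dict[str, str]]:
    """Download the prediction CSV and parse it into a list of row dictionaries."""
    os.makedirs(dst_dir, exist_ok=True)
    # Assumption: the return value is the path of the downloaded CSV file.
    path = prediction.download(dst_dir=dst_dir, overwrite=True)
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

All values are returned as strings, so numeric columns need an explicit conversion such as float(row["some_column"]), where the column name depends on the experiment.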

to_pandas ¶

to_pandas() -> DataFrame

Transfers the predictions to a local Pandas DataFrame.

Transformation ¶

Interact with transformed data from the Driverless AI server.

file_path `property` ¶

file_path: str

Path to the transformed CSV file on the server.

Returns:

str –

included_dataset_columns `property` ¶

included_dataset_columns: list[str]

Columns from the dataset that are appended to transformed data.

Returns:

list[str] –

includes_labels `property` ¶

includes_labels: bool

Determines whether classification labels are appended to transformed data.

Returns:

bool –

keys `property` ¶

keys: dict[str, str]

Dictionary of unique IDs for entities related to the transformed data:

dataset: The unique ID of the dataset used to make predictions.
experiment: The unique ID of the experiment used to make predictions.
prediction: The unique ID of the predictions.

Returns:

dict[str, str] –

download ¶

download(
    dst_dir: str = ".",
    dst_file: str | None = None,
    file_system: AbstractFileSystem | None = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Downloads a CSV of transformed data.

Parameters:

dst_dir (str, default: '.' ) –

The path to the directory where the CSV file will be saved.
dst_file (str | None, default: None ) –

The name of the CSV file (overrides default file name).
file_system (AbstractFileSystem | None, default: None ) –

FSSPEC based file system to download to, instead of the local file system.
overwrite (bool, default: False ) –

Overwrite the existing file.
timeout (float, default: 30 ) –

Connection timeout in seconds.

to_pandas ¶

to_pandas() -> DataFrame

Transfers the transformed data to a local Pandas DataFrame.

TransformationJob ¶

Monitor the creation of a data transformation in the Driverless AI server.

included_dataset_columns `property` ¶

included_dataset_columns: list[str]

Columns from the dataset that are appended to transformed data.

Returns:

list[str] –

includes_labels `property` ¶

includes_labels: bool

Determines whether classification labels are appended to transformed data.

Returns:

bool –

key `property` ¶

key: str

Universally unique key of the entity.

Returns:

str –

keys `property` ¶

keys: dict[str, str]

Dictionary of unique IDs for the related entities:

Dataset –

The unique ID of the dataset used to make predictions.
Experiment –

The unique ID of the experiment used to make predictions.
Prediction –

The unique ID of the predictions.

Returns:

dict[str, str] –

name `property` ¶

name: str

Name of the entity.

Returns:

str –

is_complete ¶

is_complete() -> bool

Whether the job has been completed successfully.

Returns:

bool –

True if the job has been completed successfully, otherwise False.

is_running ¶

is_running() -> bool

Whether the job has been scheduled or is running, finishing, or syncing.

Returns:

bool –

True if the job has not completed yet, otherwise False.

result ¶

result(silent: bool = False) -> Transformation

Waits for the job to complete, then returns self.

Parameters:

silent (bool, default: False ) –

If True, do not display status updates.

status ¶

status(verbose: int = None) -> str

Returns a short description of the job status.

FitAndTransformation ¶

Interact with fit and transformed data from the Driverless AI server.

fold_column `property` ¶

fold_column: str

Column that creates the stratified validation split.

Returns:

str –

seed `property` ¶

seed: int

Random seed used to initialize the random number generator.

Returns:

int –

test_dataset `property` ¶

test_dataset: Dataset | None

Test dataset used for this transformation.

Returns:

Dataset | None –

training_dataset `property` ¶

training_dataset: Dataset

Training dataset used for this transformation.

Returns:

Dataset –

validation_dataset `property` ¶

validation_dataset: Dataset | None

Validation dataset used for this transformation.

Returns:

Dataset | None –

validation_split_fraction `property` ¶

validation_split_fraction: float

Fraction of data used for validation.

Returns:

float –

download_transformed_test_dataset ¶

download_transformed_test_dataset(
    dst_dir: str = ".",
    dst_file: str | None = None,
    file_system: AbstractFileSystem | None = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Download fit and transformed test dataset in CSV format.

Parameters:

dst_dir (str, default: '.' ) –

The path to the directory where the CSV file will be saved.
dst_file (str | None, default: None ) –

The name of the CSV file (overrides default file name).
file_system (AbstractFileSystem | None, default: None ) –

FSSPEC based file system to download to, instead of the local file system.
overwrite (bool, default: False ) –

Overwrite the existing file.
timeout (float, default: 30 ) –

Connection timeout in seconds.

download_transformed_training_dataset ¶

download_transformed_training_dataset(
    dst_dir: str = ".",
    dst_file: str | None = None,
    file_system: AbstractFileSystem | None = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Download fit and transformed training dataset in CSV format.

Parameters:

dst_dir (str, default: '.' ) –

The path to the directory where the CSV file will be saved.
dst_file (str | None, default: None ) –

The name of the CSV file (overrides default file name).
file_system (AbstractFileSystem | None, default: None ) –

FSSPEC based file system to download to, instead of the local file system.
overwrite (bool, default: False ) –

Overwrite the existing file.
timeout (float, default: 30 ) –

Connection timeout in seconds.

download_transformed_validation_dataset ¶

download_transformed_validation_dataset(
    dst_dir: str = ".",
    dst_file: str | None = None,
    file_system: AbstractFileSystem | None = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Download fit and transformed validation dataset in CSV format.

Parameters:

dst_dir (str, default: '.' ) –

The path to the directory where the CSV file will be saved.
dst_file (str | None, default: None ) –

The name of the CSV file (overrides default file name).
file_system (AbstractFileSystem | None, default: None ) –

FSSPEC based file system to download to, instead of the local file system.
overwrite (bool, default: False ) –

Overwrite the existing file.
timeout (float, default: 30 ) –

Connection timeout in seconds.
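
The download methods can be driven by the dataset properties of the same object. The sketch below assumes that a transformed test file exists only when test_dataset is not None. The validation download is left out because the reference does not say how it behaves when validation_dataset is None. The helper download_train_and_test is hypothetical.

```python
from __future__ import annotations

from typing import Any


def download_train_and_test(fit: Any, dst_dir: str = ".") -> dict[str, str]:
    """Download the transformed training data and, if present, the test data."""
    paths = {
        "training": fit.download_transformed_training_dataset(
            dst_dir=dst_dir, overwrite=True
        )
    }
    # Assumption: without a test dataset there is nothing to download.
    if fit.test_dataset is not None:
        paths["test"] = fit.download_transformed_test_dataset(
            dst_dir=dst_dir, overwrite=True
        )
    return paths
```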

FitAndTransformationJob ¶

key `property` ¶

key: str

Universally unique key of the entity.

Returns:

str –

name `property` ¶

name: str

Name of the entity.

Returns:

str –

is_complete ¶

is_complete() -> bool

Whether the job has been completed successfully.

Returns:

bool –

True if the job has been completed successfully, otherwise False.

is_running ¶

is_running() -> bool

Whether the job has been scheduled or is running, finishing, or syncing.

Returns:

bool –

True if the job has not completed yet, otherwise False.

result ¶

result(silent: bool = False) -> FitAndTransformation

Wait for the job to complete, then return self.

Parameters:

silent (bool, default: False ) –

If True, do not display status updates.

status ¶

status(verbose: int = 0) -> str

Returns the status of the job.

Parameters:

verbose (int, default: 0 ) –
- 0: A short description.
- 1: A short description with a progress percentage.
- 2: A detailed description with a progress percentage.

Returns:

str –

Current status of the job.
