Experiment

Experiments

Interact with experiments on the Driverless AI server.

create

create(
    train_dataset: Dataset,
    target_column: Optional[str],
    task: str,
    force: bool = False,
    name: str = None,
    **kwargs: Any
) -> Experiment

Launch an experiment on the Driverless AI server and wait for it to complete before returning.

Parameters:

  • train_dataset (Dataset) –

    Dataset object.

  • target_column (Optional[str]) –

    Name of the target column in train_dataset (pass None if task is 'unsupervised').

  • task (str) –

    One of 'regression', 'classification', or 'unsupervised'.

  • force (bool, default: False ) –

    Create a new experiment even if an experiment with the same name already exists.

  • name (str, default: None ) –

    The display name for the experiment.

Other Parameters:

  • accuracy (int) –

    Accuracy setting [1-10].

  • time (int) –

    Time setting [1-10].

  • interpretability (int) –

    Interpretability setting [1-10].

  • scorer (Union[str, ScorerRecipe]) –

    Metric to optimize for.

  • models (Union[str, ModelRecipe]) –

    Limit experiments to these models.

  • transformers (Union[str, TransformerRecipe]) –

    Limit experiments to these transformers.

  • validation_dataset (Dataset) –

    Dataset object.

  • test_dataset (Dataset) –

    Dataset object.

  • weight_column (str) –

    Name of the column in train_dataset.

  • fold_column (str) –

    Name of the column in train_dataset.

  • time_column (str) –

    Name of the column in train_dataset that contains the time ordering for time series problems.

  • time_groups_columns (List[str]) –

    List of column names, contributing to time ordering.

  • unavailable_at_prediction_time_columns (List[str]) –

    List of column names, which won't be present at prediction time.

  • drop_columns (List[str]) –

    List of column names that need to be dropped.

  • enable_gpus (bool) –

    Allow the usage of GPUs in the experiment.

  • reproducible (bool) –

    Set the experiment to be reproducible.

  • time_period_in_seconds (int) –

    The length of the time period in seconds, used in the timeseries problems.

  • num_prediction_periods (int) –

    Timeseries forecast horizon in time period units.

  • num_gap_periods (int) –

    The number of time periods after which the forecast starts.

  • config_overrides (str) –

    Driverless AI config overrides in TOML string format.

Note

Any expert setting can also be passed as a kwarg. To search possible expert settings for your server version, use experiments.search_expert_settings(search_term).
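A minimal sketch of a create call, not an official example. It assumes `client` is an already connected client object that exposes the `experiments` API described here and that `train` is a Dataset on the server. The helper names `dial_settings` and `launch_experiment` are hypothetical.

```python
from typing import Any, Dict


def dial_settings(accuracy: int, time: int, interpretability: int) -> Dict[str, int]:
    """Validate the three 1-10 settings and return them as keyword arguments."""
    settings = {
        "accuracy": accuracy,
        "time": time,
        "interpretability": interpretability,
    }
    for name, value in settings.items():
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return settings


def launch_experiment(client: Any, train: Any, target_column: str, **extra: Any) -> Any:
    """Run a classification experiment and block until it completes.

    `client` is assumed to be a connected client exposing `experiments.create`.
    """
    kwargs = dial_settings(accuracy=5, time=3, interpretability=7)
    kwargs.update(extra)  # e.g. scorer, drop_columns, config_overrides
    return client.experiments.create(
        train_dataset=train,
        target_column=target_column,
        task="classification",
        **kwargs,
    )
```

Extra keyword arguments such as `scorer` or `drop_columns` are forwarded unchanged, so expert settings can be passed the same way.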

Returns:

create_async

create_async(
    train_dataset: Dataset,
    target_column: Optional[str],
    task: str,
    force: bool = False,
    name: str = None,
    **kwargs: Any
) -> Experiment

Launch an experiment on the Driverless AI server and return an Experiment object to track the experiment status.

Parameters:

  • train_dataset (Dataset) –

    Dataset object.

  • target_column (Optional[str]) –

    The name of the target column in train_dataset (pass None if task is 'unsupervised').

  • task (str) –

    One of 'regression', 'classification', or 'unsupervised'.

  • force (bool, default: False ) –

    Create a new experiment even if an experiment with the same name already exists.

  • name (str, default: None ) –

    The display name for the experiment.

Other Parameters:

  • accuracy (int) –

    Accuracy setting [1-10].

  • time (int) –

    Time setting [1-10].

  • interpretability (int) –

    Interpretability setting [1-10].

  • scorer (Union[str, ScorerRecipe]) –

    Metric to optimize for.

  • models (Union[str, ModelRecipe]) –

    Limit experiments to these models.

  • transformers (Union[str, TransformerRecipe]) –

    Limit experiments to these transformers.

  • validation_dataset (Dataset) –

    Dataset object.

  • test_dataset (Dataset) –

    Dataset object.

  • weight_column (str) –

    Name of the column in train_dataset.

  • fold_column (str) –

    Name of the column in train_dataset.

  • time_column (str) –

    Name of the column in train_dataset that contains the time ordering for time series problems.

  • time_groups_columns (List[str]) –

    List of column names, contributing to time ordering.

  • unavailable_at_prediction_time_columns (List[str]) –

    List of column names, which won't be present at prediction time.

  • drop_columns (List[str]) –

    List of column names that need to be dropped.

  • enable_gpus (bool) –

    Allow the usage of GPUs in the experiment.

  • reproducible (bool) –

    Set the experiment to be reproducible.

  • time_period_in_seconds (int) –

    The length of the time period in seconds, used in the timeseries problems.

  • num_prediction_periods (int) –

    Timeseries forecast horizon in time period units.

  • num_gap_periods (int) –

    The number of time periods after which the forecast starts.

  • config_overrides (str) –

    Driverless AI config overrides in TOML string format.

Note

Any expert setting can also be passed as a kwarg. To search possible expert settings for your server version, use experiments.search_expert_settings(search_term).
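A possible way to track the returned Experiment, sketched under the assumption that it offers the `is_running()`, `is_complete()`, and `status()` methods documented for Experiment below. The helper name `wait_for_experiment` and the injectable `sleep` argument are hypothetical conveniences, not part of the client.

```python
import time
from typing import Any, Callable


def wait_for_experiment(
    experiment: Any,
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the experiment stops running; return True if it completed."""
    while experiment.is_running():
        # verbose=1 adds a progress percentage to the short description.
        print(experiment.status(verbose=1))
        sleep(poll_seconds)
    return experiment.is_complete()
```

When no custom progress handling is needed, calling `result()` on the returned Experiment waits for completion instead.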

Returns:

get

get(key: str) -> Experiment

Get an Experiment object corresponding to an experiment on the Driverless AI server. If the experiment only exists on H2O.ai Storage, it will be imported to the server first.

Parameters:

  • key (str) –

    Driverless AI server's unique ID for the experiment.

Returns:

get_by_name

get_by_name(name: str) -> Optional[Experiment]

Get the experiment specified by the name.

Parameters:

  • name (str) –

    Name of the experiment.
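Because the return type is Optional, callers should handle None. A minimal sketch, assuming `client` is a connected client exposing this `experiments` API and that None means no experiment with that name was found; the helper name `get_or_create` is hypothetical.

```python
from typing import Any


def get_or_create(client: Any, name: str, **create_kwargs: Any) -> Any:
    """Return the experiment called `name`, creating it only if none exists."""
    existing = client.experiments.get_by_name(name)
    if existing is not None:
        return existing
    # Remaining keyword arguments are those accepted by experiments.create().
    return client.experiments.create(name=name, **create_kwargs)
```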

Returns:

Beta API

A beta API that is subject to future changes.

gui

gui() -> Hyperlink

Get the full URL for the experiments page on the Driverless AI server.

Returns:

import_dai_file

import_dai_file(
    path: str, file_system: Optional[AbstractFileSystem] = None
) -> Experiment

Import a DAI file to the Driverless AI server and return a corresponding Experiment object.

Parameters:

  • path (str) –

    Path to the .dai file.

  • file_system (Optional[AbstractFileSystem], default: None ) –

    The FSSPEC based file system to download from, instead of the local file system.

Returns:

leaderboard

leaderboard(
    train_dataset: Dataset,
    target_column: Optional[str],
    task: str,
    force: bool = False,
    name: str = None,
    **kwargs: Any
) -> Project

Launch an experiment leaderboard on the Driverless AI server and return a project object to track experiment statuses.

Parameters:

  • train_dataset (Dataset) –

    Dataset object.

  • target_column (Optional[str]) –

    The name of the target column in train_dataset (pass None if task is 'unsupervised').

  • task (str) –

    One of 'regression', 'classification', or 'unsupervised'.

  • force (bool, default: False ) –

    Create a new project even if a project with the same name already exists.

  • name (str, default: None ) –

    The display name for the project.

Other Parameters:

  • accuracy (int) –

    Accuracy setting [1-10].

  • time (int) –

    Time setting [1-10].

  • interpretability (int) –

    Interpretability setting [1-10].

  • scorer (Union[str, ScorerRecipe]) –

    Metric to optimize for.

  • models (Union[str, ModelRecipe]) –

    Limit experiments to these models.

  • transformers (Union[str, TransformerRecipe]) –

    Limit experiments to these transformers.

  • validation_dataset (Dataset) –

    Dataset object.

  • test_dataset (Dataset) –

    Dataset object.

  • weight_column (str) –

    Name of the column in train_dataset.

  • fold_column (str) –

    Name of the column in train_dataset.

  • time_column (str) –

    Name of the column in train_dataset that contains the time ordering for time series problems.

  • time_groups_columns (List[str]) –

    List of column names, contributing to time ordering.

  • unavailable_at_prediction_time_columns (List[str]) –

    List of column names, which won't be present at prediction time.

  • drop_columns (List[str]) –

    List of column names that need to be dropped.

  • enable_gpus (bool) –

    Allow the usage of GPUs in the experiment.

  • reproducible (bool) –

    Set the experiment to be reproducible.

  • time_period_in_seconds (int) –

    The length of the time period in seconds, used in the timeseries problems.

  • num_prediction_periods (int) –

    Timeseries forecast horizon in time period units.

  • num_gap_periods (int) –

    The number of time periods after which the forecast starts.

  • config_overrides (str) –

    Driverless AI config overrides in TOML string format.

Note

Any expert setting can also be passed as a kwarg. To search possible expert settings for your server version, use experiments.search_expert_settings(search_term).

Returns:

list

list(start_index: int = 0, count: int = None) -> Sequence[Experiment]

Return a list of Experiment objects available to the user.

Parameters:

  • start_index (int, default: 0 ) –

    The index number on the Driverless AI server of the first experiment in the list.

  • count (int, default: None ) –

    The number of experiments to request from the Driverless AI server.
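The two parameters allow paging through experiments. A minimal sketch, assuming `client` is a connected client exposing this `experiments` API and that a page shorter than `count` marks the end of the listing; the helper name `iter_experiments` is hypothetical.

```python
from typing import Any, Iterator


def iter_experiments(client: Any, page_size: int = 100) -> Iterator[Any]:
    """Yield every experiment visible to the user, one page at a time."""
    start = 0
    while True:
        page = client.experiments.list(start_index=start, count=page_size)
        if len(page) == 0:
            return
        yield from page
        if len(page) < page_size:
            return  # a short page is assumed to be the last one
        start += len(page)
```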

Returns:

preview

preview(
    train_dataset: Dataset,
    target_column: Optional[str],
    task: str,
    force: Optional[bool] = None,
    name: Optional[str] = None,
    **kwargs: Any
) -> None

Print a preview of the experiment for the given settings.

Parameters:

  • train_dataset (Dataset) –

    Dataset object.

  • target_column (Optional[str]) –

    The name of the target column in train_dataset (pass None if task is 'unsupervised').

  • task (str) –

    One of 'regression', 'classification', or 'unsupervised'.

  • force (Optional[bool], default: None ) –

    Ignored (preview accepts the same arguments as create).

  • name (Optional[str], default: None ) –

    Ignored (preview accepts the same arguments as create).

Other Parameters:

  • accuracy (int) –

    Accuracy setting [1-10].

  • time (int) –

    Time setting [1-10].

  • interpretability (int) –

    Interpretability setting [1-10].

  • scorer (Union[str, ScorerRecipe]) –

    Metric to optimize for.

  • models (Union[str, ModelRecipe]) –

    Limit experiments to these models.

  • transformers (Union[str, TransformerRecipe]) –

    Limit experiments to these transformers.

  • validation_dataset (Dataset) –

    Dataset object.

  • test_dataset (Dataset) –

    Dataset object.

  • weight_column (str) –

    Name of the column in train_dataset.

  • fold_column (str) –

    Name of the column in train_dataset.

  • time_column (str) –

    Name of the column in train_dataset that contains the time ordering for time series problems.

  • time_groups_columns (List[str]) –

    List of column names, contributing to time ordering.

  • unavailable_at_prediction_time_columns (List[str]) –

    List of column names, which won't be present at prediction time.

  • drop_columns (List[str]) –

    List of column names that need to be dropped.

  • enable_gpus (bool) –

    Allow the usage of GPUs in the experiment.

  • reproducible (bool) –

    Set the experiment to be reproducible.

  • time_period_in_seconds (int) –

    The length of the time period in seconds, used in the timeseries problems.

  • num_prediction_periods (int) –

    Timeseries forecast horizon in time period units.

  • num_gap_periods (int) –

    The number of time periods after which the forecast starts.

  • config_overrides (str) –

    Driverless AI config overrides in TOML string format.

Note

Any expert setting can also be passed as a kwarg. To search possible expert settings for your server version, use experiments.search_expert_settings(search_term).
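Since preview accepts the same arguments as create, one settings dictionary can drive both calls. A minimal sketch, assuming `client` is a connected client exposing this `experiments` API; the helper name `preview_then_create` and its `confirm` flag are hypothetical.

```python
from typing import Any


def preview_then_create(client: Any, confirm: bool, **settings: Any) -> Any:
    """Print the experiment preview, then launch only if `confirm` is True."""
    client.experiments.preview(**settings)  # prints the preview, returns None
    if not confirm:
        return None
    return client.experiments.create(**settings)
```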

search_expert_settings

search_expert_settings(
    search_term: str, show_description: bool = False
) -> None

Search expert settings and print results. Useful when looking for kwargs to use when creating experiments.

Parameters:

  • search_term (str) –

    Term to search for (case-insensitive).

  • show_description (bool, default: False ) –

    Include description in results.

Experiment

Interact with an experiment on the Driverless AI server.

artifacts property

Interact with artifacts that are created when the experiment completes.

Returns:

creation_timestamp property

creation_timestamp: float

Creation timestamp in seconds since the epoch (POSIX timestamp).

Returns:

datasets property

datasets: Dict[str, Optional[Dataset]]

Dictionary of train_dataset, validation_dataset, and test_dataset used for the experiment.

Returns:

is_deprecated property

is_deprecated: bool

True if experiment was created by an old version of Driverless AI and is no longer fully compatible with the current server version.

Returns:

key property

key: str

Universally unique key of the entity.

Returns:

log property

Interact with experiment logs.

Returns:

metric_plots property

Metric plots of this experiment.

Beta API

A beta API that is subject to future changes.

Returns:

name property

name: str

Name of the entity.

Returns:

run_duration property

run_duration: Optional[float]

Run duration in seconds.
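Both creation_timestamp and run_duration are plain floats, so they convert directly to standard-library time types. A minimal sketch; the helper name `experiment_timing` is hypothetical, and interpreting the timestamp in UTC is a choice made here, not something the client prescribes.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional, Tuple


def experiment_timing(experiment: Any) -> Tuple[datetime, Optional[timedelta]]:
    """Convert the raw float properties into datetime and timedelta values."""
    created = datetime.fromtimestamp(experiment.creation_timestamp, tz=timezone.utc)
    duration = experiment.run_duration  # Optional[float], may be None
    return created, None if duration is None else timedelta(seconds=duration)
```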

Returns:

settings property

settings: Dict[str, Any]

Experiment settings.

Returns:

size property

size: int

Size in bytes of all of the experiment's files on the Driverless AI server.

Returns:

summary property

summary: Optional[str]

An experiment summary that provides a brief overview of the experiment setup and results.

Returns:

abort

abort() -> None

Terminate experiment immediately and only generate logs.

autodoc

autodoc() -> AutoDoc

Returns the autodoc generated for this experiment. If one has not been generated yet, creates a new autodoc and returns it.

Returns:

compare_settings_with

compare_settings_with(experiment_to_compare_with: Experiment) -> Table

Compares the settings of the current experiment with another given experiment.

Parameters:

  • experiment_to_compare_with (Experiment) –

    The experiment to compare the settings with.

Returns:

compare_setup_with

compare_setup_with(experiment_to_compare_with: Experiment) -> Dict[str, Table]

Compares the setup of the current experiment with another given experiment.

Parameters:

  • experiment_to_compare_with (Experiment) –

    The experiment to compare the setups with.

Returns:

delete

delete() -> None

Permanently delete the experiment from the Driverless AI server.

export_dai_file

export_dai_file(
    dst_dir: str = ".",
    dst_file: Optional[str] = None,
    file_system: Optional[AbstractFileSystem] = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Export the experiment from Driverless AI server in DAI format.

Parameters:

  • dst_dir (str, default: '.' ) –

    The path to the directory where the DAI file will be saved.

  • dst_file (Optional[str], default: None ) –

    The name of the DAI file (overrides default file name).

  • file_system (Optional[AbstractFileSystem], default: None ) –

    The FSSPEC based file system to download to, instead of the local file system.

  • overwrite (bool, default: False ) –

    Overwrite the existing file.

  • timeout (float, default: 30 ) –

    Connection timeout in seconds.
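Combined with experiments.import_dai_file, the export can move an experiment between servers. A minimal sketch, assuming the returned string is the path of the saved DAI file and that `target_client` is a connected client for the destination server; the helper name `copy_experiment` is hypothetical.

```python
from typing import Any


def copy_experiment(experiment: Any, target_client: Any, dst_dir: str = ".") -> Any:
    """Export an experiment as a DAI file and import it on another server."""
    # Assumption: export_dai_file returns the path of the file it wrote.
    path = experiment.export_dai_file(dst_dir=dst_dir, overwrite=True)
    return target_client.experiments.import_dai_file(path)
```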

Returns:

export_triton_model

export_triton_model(
    deploy_predictions: bool = True,
    deploy_shapley: bool = False,
    deploy_original_shapley: bool = False,
    enable_high_concurrency: bool = False,
) -> TritonModelArtifact

Exports the model of this experiment as a Triton model.

Parameters:

  • deploy_predictions (bool, default: True ) –

    Whether to deploy model predictions.

  • deploy_shapley (bool, default: False ) –

    Whether to deploy model Shapley values.

  • deploy_original_shapley (bool, default: False ) –

    Whether to deploy Shapley values for the original features.

  • enable_high_concurrency (bool, default: False ) –

    Whether to enable handling multiple requests at once.

Beta API

A beta API that is subject to future changes.

Returns:

  • TritonModelArtifact

    A Triton model.

finish

finish() -> None

Finish experiment by jumping to final pipeline training and generating experiment artifacts.

fit_and_transform

fit_and_transform(
    training_dataset: Dataset,
    validation_split_fraction: float = 0,
    seed: int = 1234,
    fold_column: str = None,
    test_dataset: Dataset = None,
    validation_dataset: Dataset = None,
) -> FitAndTransformation

Refit the data transformation pipeline on a dataset and transform it, then return a FitAndTransformation object.

Parameters:

  • training_dataset (Dataset) –

    The dataset to be used for refitting the data transformation pipeline.

  • validation_split_fraction (float, default: 0 ) –

    The fraction of data used for validation.

  • seed (int, default: 1234 ) –

    A random seed to use to start a random generator.

  • fold_column (str, default: None ) –

    The column used to create a stratified validation split.

  • test_dataset (Dataset, default: None ) –

    The dataset to be used for final testing.

  • validation_dataset (Dataset, default: None ) –

    The dataset to be used for tuning model parameters.

Returns:

fit_and_transform_async

fit_and_transform_async(
    training_dataset: Dataset,
    validation_split_fraction: float = 0,
    seed: int = 1234,
    fold_column: str = None,
    test_dataset: Dataset = None,
    validation_dataset: Dataset = None,
) -> FitAndTransformationJob

Launch a fit-and-transform job on a dataset and return a FitAndTransformationJob object to track the status.

Parameters:

  • training_dataset (Dataset) –

    The dataset to be used for refitting the data transformation pipeline.

  • validation_split_fraction (float, default: 0 ) –

    The fraction of data used for validation.

  • seed (int, default: 1234 ) –

    A random seed to use to start a random generator.

  • fold_column (str, default: None ) –

    The column used to create a stratified validation split.

  • test_dataset (Dataset, default: None ) –

    The dataset to be used for final testing.

  • validation_dataset (Dataset, default: None ) –

    The dataset to be used for tuning model parameters.

Returns:

get_linked_projects

get_linked_projects() -> List[Project]

Get all the projects that the current experiment belongs to.

Driverless AI version requirement

Requires Driverless AI server 1.10.5 or higher.

Returns:

gui

gui() -> Hyperlink

Get the full URL for the experiment's page on the Driverless AI server.

Returns:

is_complete

is_complete() -> bool

Whether the job has been completed successfully.

Returns:

  • bool

    True if the job has been completed successfully, otherwise False.

is_running

is_running() -> bool

Whether the job has been scheduled or is running, finishing, or syncing.

Returns:

  • bool

    True if the job has not completed yet, otherwise False.

metrics

metrics() -> Dict[str, Union[str, float]]

Return dictionary of experiment scorer metrics and AUC metrics, if available.
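The returned dictionary mixes string and float values, so numeric scores usually need to be separated out before comparison. A minimal sketch; the helper name `numeric_metrics` is hypothetical and no particular metric keys are assumed.

```python
from typing import Any, Dict


def numeric_metrics(experiment: Any) -> Dict[str, float]:
    """Keep only the metrics whose values are numbers."""
    return {
        name: value
        for name, value in experiment.metrics().items()
        if isinstance(value, (int, float))
    }
```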

Returns:

notifications

notifications() -> List[Dict[str, str]]

Return list of experiment notification dictionaries.

Returns:

predict

predict(
    dataset: Union[Dataset, DataFrame],
    enable_mojo: bool = True,
    include_columns: Optional[List[str]] = None,
    include_labels: Optional[bool] = None,
    include_raw_outputs: Optional[bool] = None,
    include_shap_values_for_original_features: Optional[bool] = None,
    include_shap_values_for_transformed_features: Optional[bool] = None,
    use_fast_approx_for_shap_values: Optional[bool] = None,
) -> Prediction

Predict on a dataset, then return a Prediction object.

Parameters:

  • dataset (Union[Dataset, DataFrame]) –

    A Dataset or a Pandas DataFrame to make predictions on.

  • enable_mojo (bool, default: True ) –

    Use MOJO (if available) to make predictions.

  • include_columns (Optional[List[str]], default: None ) –

    The list of columns from the dataset to append to the prediction CSV.

  • include_labels (Optional[bool], default: None ) –

    Append labels in addition to probabilities for classification, ignored for regression.

  • include_raw_outputs (Optional[bool], default: None ) –

    Append predictions as margins (in link space) to the prediction CSV.

  • include_shap_values_for_original_features (Optional[bool], default: None ) –

    Append original feature contributions to the prediction CSV.

  • include_shap_values_for_transformed_features (Optional[bool], default: None ) –

    Append transformed feature contributions to the prediction CSV.

  • use_fast_approx_for_shap_values (Optional[bool], default: None ) –

    Speed up prediction contributions with approximation.
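The optional flags can be combined in a single call. A minimal sketch, assuming `experiment` is a completed Experiment and `dataset` is a Dataset on the server; the helper name `predict_with_contributions` is hypothetical.

```python
from typing import Any, List


def predict_with_contributions(experiment: Any, dataset: Any, keep_columns: List[str]) -> Any:
    """Predict and append labels, selected columns, and original-feature contributions."""
    return experiment.predict(
        dataset,
        include_columns=keep_columns,
        include_labels=True,
        include_shap_values_for_original_features=True,
        use_fast_approx_for_shap_values=True,
    )
```

Flags that are not passed keep their default of None.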

Returns:

predict_async

predict_async(
    dataset: Union[Dataset, DataFrame],
    enable_mojo: bool = True,
    include_columns: Optional[List[str]] = None,
    include_labels: Optional[bool] = None,
    include_raw_outputs: Optional[bool] = None,
    include_shap_values_for_original_features: Optional[bool] = None,
    include_shap_values_for_transformed_features: Optional[bool] = None,
    use_fast_approx_for_shap_values: Optional[bool] = None,
) -> PredictionJobs

Launch a prediction job on a dataset and return a PredictionJobs object to track the status.

Parameters:

  • dataset (Union[Dataset, DataFrame]) –

    A Dataset or a Pandas DataFrame to make predictions on.

  • enable_mojo (bool, default: True ) –

    Use MOJO (if available) to make predictions.

  • include_columns (Optional[List[str]], default: None ) –

    The list of columns from the dataset to append to the prediction CSV.

  • include_labels (Optional[bool], default: None ) –

    Append labels in addition to probabilities for classification, ignored for regression.

  • include_raw_outputs (Optional[bool], default: None ) –

    Append predictions as margins (in link space) to the prediction CSV.

  • include_shap_values_for_original_features (Optional[bool], default: None ) –

    Append original feature contributions to the prediction CSV.

  • include_shap_values_for_transformed_features (Optional[bool], default: None ) –

    Append transformed feature contributions to the prediction CSV.

  • use_fast_approx_for_shap_values (Optional[bool], default: None ) –

    Speed up prediction contributions with approximation.
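The asynchronous variant is useful when scoring several datasets. A minimal sketch that launches every job before waiting on any of them, assuming the returned PredictionJobs objects offer the `result()` method documented below; the helper name `predict_many` is hypothetical.

```python
from typing import Any, Dict


def predict_many(experiment: Any, datasets: Dict[str, Any]) -> Dict[str, Any]:
    """Launch one prediction job per dataset, then wait for all of them."""
    # Launch everything first so no job waits on another client-side.
    jobs = {name: experiment.predict_async(ds) for name, ds in datasets.items()}
    # result() blocks until the job completes; silent=True hides status updates.
    return {name: job.result(silent=True) for name, job in jobs.items()}
```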

Returns:

rename

rename(name: str) -> Experiment

Change experiment display name.

Parameters:

  • name (str) –

    New display name.

Returns:

result

result(silent: bool = False) -> Experiment

Wait for training to complete, then return self.

Parameters:

  • silent (bool, default: False ) –

    If True, do not display status updates.

Returns:

retrain

retrain(
    use_smart_checkpoint: bool = False,
    final_pipeline_only: bool = False,
    final_models_only: bool = False,
    **kwargs: Any
) -> Experiment

Create a new experiment using the same datasets and settings. Through kwargs it's possible to pass new datasets or overwrite settings.

Parameters:

  • use_smart_checkpoint (bool, default: False ) –

    Start the experiment from the last smart checkpoint.

  • final_pipeline_only (bool, default: False ) –

    Trains the final pipeline using smart checkpoint if available, otherwise uses default hyperparameters.

  • final_models_only (bool, default: False ) –

    Trains the final pipeline models (but not transformers) using smart checkpoint if available, otherwise uses default hyperparameters and transformers (overrides final_pipeline_only).

  • kwargs (Any, default: {} ) –

    Datasets and experiment settings as defined in experiments.create().
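A retrained experiment can be compared against its parent with compare_settings_with. A minimal sketch, assuming `experiment` is a completed Experiment; the helper name `retrain_and_compare` is hypothetical, and the overrides are any settings accepted by experiments.create().

```python
from typing import Any, Tuple


def retrain_and_compare(experiment: Any, **overrides: Any) -> Tuple[Any, Any]:
    """Retrain with overridden settings and return the new experiment and a settings diff."""
    new_experiment = experiment.retrain(use_smart_checkpoint=True, **overrides)
    return new_experiment, experiment.compare_settings_with(new_experiment)
```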

Returns:

retrain_async

retrain_async(
    use_smart_checkpoint: bool = False,
    final_pipeline_only: bool = False,
    final_models_only: bool = False,
    **kwargs: Any
) -> Experiment

Launch creation of a new experiment using the same datasets and settings. Through kwargs it's possible to pass new datasets or overwrite settings.

Parameters:

  • use_smart_checkpoint (bool, default: False ) –

    Start the experiment from the last smart checkpoint.

  • final_pipeline_only (bool, default: False ) –

    Trains the final pipeline using smart checkpoint if available, otherwise uses default hyperparameters.

  • final_models_only (bool, default: False ) –

    Trains the final pipeline models (but not transformers) using smart checkpoint if available, otherwise uses default hyperparameters and transformers (overrides final_pipeline_only).

  • kwargs (Any, default: {} ) –

    Datasets and experiment settings as defined in experiments.create().

Returns:

status

status(verbose: int = 0) -> str

Returns the status of the job.

Parameters:

  • verbose (int, default: 0 ) –
    • 0: A short description.
    • 1: A short description with a progress percentage.
    • 2: A detailed description with a progress percentage.

Returns:

  • str

    Current status of the job.

to_dict

to_dict() -> Union[Dict, object]

Dump experiment metadata to a Python dictionary.

Beta API

A beta API that is subject to future changes.

Returns:

transform

transform(
    dataset: Dataset,
    enable_mojo: bool = True,
    include_columns: Optional[List[str]] = None,
    include_labels: Optional[bool] = True,
) -> Transformation

Transform a dataset, then return a Transformation object.

Parameters:

  • dataset (Dataset) –

    A Dataset to be transformed.

  • enable_mojo (bool, default: True ) –

    Use MOJO (if available) to perform the transformation.

  • include_columns (Optional[List[str]], default: None ) –

    List of columns from the dataset to append to the prediction CSV.

  • include_labels (Optional[bool], default: True ) –

    Append labels in addition to probabilities for classification, ignored for regression.

Returns:

transform_async

transform_async(
    dataset: Dataset,
    enable_mojo: bool = True,
    include_columns: Optional[List[str]] = None,
    include_labels: Optional[bool] = None,
) -> TransformationJob

Launch a transform job on a dataset and return a TransformationJob object to track the status.

Parameters:

  • dataset (Dataset) –

    A Dataset to be transformed.

  • enable_mojo (bool, default: True ) –

    Use MOJO (if available) to perform the transformation.

  • include_columns (Optional[List[str]], default: None ) –

    List of columns from the dataset to append to the prediction CSV.

  • include_labels (Optional[bool], default: None ) –

    Append labels in addition to probabilities for classification, ignored for regression.

Driverless AI version requirement

Requires Driverless AI server 1.10.4.1 or higher.

Returns:

variable_importance

variable_importance(
    iteration: int = None, model_index: int = None
) -> Optional[Table]

Get variable importance of an iteration in a Table.

Parameters:

  • iteration (int, default: None ) –

    Zero-based index of the iteration of the experiment.

  • model_index (int, default: None ) –

    The zero-based index of the model that was generated in a particular iteration.

Returns:

ExperimentMetricPlots

Interact with the metric plots of an experiment in the Driverless AI server.

actual_vs_predicted_chart property

actual_vs_predicted_chart: Optional[Dict[str, Any]]

Actual vs predicted chart for the model.

Returns:

gains_chart property

gains_chart: Optional[Dict[str, Any]]

Cumulative gains chart for the model.

Returns:

ks_chart property

ks_chart: Optional[Dict[str, Any]]

Kolmogorov-Smirnov chart of the model.

Returns:

lift_chart property

lift_chart: Optional[Dict[str, Any]]

Lift chart of the model.

Returns:

prec_recall_curve property

prec_recall_curve: Optional[Dict[str, Any]]

Precision-recall curve of the model.

Returns:

residual_plot property

residual_plot: Optional[Dict[str, Any]]

Residual plot with LOESS curve of the model.

Returns:

roc_curve property

roc_curve: Optional[Dict[str, Any]]

ROC curve of the model.

Returns:

confusion_matrix

confusion_matrix(threshold: float = None) -> Optional[List[List[Any]]]

Confusion matrix of the model.

Parameters:

  • threshold (float, default: None ) –

    The threshold value.

Returns:

  • Optional[List[List[Any]]]

    A confusion matrix as a 2D list, or None if the model is not a classification model.

ExperimentArtifacts

Interact with files created by an experiment on the Driverless AI server.

file_paths property

file_paths: Dict[str, str]

Paths to artifact files on the server.

Returns:

create

create(artifact: str) -> None

(Re)build certain artifacts, if possible.

(Re)buildable artifacts:

  • 'autodoc'
  • 'mojo_pipeline'
  • 'python_pipeline'

Parameters:

  • artifact (str) –

    The name of the artifact to (re)build.

download

download(
    only: Union[str, List[str]] = None,
    dst_dir: str = ".",
    file_system: Optional[AbstractFileSystem] = None,
    include_columns: Optional[List[str]] = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> Dict[str, str]

Download experiment artifacts from the Driverless AI server. Returns a dictionary of relative paths for the downloaded artifacts.

Parameters:

  • only (Union[str, List[str]], default: None ) –

    The artifacts to download. Use experiment.artifacts.list() to see the available artifacts on the Driverless AI server.

  • dst_dir (str, default: '.' ) –

    The path to the directory where the experiment artifacts will be saved.

  • file_system (Optional[AbstractFileSystem], default: None ) –

    FSSPEC based file system to download to, instead of local file system.

  • include_columns (Optional[List[str]], default: None ) –

    The list of dataset columns to append to prediction CSVs.

  • overwrite (bool, default: False ) –

    Overwrite the existing file.

  • timeout (float, default: 30 ) –

    Connection timeout in seconds.
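Checking list() before downloading avoids requesting artifacts that do not exist. A minimal sketch, assuming `experiment` is a completed Experiment; the helper name `download_available` is hypothetical.

```python
from typing import Any, Dict, List


def download_available(experiment: Any, wanted: List[str], dst_dir: str = ".") -> Dict[str, str]:
    """Download only those requested artifacts that exist on the server."""
    available = set(experiment.artifacts.list())
    present = [name for name in wanted if name in available]
    if not present:
        return {}
    return experiment.artifacts.download(only=present, dst_dir=dst_dir)
```

For example, `download_available(experiment, ["autodoc", "mojo_pipeline"])` skips whichever of the two has not been built.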

Returns:

export

export(
    only: Optional[Union[str, List[str]]] = None,
    include_columns: Optional[List[str]] = None,
    **kwargs: Any
) -> Dict[str, str]

Export experiment artifacts from the Driverless AI server. Returns a dictionary of relative paths for the exported artifacts.

Parameters:

  • only (Optional[Union[str, List[str]]], default: None ) –

    The artifacts to export. Use experiment.artifacts.list() to see the available artifacts on the Driverless AI server.

  • include_columns (Optional[List[str]], default: None ) –

    The list of dataset columns to append to prediction CSVs.

Note

Export location is configured on the Driverless AI server.

Returns:

list

list() -> List[str]

List of experiment artifacts that exist on the Driverless AI server.

Returns:

ExperimentLog

Interact with experiment logs.

file_name property

file_name: str

Filename of the log file.

Returns:

download

download(
    archive: bool = True,
    dst_dir: str = ".",
    dst_file: Optional[str] = None,
    file_system: Optional[AbstractFileSystem] = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Download experiment logs from the Driverless AI server.

Parameters:

  • archive (bool, default: True ) –

    If available, it is recommended to download an archive that contains multiple log files and stack traces if any were created.

  • dst_dir (str, default: '.' ) –

    The path to the directory where the logs will be saved.

  • dst_file (Optional[str], default: None ) –

    The name of the log file (overrides default file name).

  • file_system (Optional[AbstractFileSystem], default: None ) –

    FSSPEC based file system to download to, instead of the local file system.

  • overwrite (bool, default: False ) –

    Overwrite the existing file.

  • timeout (float, default: 30 ) –

    Connection timeout in seconds.

Returns:

head

head(num_lines: int = 50) -> str

Returns the first n lines of the log file.

Parameters:

  • num_lines (int, default: 50 ) –

    Number of lines to retrieve.

Returns:

tail

tail(num_lines: int = 50) -> str

Returns the last n lines of the log file.

Parameters:

  • num_lines (int, default: 50 ) –

    Number of lines to retrieve.
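Because tail returns a single string, it can be split and filtered client-side. A minimal sketch, assuming `experiment.log` is the ExperimentLog of an Experiment; the helper name `tail_matching` is hypothetical, and the log line format is not assumed beyond being newline-separated text.

```python
from typing import Any, List


def tail_matching(experiment: Any, keyword: str, num_lines: int = 200) -> List[str]:
    """Return the lines among the last `num_lines` that contain `keyword` (case-insensitive)."""
    text = experiment.log.tail(num_lines=num_lines)
    needle = keyword.lower()
    return [line for line in text.splitlines() if needle in line.lower()]
```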

Returns:

PredictionJobs

Monitor the creation of predictions on the Driverless AI server.

included_dataset_columns property

included_dataset_columns: List[str]

Columns from the dataset that are appended to predictions.

Returns:

includes_labels property

includes_labels: bool

Determines whether classification labels are appended to predictions.

Returns:

includes_raw_outputs property

includes_raw_outputs: bool

Whether predictions as margins (in link space) are appended to predictions.

Returns:

includes_shap_values_for_original_features property

includes_shap_values_for_original_features: bool

Whether original feature contributions are appended to predictions.

Returns:

includes_shap_values_for_transformed_features property

includes_shap_values_for_transformed_features: bool

Whether transformed feature contributions are appended to predictions.

Returns:

jobs property

jobs: Sequence[ServerJob]

Monitoring jobs.

Returns:

keys property

keys: Dict[str, str]

Dictionary of unique IDs for the related entities:

  • Dataset

    The unique ID of the dataset used to make predictions.

  • Experiment

    The unique ID of the experiment used to make predictions.

  • Prediction

    The unique ID of the predictions.

Returns:

used_fast_approx_for_shap_values property

used_fast_approx_for_shap_values: Optional[bool]

Whether approximation was used to calculate prediction contributions.

Returns:

is_complete

is_complete() -> bool

Whether all jobs have been completed successfully.

Returns:

  • bool

    True if all jobs have been completed successfully, otherwise False.

is_running

is_running() -> bool

Whether one or more jobs have been scheduled or are running, finishing, or syncing.

Returns:

  • bool

    True if one or more jobs have not completed yet, otherwise False.

result

result(silent: bool = False) -> Prediction

Wait for all jobs to complete.

Parameters:

  • silent (bool, default: False ) –

    If True, do not display status updates.

Returns:

status

status(verbose: int = 0) -> List[str]

Returns the statuses of all jobs.

Parameters:

  • verbose (int, default: 0 ) –
    • 0: A short description.
    • 1: A short description with a progress percentage.
    • 2: A detailed description with a progress percentage.

Returns:

  • List[str]

    Current statuses of all jobs.
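
When blocking on `result()` is not convenient, the methods above can be combined into a polling loop. This is only a sketch: `wait_for_jobs` is a hypothetical helper, and the polling interval and retry limit are arbitrary assumptions.

```python
import time
from typing import List


def wait_for_jobs(jobs, poll_seconds: float = 5.0, max_polls: int = 120) -> List[str]:
    """Poll until no job is running or max_polls is reached; return the last statuses.

    `jobs` is assumed to expose is_running() and status(verbose=...) as
    documented above. wait_for_jobs is a hypothetical helper.
    """
    for _ in range(max_polls):
        if not jobs.is_running():
            break
        # verbose=1 adds a progress percentage to each short description.
        print(jobs.status(verbose=1))
        time.sleep(poll_seconds)
    return jobs.status(verbose=1)
```

Once the loop ends, `is_complete()` can be checked to tell successful completion apart from any other outcome.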

Prediction

Interact with predictions from the Driverless AI server.

file_paths property

file_paths: List[str]

Paths to the prediction CSV files on the server.

Returns:

included_dataset_columns property

included_dataset_columns: List[str]

Columns from the dataset that are appended to predictions.

Returns:

includes_labels property

includes_labels: bool

Determines whether classification labels are appended to predictions.

Returns:

includes_raw_outputs property

includes_raw_outputs: bool

Determines whether predictions as margins (in link space) were appended to predictions.

Returns:

includes_shap_values_for_original_features property

includes_shap_values_for_original_features: bool

Determines whether original feature contributions are appended to predictions.

Returns:

includes_shap_values_for_transformed_features property

includes_shap_values_for_transformed_features: bool

Determines whether transformed feature contributions are appended to predictions.

Returns:

keys property

keys: Dict[str, str]

Dictionary of unique IDs for entities related to the prediction:

  • dataset: The unique ID of the dataset used to make predictions.
  • experiment: The unique ID of the experiment used to make predictions.
  • prediction: The unique ID of the predictions.

Returns:

used_fast_approx_for_shap_values property

used_fast_approx_for_shap_values: Optional[bool]

Whether approximation was used to calculate prediction contributions.

Returns:
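
The boolean properties above can be collected into a quick summary of what a prediction CSV contains. A minimal sketch, assuming `prediction` exposes the documented properties; `enabled_outputs` and the short labels it returns are hypothetical names chosen for illustration.

```python
from typing import List


def enabled_outputs(prediction) -> List[str]:
    """List which optional outputs were appended to the predictions.

    `prediction` is assumed to expose the includes_* properties documented
    above. The short labels are illustrative, not client terminology.
    """
    flags = {
        "labels": prediction.includes_labels,
        "raw_outputs": prediction.includes_raw_outputs,
        "shap_original": prediction.includes_shap_values_for_original_features,
        "shap_transformed": prediction.includes_shap_values_for_transformed_features,
    }
    # Keep only the labels whose flag is truthy, in declaration order.
    return [name for name, enabled in flags.items() if enabled]
```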

download

download(
    dst_dir: str = ".",
    dst_file: Optional[str] = None,
    file_system: Optional[AbstractFileSystem] = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Download CSV of predictions.

Parameters:

  • dst_dir (str, default: '.' ) –

    The path to the directory where the CSV file will be saved.

  • dst_file (Optional[str], default: None ) –

    The name of the CSV file (overrides default file name).

  • file_system (Optional[AbstractFileSystem], default: None ) –

    FSSPEC-based file system to download to, instead of the local file system.

  • overwrite (bool, default: False ) –

    Overwrite the existing file.

  • timeout (float, default: 30 ) –

    Connection timeout in seconds.

Returns:
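
Because `overwrite` defaults to False, a caller may want to skip the download when the target file is already present locally. The sketch below shows one way to do that. `download_if_missing` is a hypothetical helper, and it assumes `download()` writes to `dst_dir/dst_file` on the local file system and returns a string, as the signature indicates.

```python
import os


def download_if_missing(entity, dst_dir: str, dst_file: str) -> str:
    """Download to dst_dir/dst_file unless that local file already exists.

    `entity` is assumed to expose download(dst_dir=..., dst_file=...,
    overwrite=...) as documented above. download_if_missing is hypothetical.
    """
    target = os.path.join(dst_dir, dst_file)
    if os.path.exists(target):
        # Reuse the existing file instead of contacting the server again.
        return target
    os.makedirs(dst_dir, exist_ok=True)
    return entity.download(dst_dir=dst_dir, dst_file=dst_file, overwrite=False)
```

The check uses the local file system only, so it does not apply when a custom `file_system` is passed.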

to_pandas

to_pandas() -> DataFrame

Transfer predictions to a local Pandas DataFrame.

Returns:

Transformation

Interact with transformed data from the Driverless AI server.

file_path property

file_path: str

Path to the transformed CSV file on the server.

Returns:

included_dataset_columns property

included_dataset_columns: List[str]

Columns from the dataset that are appended to transformed data.

Returns:

includes_labels property

includes_labels: bool

Determines whether classification labels are appended to transformed data.

Returns:

keys property

keys: Dict[str, str]

Dictionary of unique IDs for entities related to the transformed data:

  • dataset: The unique ID of the dataset used to make predictions.
  • experiment: The unique ID of the experiment used to make predictions.
  • prediction: The unique ID of the predictions.

Returns:

download

download(
    dst_dir: str = ".",
    dst_file: Optional[str] = None,
    file_system: Optional[AbstractFileSystem] = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Download CSV of transformed data.

Parameters:

  • dst_dir (str, default: '.' ) –

    The path to the directory where the CSV file will be saved.

  • dst_file (Optional[str], default: None ) –

    The name of the CSV file (overrides default file name).

  • file_system (Optional[AbstractFileSystem], default: None ) –

    FSSPEC-based file system to download to, instead of the local file system.

  • overwrite (bool, default: False ) –

    Overwrite the existing file.

  • timeout (float, default: 30 ) –

    Connection timeout in seconds.

Returns:

to_pandas

to_pandas() -> DataFrame

Transfer transformed data to a local Pandas DataFrame.

Returns:

TransformationJob

Monitor the creation of a data transformation on the Driverless AI server.

included_dataset_columns property

included_dataset_columns: List[str]

Columns from the dataset that are appended to transformed data.

Returns:

includes_labels property

includes_labels: bool

Determines whether classification labels are appended to transformed data.

Returns:

key property

key: str

Universally unique key of the entity.

Returns:

keys property

keys: Dict[str, str]

Dictionary of unique IDs for the related entities:

  • Dataset

    The unique ID of the dataset used to make predictions.

  • Experiment

    The unique ID of the experiment used to make predictions.

  • Prediction

    The unique ID of the predictions.

Returns:

name property

name: str

Name of the entity.

Returns:

is_complete

is_complete() -> bool

Whether the job has been completed successfully.

Returns:

  • bool

    True if the job has been completed successfully, otherwise False.

is_running

is_running() -> bool

Whether the job has been scheduled or is running, finishing, or syncing.

Returns:

  • bool

    True if the job has not completed yet, otherwise False.

result

result(silent: bool = False) -> Transformation

Wait for the job to complete, then return self.

Parameters:

  • silent (bool, default: False ) –

    If True, do not display status updates.

Returns:

status

status(verbose: int = None) -> str

Returns a short description of the job status.

Returns:

FitAndTransformation

Interact with fit and transformed data from the Driverless AI server.

fold_column property

fold_column: str

Column that creates the stratified validation split.

Returns:

seed property

seed: int

Random seed used to initialize the random number generator.

Returns:

test_dataset property

test_dataset: Optional[Dataset]

Test dataset used for this transformation.

Returns:

training_dataset property

training_dataset: Dataset

Training dataset used for this transformation.

Returns:

validation_dataset property

validation_dataset: Optional[Dataset]

Validation dataset used for this transformation.

Returns:

validation_split_fraction property

validation_split_fraction: float

Fraction of data used for validation.

Returns:

download_transformed_test_dataset

download_transformed_test_dataset(
    dst_dir: str = ".",
    dst_file: Optional[str] = None,
    file_system: Optional[AbstractFileSystem] = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Download fit and transformed test dataset in CSV format.

Parameters:

  • dst_dir (str, default: '.' ) –

    The path to the directory where the CSV file will be saved.

  • dst_file (Optional[str], default: None ) –

    The name of the CSV file (overrides default file name).

  • file_system (Optional[AbstractFileSystem], default: None ) –

    FSSPEC-based file system to download to, instead of the local file system.

  • overwrite (bool, default: False ) –

    Overwrite the existing file.

  • timeout (float, default: 30 ) –

    Connection timeout in seconds.

Returns:

download_transformed_training_dataset

download_transformed_training_dataset(
    dst_dir: str = ".",
    dst_file: Optional[str] = None,
    file_system: Optional[AbstractFileSystem] = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Download fit and transformed training dataset in CSV format.

Parameters:

  • dst_dir (str, default: '.' ) –

    The path to the directory where the CSV file will be saved.

  • dst_file (Optional[str], default: None ) –

    The name of the CSV file (overrides default file name).

  • file_system (Optional[AbstractFileSystem], default: None ) –

    FSSPEC-based file system to download to, instead of the local file system.

  • overwrite (bool, default: False ) –

    Overwrite the existing file.

  • timeout (float, default: 30 ) –

    Connection timeout in seconds.

Returns:

download_transformed_validation_dataset

download_transformed_validation_dataset(
    dst_dir: str = ".",
    dst_file: Optional[str] = None,
    file_system: Optional[AbstractFileSystem] = None,
    overwrite: bool = False,
    timeout: float = 30,
) -> str

Download fit and transformed validation dataset in CSV format.

Parameters:

  • dst_dir (str, default: '.' ) –

    The path to the directory where the CSV file will be saved.

  • dst_file (Optional[str], default: None ) –

    The name of the CSV file (overrides default file name).

  • file_system (Optional[AbstractFileSystem], default: None ) –

    FSSPEC-based file system to download to, instead of the local file system.

  • overwrite (bool, default: False ) –

    Overwrite the existing file.

  • timeout (float, default: 30 ) –

    Connection timeout in seconds.

Returns:
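
The three download methods above can be wrapped so that the optional splits are fetched only when they exist. This is a sketch under stated assumptions: `download_transformed_datasets` is a hypothetical helper, and skipping a split whose dataset property is None is an assumption about sensible usage, not documented client behavior.

```python
from typing import Dict


def download_transformed_datasets(fit, dst_dir: str = ".") -> Dict[str, str]:
    """Download the transformed training data plus any optional splits.

    `fit` is assumed to expose the properties and download methods
    documented above. The dictionary keys are illustrative labels.
    """
    paths = {"training": fit.download_transformed_training_dataset(dst_dir=dst_dir)}
    # validation_dataset and test_dataset are Optional, so check before downloading.
    if fit.validation_dataset is not None:
        paths["validation"] = fit.download_transformed_validation_dataset(dst_dir=dst_dir)
    if fit.test_dataset is not None:
        paths["test"] = fit.download_transformed_test_dataset(dst_dir=dst_dir)
    return paths
```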

FitAndTransformationJob

key property

key: str

Universally unique key of the entity.

Returns:

name property

name: str

Name of the entity.

Returns:

is_complete

is_complete() -> bool

Whether the job has been completed successfully.

Returns:

  • bool

    True if the job has been completed successfully, otherwise False.

is_running

is_running() -> bool

Whether the job has been scheduled or is running, finishing, or syncing.

Returns:

  • bool

    True if the job has not completed yet, otherwise False.

result

result(silent: bool = False) -> FitAndTransformation

Wait for the job to complete, then return self.

Parameters:

  • silent (bool, default: False ) –

    If True, do not display status updates.

Returns:

status

status(verbose: int = 0) -> str

Returns the status of the job.

Parameters:

  • verbose (int, default: 0 ) –
    • 0: A short description.
    • 1: A short description with a progress percentage.
    • 2: A detailed description with a progress percentage.

Returns:

  • str

    Current status of the job.