Experiments ¶
Interact with experiments in the Driverless AI server.
create ¶
create(
train_dataset: Dataset,
target_column: str | None,
task: str,
force: bool = False,
name: str = None,
description: str = None,
**kwargs: Any
) -> Experiment
Creates an experiment in the Driverless AI server.
Parameters:
-
train_dataset
(Dataset
) –Dataset object.
-
target_column
(str | None
) –Name of the target column in
train_dataset
(pass None
if task
is 'unsupervised'
). -
task
(str
) –One of
'regression'
, 'classification'
, or 'unsupervised'
. -
force
(bool
, default:False
) –Create a new experiment even if an experiment with the same name already exists.
-
name
(str
, default:None
) –The display name for the experiment.
-
description
(str
, default:None
) –Description of the experiment. (only available from Driverless AI version 2.0 onwards)
Other Parameters:
-
accuracy
(int
) –Accuracy setting [1-10].
-
time
(int
) –Time setting [1-10].
-
interpretability
(int
) –Interpretability setting [1-10].
-
scorer
(Union[str, ScorerRecipe]
) –Metric to optimize for.
-
models
(Union[str, ModelRecipe]
) –Limit experiments to these models.
-
transformers
(Union[str, TransformerRecipe]
) –Limit experiments to these transformers.
-
validation_dataset
(Dataset
) –Dataset object.
-
test_dataset
(Dataset
) –Dataset object.
-
weight_column
(str
) –Name of the column in
train_dataset
. -
fold_column
(str
) –Name of the column in
train_dataset
-
time_column
(str
) –Name of the column in
train_dataset
, containing time ordering for timeseries problems -
time_groups_columns
(List[str]
) –List of column names, contributing to time ordering.
-
unavailable_at_prediction_time_columns
(List[str]
) –List of column names, which won't be present at prediction time.
-
drop_columns
(List[str]
) –List of column names that need to be dropped.
-
enable_gpus
(bool
) –Allow the usage of GPUs in the experiment.
-
reproducible
(bool
) –Set the experiment to be reproducible.
-
time_period_in_seconds
(int
) –The length of the time period in seconds, used in the timeseries problems.
-
num_prediction_periods
(int
) –Timeseries forecast horizon in time period units.
-
num_gap_periods
(int
) –The number of time periods after which the forecast starts.
-
config_overrides
(str
) –Driverless AI config overrides in TOML string format.
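Because config_overrides takes a TOML string, it can help to build that string from a dictionary. The following is a minimal sketch: the helper to_config_overrides and the keys in the example are hypothetical placeholders, not real server settings, and only flat boolean, numeric, and string values are handled.

```python
import json


def to_config_overrides(settings: dict) -> str:
    """Render a flat dict of settings as a TOML string, one key per line."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            # TOML booleans are lowercase; bool must be checked before int.
            rendered = "true" if value else "false"
        elif isinstance(value, (int, float)):
            rendered = repr(value)
        elif isinstance(value, str):
            # For ordinary text, JSON string escaping also yields a valid
            # TOML basic string.
            rendered = json.dumps(value, ensure_ascii=False)
        else:
            raise TypeError(
                f"unsupported value type for {key!r}: {type(value).__name__}"
            )
        lines.append(f"{key} = {rendered}")
    return "\n".join(lines)


# Placeholder keys for illustration only; they are not real server settings.
overrides = to_config_overrides(
    {"example_flag": True, "example_limit": 4, "example_mode": "off"}
)
print(overrides)
```

The resulting string could then be passed as config_overrides=overrides.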
Example: Create an experiment.
import driverlessai

client = driverlessai.Client(
    address='http://localhost:12345', username='py', password='py'
)
train_dataset = client.datasets.create(
    data="s3://h2o-public-test-data/smalldata/airlines/AirlinesTrain.csv",
    data_source="s3",
    name="Airlines-data",
    description="My airline dataset",
)
test_dataset = client.datasets.create(
    data="s3://h2o-public-test-data/smalldata/airlines/AirlinesTest.csv",
    data_source="s3",
    name="Airlines-test-data",
    description="My airline test dataset",
)
# Keyword names match the signature above: train_dataset and target_column.
experiment = client.experiments.create(
    train_dataset=train_dataset,
    target_column=train_dataset.columns[-1],
    test_dataset=test_dataset,
    task='classification',
    scorer='F1',
    accuracy=5,
    time=5,
    interpretability=5,
    name='demo_day_experiment',
)
Note
Any expert setting can also be passed as a kwarg
.
To search possible expert settings for your server version,
use experiments.search_expert_settings(search_term)
.
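The documented constraints on task, target_column, and the three [1-10] settings can be checked on the client side before a call is made. The sketch below is one possible way to do that; build_create_kwargs is a hypothetical helper and not part of the client library, and it checks only the rules stated above.

```python
from typing import Any, Dict, Optional

VALID_TASKS = ("regression", "classification", "unsupervised")


def build_create_kwargs(
    task: str,
    target_column: Optional[str],
    accuracy: Optional[int] = None,
    time: Optional[int] = None,
    interpretability: Optional[int] = None,
    **expert_settings: Any,
) -> Dict[str, Any]:
    """Validate the documented constraints and return keyword arguments."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    if task == "unsupervised" and target_column is not None:
        raise ValueError("pass target_column=None when task is 'unsupervised'")

    kwargs: Dict[str, Any] = {"task": task, "target_column": target_column}
    dials = {"accuracy": accuracy, "time": time, "interpretability": interpretability}
    for name, value in dials.items():
        if value is None:
            continue  # omit the setting entirely when it was not given
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be in [1, 10], got {value}")
        kwargs[name] = value

    # Any remaining keyword arguments are passed through unchanged.
    kwargs.update(expert_settings)
    return kwargs


example_kwargs = build_create_kwargs("classification", "label", accuracy=7, scorer="F1")
print(example_kwargs)
```

The returned dictionary could be unpacked into a call such as client.experiments.create(train_dataset=train_dataset, **example_kwargs).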
create_async ¶
create_async(
train_dataset: Dataset,
target_column: str | None,
task: str,
force: bool = False,
name: str = None,
description: str | None = None,
**kwargs: Any
) -> Experiment
Launches the creation of an experiment in the Driverless AI server and returns an experiment object to track the experiment status.
Parameters:
-
train_dataset
(Dataset
) –Dataset object.
-
target_column
(str | None
) –Name of the target column in
train_dataset
(pass None
if task
is 'unsupervised'
). -
task
(str
) –One of
'regression'
, 'classification'
, or 'unsupervised'
. -
force
(bool
, default:False
) –Create a new experiment even if an experiment with the same name already exists.
-
name
(str
, default:None
) –The display name for the experiment.
-
description
(str | None
, default:None
) –Description of the experiment. (only available from Driverless AI version 2.0 onwards)
Other Parameters:
-
accuracy
(int
) –Accuracy setting [1-10].
-
time
(int
) –Time setting [1-10].
-
interpretability
(int
) –Interpretability setting [1-10].
-
scorer
(Union[str, ScorerRecipe]
) –Metric to optimize for.
-
models
(Union[str, ModelRecipe]
) –Limit experiments to these models.
-
transformers
(Union[str, TransformerRecipe]
) –Limit experiments to these transformers.
-
validation_dataset
(Dataset
) –Dataset object.
-
test_dataset
(Dataset
) –Dataset object.
-
weight_column
(str
) –Name of the column in
train_dataset
. -
fold_column
(str
) –Name of the column in
train_dataset
-
time_column
(str
) –Name of the column in
train_dataset
, containing time ordering for timeseries problems -
time_groups_columns
(List[str]
) –List of column names, contributing to time ordering.
-
unavailable_at_prediction_time_columns
(List[str]
) –List of column names, which won't be present at prediction time.
-
drop_columns
(List[str]
) –List of column names that need to be dropped.
-
enable_gpus
(bool
) –Allow the usage of GPUs in the experiment.
-
reproducible
(bool
) –Set the experiment to be reproducible.
-
time_period_in_seconds
(int
) –The length of the time period in seconds, used in the timeseries problems.
-
num_prediction_periods
(int
) –Timeseries forecast horizon in time period units.
-
num_gap_periods
(int
) –The number of time periods after which the forecast starts.
-
config_overrides
(str
) –Driverless AI config overrides in TOML string format.
Example: Create an async experiment.
import driverlessai

client = driverlessai.Client(
    address='http://localhost:12345', username='py', password='py'
)
train_dataset = client.datasets.create(
    data="s3://h2o-public-test-data/smalldata/airlines/AirlinesTrain.csv",
    data_source="s3",
    name="Airlines-data",
    description="My airline dataset",
)
test_dataset = client.datasets.create(
    data="s3://h2o-public-test-data/smalldata/airlines/AirlinesTest.csv",
    data_source="s3",
    name="Airlines-test-data",
    description="My airline test dataset",
)
# Keyword names match the signature above: train_dataset and target_column.
experiment = client.experiments.create_async(
    train_dataset=train_dataset,
    target_column=train_dataset.columns[-1],
    test_dataset=test_dataset,
    task='classification',
    scorer='F1',
    accuracy=5,
    time=5,
    interpretability=5,
    name='demo_day_experiment',
)
Note
Any expert setting can also be passed as a kwarg
.
To search possible expert settings for your server version,
use experiments.search_expert_settings(search_term)
.
get ¶
get(key: str) -> Experiment
Returns an Experiment object corresponding to an experiment on the Driverless AI server. If the experiment only exists on H2O.ai Storage, it will be imported to the server first.
Parameters:
-
key
(str
) –Driverless AI server's unique ID for the experiment.
get_by_name ¶
get_by_name(name: str) -> Experiment | None
Get the experiment specified by the name.
Parameters:
-
name
(str
) –Name of the experiment.
Returns:
-
Experiment | None
–The experiment with the name if exists, otherwise
None
.
Beta API
A beta API that is subject to future changes.
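Since get_by_name returns None when no experiment matches, a common pattern is to reuse an existing experiment and only create one when the lookup fails. The sketch below illustrates that pattern with a stand-in object; get_or_create and _FakeExperiments are hypothetical names used only for demonstration.

```python
from typing import Any, Callable


def get_or_create(experiments: Any, name: str, create: Callable[[], Any]) -> Any:
    """Return the experiment called `name`, or build one via `create()`."""
    existing = experiments.get_by_name(name)
    if existing is not None:
        return existing
    return create()


class _FakeExperiments:
    """Stand-in for client.experiments, for demonstration only."""

    def __init__(self, known: dict):
        self._known = known

    def get_by_name(self, name: str):
        return self._known.get(name)


fake = _FakeExperiments({"demo_day_experiment": "existing-experiment"})
found = get_or_create(fake, "demo_day_experiment", lambda: "new-experiment")
created = get_or_create(fake, "another_name", lambda: "new-experiment")
print(found, created)
```

With a real client, the create callable would wrap a call to experiments.create with the desired arguments.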
gui ¶
gui() -> Hyperlink
Returns the complete URL to the experiment details page in the Driverless AI server.
import_dai_file ¶
import_dai_file(
path: str, file_system: AbstractFileSystem | None = None
) -> Experiment
Imports a DAI file to the Driverless AI server and returns a corresponding Experiment object.
Parameters:
-
path
(str
) –Path to the
.dai
file. -
file_system
(AbstractFileSystem | None
, default:None
) –The FSSPEC based file system to download from, instead of the local file system.
leaderboard ¶
leaderboard(
train_dataset: Dataset,
target_column: str | None,
task: str,
force: bool = False,
name: str = None,
**kwargs: Any
) -> Project
Launches an experiment leaderboard in the Driverless AI server and returns a project object to track experiment statuses.
Parameters:
-
train_dataset
(Dataset
) –Dataset object.
-
target_column
(str | None
) –Name of the target column in
train_dataset
(pass None
if task
is 'unsupervised'
). -
task
(str
) –One of
'regression'
, 'classification'
, or 'unsupervised'
. -
force
(bool
, default:False
) –Create a new project even if a project with the same name already exists.
-
name
(str
, default:None
) –The display name for the project.
Other Parameters:
-
accuracy
(int
) –Accuracy setting [1-10].
-
time
(int
) –Time setting [1-10].
-
interpretability
(int
) –Interpretability setting [1-10].
-
scorer
(Union[str, ScorerRecipe]
) –Metric to optimize for.
-
models
(Union[str, ModelRecipe]
) –Limit experiments to these models.
-
transformers
(Union[str, TransformerRecipe]
) –Limit experiments to these transformers.
-
validation_dataset
(Dataset
) –Dataset object.
-
test_dataset
(Dataset
) –Dataset object.
-
weight_column
(str
) –Name of the column in
train_dataset
. -
fold_column
(str
) –Name of the column in
train_dataset
-
time_column
(str
) –Name of the column in
train_dataset
, containing time ordering for timeseries problems -
time_groups_columns
(List[str]
) –List of column names, contributing to time ordering.
-
unavailable_at_prediction_time_columns
(List[str]
) –List of column names, which won't be present at prediction time.
-
drop_columns
(List[str]
) –List of column names that need to be dropped.
-
enable_gpus
(bool
) –Allow the usage of GPUs in the experiment.
-
reproducible
(bool
) –Set the experiment to be reproducible.
-
time_period_in_seconds
(int
) –The length of the time period in seconds, used in the timeseries problems.
-
num_prediction_periods
(int
) –Timeseries forecast horizon in time period units.
-
num_gap_periods
(int
) –The number of time periods after which the forecast starts.
-
config_overrides
(str
) –Driverless AI config overrides in TOML string format.
Returns:
-
Project
–A project object to track experiment statuses.
Note
Any expert setting can also be passed as a kwarg
.
To search possible expert settings for your server version,
use experiments.search_expert_settings(search_term)
.
list ¶
list(start_index: int = 0, count: int = None) -> Sequence[Experiment]
List of Experiment objects available to the user.
Parameters:
Returns:
-
Sequence[Experiment]
–Experiments.
preview ¶
preview(
train_dataset: Dataset,
target_column: str | None,
task: str,
force: bool | None = None,
name: str | None = None,
**kwargs: Any
) -> None
Prints a preview of the experiment for the given settings.
Parameters:
-
train_dataset
(Dataset
) –Dataset object.
-
target_column
(str | None
) –Name of the target column in
train_dataset
(pass None
if task
is 'unsupervised'
). -
task
(str
) –One of
'regression'
, 'classification'
, or 'unsupervised'
. -
force
(bool | None
, default:None
) –Ignored (
preview
accepts the same arguments as create
). -
name
(str | None
, default:None
) –Ignored (
preview
accepts the same arguments as create
).
Other Parameters:
-
accuracy
(int
) –Accuracy setting [1-10].
-
time
(int
) –Time setting [1-10].
-
interpretability
(int
) –Interpretability setting [1-10].
-
scorer
(Union[str, ScorerRecipe]
) –Metric to optimize for.
-
models
(Union[str, ModelRecipe]
) –Limit experiments to these models.
-
transformers
(Union[str, TransformerRecipe]
) –Limit experiments to these transformers.
-
validation_dataset
(Dataset
) –Dataset object.
-
test_dataset
(Dataset
) –Dataset object.
-
weight_column
(str
) –Name of the column in
train_dataset
. -
fold_column
(str
) –Name of the column in
train_dataset
-
time_column
(str
) –Name of the column in
train_dataset
, containing time ordering for timeseries problems -
time_groups_columns
(List[str]
) –List of column names, contributing to time ordering.
-
unavailable_at_prediction_time_columns
(List[str]
) –List of column names, which won't be present at prediction time.
-
drop_columns
(List[str]
) –List of column names that need to be dropped.
-
enable_gpus
(bool
) –Allow the usage of GPUs in the experiment.
-
reproducible
(bool
) –Set the experiment to be reproducible.
-
time_period_in_seconds
(int
) –The length of the time period in seconds, used in the timeseries problems.
-
num_prediction_periods
(int
) –Timeseries forecast horizon in time period units.
-
num_gap_periods
(int
) –The number of time periods after which the forecast starts.
-
config_overrides
(str
) –Driverless AI config overrides in TOML string format.
Note
Any expert setting can also be passed as a kwarg
.
To search possible expert settings for your server version,
use experiments.search_expert_settings(search_term)
.
search_expert_settings ¶
Experiment ¶
Interact with an experiment in the Driverless AI server.
artifacts
property
¶
artifacts: ExperimentArtifacts
Artifacts that are created from a completed experiment.
Returns:
creation_timestamp
property
¶
creation_timestamp: float
Creation timestamp in seconds since the epoch (POSIX timestamp).
Returns:
-
float
–
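A POSIX timestamp is easier to read once converted to a timezone-aware datetime. A minimal sketch using only the standard library follows; timestamp_to_utc is a hypothetical helper name.

```python
from datetime import datetime, timezone


def timestamp_to_utc(creation_timestamp: float) -> datetime:
    """Convert seconds since the epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(creation_timestamp, tz=timezone.utc)


print(timestamp_to_utc(0.0).isoformat())  # 1970-01-01T00:00:00+00:00
```

With an Experiment object, the input would be experiment.creation_timestamp.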
datasets
property
¶
Dictionary of train_dataset
, validation_dataset
, and
test_dataset
used for the experiment.
Example: Get train/valid/test datasets in the experiment.
datasets = experiment.datasets  # a property, so it is accessed without parentheses
train_dataset = datasets["train_dataset"]
validation_dataset = datasets["validation_dataset"]
test_dataset = datasets["test_dataset"]
Returns:
description
property
¶
description: str | None
Description of the experiment.
Driverless AI version requirement
Requires Driverless AI server 2.0 or higher.
Returns:
-
str | None
–
is_deprecated
property
¶
is_deprecated: bool
True
if experiment was created by an old version of
Driverless AI and is no longer fully compatible with the current
server version.
Returns:
-
bool
–
metric_plots
property
¶
metric_plots: ExperimentMetricPlots | None
Metric plots of this experiment.
Beta API
A beta API that is subject to future changes.
Returns:
-
ExperimentMetricPlots | None
–
size
property
¶
size: int
Size in bytes of all experiment's files in the Driverless AI server.
Returns:
-
int
–
summary
property
¶
summary: str | None
An experiment summary that provides a brief overview of the experiment setup and results.
Returns:
-
str | None
–
autodoc ¶
autodoc() -> AutoDoc
Returns the autodoc generated for this experiment. If it has not been generated yet, creates a new autodoc and returns it.
compare_settings_with ¶
compare_settings_with(experiment_to_compare_with: Experiment) -> Table
Compares settings of the experiment with another experiment.
Parameters:
-
experiment_to_compare_with
(Experiment
) –The experiment to compare the settings with.
Returns:
compare_setup_with ¶
compare_setup_with(experiment_to_compare_with: Experiment) -> dict[str, Table]
Compares the setup of the experiment with another given experiment.
Parameters:
-
experiment_to_compare_with
(Experiment
) –The experiment to compare the setups with.
export_dai_file ¶
export_dai_file(
dst_dir: str = ".",
dst_file: str | None = None,
file_system: AbstractFileSystem | None = None,
overwrite: bool = False,
timeout: float = 30,
) -> str
Exports the experiment from the Driverless AI server in DAI format.
Parameters:
-
dst_dir
(str
, default:'.'
) –The path to the directory where the DAI file will be saved.
-
dst_file
(str | None
, default:None
) –The name of the DAI file (overrides default file name).
-
file_system
(AbstractFileSystem | None
, default:None
) –The FSSPEC based file system to download to, instead of the local file system.
-
overwrite
(bool
, default:False
) –Overwrite the existing file.
-
timeout
(float
, default:30
) –Connection timeout in seconds.
export_triton_model ¶
export_triton_model(
deploy_predictions: bool = True,
deploy_shapley: bool = False,
deploy_original_shapley: bool = False,
enable_high_concurrency: bool = False,
) -> TritonModelArtifact
Exports the model of this experiment as a Triton model.
Parameters:
-
deploy_predictions
(bool
, default:True
) –Whether to deploy model predictions.
-
deploy_shapley
(bool
, default:False
) –Whether to deploy model Shapley values.
-
deploy_original_shapley
(bool
, default:False
) –Whether to deploy model original Shapley values.
-
enable_high_concurrency
(bool
, default:False
) –Whether to enable handling multiple requests at once.
Returns:
-
TritonModelArtifact
–A Triton model.
Beta API
A beta API that is subject to future changes.
finish ¶
finish() -> None
Finish experiment by jumping to final pipeline training and generating experiment artifacts.
fit_and_transform ¶
fit_and_transform(
training_dataset: Dataset,
validation_split_fraction: float = 0,
seed: int = 1234,
fold_column: str = None,
test_dataset: Dataset = None,
validation_dataset: Dataset = None,
) -> FitAndTransformation
Transform a dataset, then return a FitAndTransformation object.
Parameters:
-
training_dataset
(Dataset
) –The dataset to be used for refitting the data transformation pipeline.
-
validation_split_fraction
(float
, default:0
) –The fraction of data used for validation.
-
seed
(int
, default:1234
) –The seed for the random number generator.
-
fold_column
(str
, default:None
) –The column to create a stratified validation split.
-
test_dataset
(Dataset
, default:None
) –The dataset to be used for final testing.
-
validation_dataset
(Dataset
, default:None
) –The dataset to be used for tuning model parameters.
fit_and_transform_async ¶
fit_and_transform_async(
training_dataset: Dataset,
validation_split_fraction: float = 0,
seed: int = 1234,
fold_column: str = None,
test_dataset: Dataset = None,
validation_dataset: Dataset = None,
) -> FitAndTransformationJob
Launch transform job on a dataset and return a FitAndTransformationJob object to track the status.
Parameters:
-
training_dataset
(Dataset
) –The dataset to be used for refitting the data transformation pipeline.
-
validation_split_fraction
(float
, default:0
) –The fraction of data used for validation.
-
seed
(int
, default:1234
) –The seed for the random number generator.
-
fold_column
(str
, default:None
) –The column to create a stratified validation split.
-
test_dataset
(Dataset
, default:None
) –The dataset to be used for final testing.
-
validation_dataset
(Dataset
, default:None
) –The dataset to be used for tuning model parameters.
get_linked_projects ¶
Get all the projects that the current experiment belongs to.
Driverless AI version requirement
Requires Driverless AI server 1.10.5 or higher.
get_previous_predictions ¶
get_previous_predictions() -> list[Prediction]
Get all previous predictions of the current experiment.
Beta API
A beta API that is subject to future changes.
Driverless AI version requirement
Requires Driverless AI server 1.11.0 or higher.
gui ¶
gui() -> Hyperlink
Obtains the complete URL for the experiment's page in the Driverless AI server.
is_complete ¶
is_complete() -> bool
Whether the job has been completed successfully.
Returns:
-
bool
–True
if the job has been completed successfully, otherwise False
.
is_running ¶
is_running() -> bool
Whether the job has been scheduled or is running, finishing, or syncing.
Returns:
-
bool
–True
if the job has not completed yet, otherwise False
.
metrics ¶
Return dictionary of experiment scorer metrics and AUC metrics, if available.
notifications ¶
Return list of experiment notification dictionaries.
predict ¶
predict(
dataset: Dataset | DataFrame,
enable_mojo: bool = True,
include_columns: list[str] | None = None,
include_labels: bool | None = None,
include_raw_outputs: bool | None = None,
include_shap_values_for_original_features: bool | None = None,
include_shap_values_for_transformed_features: bool | None = None,
use_fast_approx_for_shap_values: bool | None = None,
) -> Prediction
Predict on a dataset, then return a Prediction object.
Parameters:
-
dataset
(Dataset | DataFrame
) –A Dataset or a Pandas DataFrame that can be predicted.
-
enable_mojo
(bool
, default:True
) –Use MOJO (if available) to make predictions.
-
include_columns
(list[str] | None
, default:None
) –The list of columns from the dataset to append to the prediction CSV.
-
include_labels
(bool | None
, default:None
) –Append labels in addition to probabilities for classification, ignored for regression.
-
include_raw_outputs
(bool | None
, default:None
) –Append predictions as margins (in link space) to the prediction CSV.
-
include_shap_values_for_original_features
(bool | None
, default:None
) –Append original feature contributions to the prediction CSV.
-
include_shap_values_for_transformed_features
(bool | None
, default:None
) –Append transformed feature contributions to the prediction CSV.
-
use_fast_approx_for_shap_values
(bool | None
, default:None
) –Speed up prediction contributions with approximation.
predict_async ¶
predict_async(
dataset: Dataset | DataFrame,
enable_mojo: bool = True,
include_columns: list[str] | None = None,
include_labels: bool | None = None,
include_raw_outputs: bool | None = None,
include_shap_values_for_original_features: bool | None = None,
include_shap_values_for_transformed_features: bool | None = None,
use_fast_approx_for_shap_values: bool | None = None,
) -> PredictionJobs
Launch prediction job on a dataset and return a PredictionJobs object to track the status.
Parameters:
-
dataset
(Dataset | DataFrame
) –A Dataset or a Pandas DataFrame that can be predicted.
-
enable_mojo
(bool
, default:True
) –Use MOJO (if available) to make predictions.
-
include_columns
(list[str] | None
, default:None
) –The list of columns from the dataset to append to the prediction CSV.
-
include_labels
(bool | None
, default:None
) –Append labels in addition to probabilities for classification, ignored for regression.
-
include_raw_outputs
(bool | None
, default:None
) –Append predictions as margins (in link space) to the prediction CSV.
-
include_shap_values_for_original_features
(bool | None
, default:None
) –Append original feature contributions to the prediction CSV.
-
include_shap_values_for_transformed_features
(bool | None
, default:None
) –Append transformed feature contributions to the prediction CSV.
-
use_fast_approx_for_shap_values
(bool | None
, default:None
) –Speed up prediction contributions with approximation.
redescribe ¶
redescribe(description: str) -> Experiment
Change experiment description.
Parameters:
-
description
(str
) –New description.
Driverless AI version requirement
Requires Driverless AI server 2.0 or higher.
rename ¶
rename(name: str) -> Experiment
Change experiment display name.
Parameters:
-
name
(str
) –New display name.
result ¶
result(silent: bool = False) -> Experiment
Wait for training to complete, then return self.
Parameters:
-
silent
(bool
, default:False
) –If True, do not display status updates.
retrain ¶
retrain(
use_smart_checkpoint: bool = False,
final_pipeline_only: bool = False,
final_models_only: bool = False,
**kwargs: Any
) -> Experiment
Create a new experiment using the same datasets and settings. Through
kwargs
it's possible to pass new datasets or overwrite settings.
Parameters:
-
use_smart_checkpoint
(bool
, default:False
) –Start the experiment from the last smart checkpoint.
-
final_pipeline_only
(bool
, default:False
) –Trains the final pipeline using smart checkpoint if available, otherwise uses default hyperparameters.
-
final_models_only
(bool
, default:False
) –Trains the final pipeline models (but not transformers) using smart checkpoint if available, otherwise uses default hyperparameters and transformers (overrides
final_pipeline_only
). -
kwargs
(Any
, default:{}
) –Datasets and experiment settings as defined in
experiments.create()
.
retrain_async ¶
retrain_async(
use_smart_checkpoint: bool = False,
final_pipeline_only: bool = False,
final_models_only: bool = False,
**kwargs: Any
) -> Experiment
Launch creation of a new experiment using the same datasets and
settings. Through kwargs
it's possible to pass new datasets or
overwrite settings.
Parameters:
-
use_smart_checkpoint
(bool
, default:False
) –Start the experiment from the last smart checkpoint.
-
final_pipeline_only
(bool
, default:False
) –Trains the final pipeline using smart checkpoint if available, otherwise uses default hyperparameters.
-
final_models_only
(bool
, default:False
) –Trains the final pipeline models (but not transformers) using smart checkpoint if available, otherwise uses default hyperparameters and transformers (overrides
final_pipeline_only
). -
kwargs
(Any
, default:{}
) –Datasets and experiment settings as defined in
experiments.create()
.
status ¶
to_dict ¶
Dumps experiment metadata to a Python dictionary.
Beta API
A beta API that is subject to future changes.
transform ¶
transform(
dataset: Dataset,
enable_mojo: bool = True,
include_columns: list[str] | None = None,
include_labels: bool | None = True,
) -> Transformation
Transform a dataset, then return a Transformation object.
Parameters:
-
dataset
(Dataset
) –A Dataset that can be transformed.
-
enable_mojo
(bool
, default:True
) –Use MOJO (if available) to make transformation.
-
include_columns
(list[str] | None
, default:None
) –List of columns from the dataset to append to the prediction CSV.
-
include_labels
(bool | None
, default:True
) –Append labels in addition to probabilities for classification, ignored for regression.
transform_async ¶
transform_async(
dataset: Dataset,
enable_mojo: bool = True,
include_columns: list[str] | None = None,
include_labels: bool | None = None,
) -> TransformationJob
Launch transform job on a dataset and return a TransformationJob object to track the status.
Parameters:
-
dataset
(Dataset
) –A Dataset that can be transformed.
-
enable_mojo
(bool
, default:True
) –Use MOJO (if available) to make transformation.
-
include_columns
(list[str] | None
, default:None
) –List of columns from the dataset to append to the prediction CSV.
-
include_labels
(bool | None
, default:None
) –Append labels in addition to probabilities for classification, ignored for regression.
Driverless AI version requirement
Requires Driverless AI server 1.10.4.1 or higher.
variable_importance ¶
ExperimentMetricPlots ¶
Interact with the metric plots of an experiment in the Driverless AI server.
actual_vs_predicted_chart
property
¶
Actual vs predicted chart for the model.
Returns:
-
dict[str, Any] | None
–An actual vs predicted chart in Vega Lite (v3) format, or
None
if the model is a classification model.
gains_chart
property
¶
Cumulative gains chart for the model.
Returns:
-
dict[str, Any] | None
–A cumulative gains chart in Vega Lite (v3) format, or
None
if the model is not a classification model.
ks_chart
property
¶
Kolmogorov-Smirnov chart of the model.
Returns:
-
dict[str, Any] | None
–A Kolmogorov-Smirnov chart in Vega Lite (v3) format, or
None
if the model is not a classification model.
lift_chart
property
¶
Lift chart of the model.
Returns:
-
dict[str, Any] | None
–A lift chart in Vega Lite (v3) format, or
None
if the model is not a classification model.
prec_recall_curve
property
¶
Precision-recall curve of the model.
Returns:
-
dict[str, Any] | None
–A precision-recall curve in Vega Lite (v3) format, or
None
if the model is not a classification model.
residual_plot
property
¶
Residual plot with LOESS curve of the model.
Returns:
-
dict[str, Any] | None
–A residual plot in Vega Lite (v3) format, or
None
if the model is a classification model.
roc_curve
property
¶
ROC curve of the model.
Returns:
-
dict[str, Any] | None
–A ROC curve in Vega Lite (v3) format, or
None
if the model is not a classification model.
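Each chart property returns either a Vega-Lite dictionary or None, so code that stores charts should handle both cases. The following sketch writes a chart to a JSON file and skips missing charts. save_chart is a hypothetical helper, and the chart content in the demonstration is a made-up stand-in rather than real server output.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


def save_chart(chart: Optional[Dict[str, Any]], path: str) -> bool:
    """Write a chart dictionary as JSON; return False when there is no chart."""
    if chart is None:
        return False
    Path(path).write_text(json.dumps(chart, indent=2), encoding="utf-8")
    return True


# Demonstration with a stand-in chart dictionary.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "roc_curve.json"
    written = save_chart({"mark": "line"}, str(target))
    round_trip = json.loads(target.read_text(encoding="utf-8"))
    skipped = save_chart(None, str(Path(tmp) / "missing.json"))

print(written, round_trip, skipped)
```

With a real experiment, the first argument would be a property such as experiment.metric_plots.roc_curve.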
confusion_matrix ¶
ExperimentArtifacts ¶
Interact with files created by an experiment in the Driverless AI server.
file_paths
property
¶
create ¶
create(artifact: str) -> None
(Re)build certain artifacts, if possible.
(re)buildable artifacts:
'autodoc'
'mojo_pipeline'
'python_pipeline'
Parameters:
-
artifact
(str
) –The name of the artifact to (re)build.
download ¶
download(
only: str | list[str] = None,
dst_dir: str = ".",
file_system: AbstractFileSystem | None = None,
include_columns: list[str] | None = None,
overwrite: bool = False,
timeout: float = 30,
) -> dict[str, str]
Download experiment artifacts from the Driverless AI server. Returns a dictionary of relative paths for the downloaded artifacts.
Parameters:
-
only
(str | list[str]
, default:None
) –The artifacts to download. Use
experiment.artifacts.list()
to see the available artifacts in the Driverless AI server. -
dst_dir
(str
, default:'.'
) –The path to the directory where the experiment artifacts will be saved.
-
file_system
(AbstractFileSystem | None
, default:None
) –FSSPEC based file system to download to, instead of local file system.
-
include_columns
(list[str] | None
, default:None
) –The list of dataset columns to append to prediction CSVs.
-
overwrite
(bool
, default:False
) –Overwrite the existing file.
-
timeout
(float
, default:30
) –Connection timeout in seconds.
export ¶
export(
only: str | list[str] | None = None,
include_columns: list[str] | None = None,
**kwargs: Any
) -> dict[str, str]
Export experiment artifacts from the Driverless AI server. Returns a dictionary of relative paths for the exported artifacts.
Parameters:
Note
Export location is configured in the Driverless AI server.
ExperimentLog ¶
Interact with experiment logs.
download ¶
download(
archive: bool = True,
dst_dir: str = ".",
dst_file: str | None = None,
file_system: AbstractFileSystem | None = None,
overwrite: bool = False,
timeout: float = 30,
) -> str
Download experiment logs from the Driverless AI server.
Parameters:
-
archive
(bool
, default:True
) –If available, it is recommended to download an archive that contains multiple log files and stack traces if any were created.
-
dst_dir
(str
, default:'.'
) –The path to the directory where the logs will be saved.
-
dst_file
(str | None
, default:None
) –The name of the log file (overrides default file name).
-
file_system
(AbstractFileSystem | None
, default:None
) –FSSPEC based file system to download to, instead of the local file system.
-
overwrite
(bool
, default:False
) –Overwrite the existing file.
-
timeout
(float
, default:30
) –Connection timeout in seconds.
PredictionJobs ¶
Monitor the creation of predictions in the Driverless AI server.
included_dataset_columns
property
¶
includes_labels
property
¶
includes_labels: bool
Determines whether classification labels are appended to predictions.
Returns:
-
bool
–
includes_raw_outputs
property
¶
includes_raw_outputs: bool
Whether predictions as margins (in link space) are appended to predictions.
Returns:
-
bool
–
includes_shap_values_for_original_features
property
¶
includes_shap_values_for_original_features: bool
Whether original feature contributions are appended to predictions.
Returns:
-
bool
–
includes_shap_values_for_transformed_features
property
¶
includes_shap_values_for_transformed_features: bool
Whether transformed feature contributions are appended to predictions.
Returns:
-
bool
–
keys
property
¶
used_fast_approx_for_shap_values
property
¶
used_fast_approx_for_shap_values: bool | None
Whether approximation was used to calculate prediction contributions.
Returns:
-
bool | None
–
is_complete ¶
is_complete() -> bool
Whether all jobs have been completed successfully.
Returns:
-
bool
–True
if all jobs have been completed successfully, otherwise False
.
is_running ¶
is_running() -> bool
Whether one or more jobs have been scheduled or are running, finishing, or syncing.
Returns:
-
bool
–True
if one or more jobs have not completed yet, otherwise False
.
result ¶
result(silent: bool = False) -> Prediction
Waits for the job to complete.
Parameters:
-
silent
(bool
, default:False
) –If True, do not display status updates.
Returns:
-
Prediction
–The Prediction job results.
status ¶
Prediction ¶
Interact with predictions from the Driverless AI server.
file_paths
property
¶
included_dataset_columns
property
¶
includes_labels
property
¶
includes_labels: bool
Determines whether classification labels are appended to predictions.
Returns:
-
bool
–
includes_raw_outputs
property
¶
includes_raw_outputs: bool
Determines whether predictions as margins (in link space) were appended to predictions.
Returns:
-
bool
–
includes_shap_values_for_original_features
property
¶
includes_shap_values_for_original_features: bool
Determines whether original feature contributions are appended to predictions.
Returns:
-
bool
–
includes_shap_values_for_transformed_features
property
¶
includes_shap_values_for_transformed_features: bool
Determines whether transformed feature contributions are appended to predictions.
Returns:
-
bool
–
keys
property
¶
used_fast_approx_for_shap_values
property
¶
used_fast_approx_for_shap_values: bool | None
Whether approximation was used to calculate prediction contributions.
Returns:
-
bool | None
–
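The boolean properties above can be gathered into one dictionary for logging or quick inspection. This is a minimal sketch: `describe_prediction` is a hypothetical helper, and the `SimpleNamespace` object only imitates a Prediction that exposes the documented property names.

```python
from types import SimpleNamespace

# Property names taken from the reference entries above.
FLAG_NAMES = (
    "includes_labels",
    "includes_raw_outputs",
    "includes_shap_values_for_original_features",
    "includes_shap_values_for_transformed_features",
    "used_fast_approx_for_shap_values",
)


def describe_prediction(prediction):
    """Return a dict mapping each documented flag name to its current value."""
    return {name: getattr(prediction, name) for name in FLAG_NAMES}


# Stand-in object for illustration; a real Prediction comes from the server.
demo = SimpleNamespace(
    includes_labels=True,
    includes_raw_outputs=False,
    includes_shap_values_for_original_features=True,
    includes_shap_values_for_transformed_features=False,
    used_fast_approx_for_shap_values=None,
)
summary = describe_prediction(demo)
```

Note that `used_fast_approx_for_shap_values` is typed `bool | None`, so code consuming the summary should treat `None` as a distinct third state rather than as `False`.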
download ¶
download(
dst_dir: str = ".",
dst_file: str | None = None,
file_system: AbstractFileSystem | None = None,
overwrite: bool = False,
timeout: float = 30,
) -> str
Downloads the predictions of the experiment in CSV format.
Parameters:
-
dst_dir
(str
, default:'.'
) –The path to the directory where the CSV file will be saved.
-
dst_file
(str | None
, default:None
) –The name of the CSV file (overrides default file name).
-
file_system
(AbstractFileSystem | None
, default:None
) –fsspec-based file system to download to, instead of the local file system.
-
overwrite
(bool
, default:False
) –Overwrite the existing file.
-
timeout
(float
, default:30
) –Connection timeout in seconds.
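A common pattern is to download into a temporary directory and parse the CSV right away. The sketch below assumes the string returned by `download` is the path of the saved file (the reference lists only the return type). `download_and_read` and `FakePrediction` are hypothetical; the fake writes a tiny CSV locally instead of contacting a server.

```python
import csv
import os
import tempfile


def download_and_read(result, dst_file="predictions.csv"):
    """Download into a temporary directory and return the CSV rows as dicts."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Assumption: download() returns the path of the file it saved.
        path = result.download(dst_dir=tmp_dir, dst_file=dst_file, overwrite=True)
        with open(path, newline="") as handle:
            return list(csv.DictReader(handle))


class FakePrediction:
    """Hypothetical stand-in whose download() mirrors the documented signature."""

    def download(self, dst_dir=".", dst_file=None, file_system=None,
                 overwrite=False, timeout=30):
        path = os.path.join(dst_dir, dst_file or "fake_predictions.csv")
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["id", "prediction"])
            writer.writerow(["1", "0.25"])
        return path


rows = download_and_read(FakePrediction())
```

The rows are read before the temporary directory is removed, so nothing is left on disk once the helper returns.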
Transformation ¶
Interact with transformed data from the Driverless AI server.
file_path
property
¶
file_path: str
Path to the transformed CSV file on the server.
Returns:
-
str
–
included_dataset_columns
property
¶
includes_labels
property
¶
includes_labels: bool
Determines whether classification labels are appended to transformed data.
Returns:
-
bool
–
keys
property
¶
download ¶
download(
dst_dir: str = ".",
dst_file: str | None = None,
file_system: AbstractFileSystem | None = None,
overwrite: bool = False,
timeout: float = 30,
) -> str
Downloads a CSV of transformed data.
Parameters:
-
dst_dir
(str
, default:'.'
) –The path to the directory where the CSV file will be saved.
-
dst_file
(str | None
, default:None
) –The name of the CSV file (overrides default file name).
-
file_system
(AbstractFileSystem | None
, default:None
) –fsspec-based file system to download to, instead of the local file system.
-
overwrite
(bool
, default:False
) –Overwrite the existing file.
-
timeout
(float
, default:30
) –Connection timeout in seconds.
TransformationJob ¶
Monitor the creation of a data transformation in the Driverless AI server.
included_dataset_columns
property
¶
includes_labels
property
¶
includes_labels: bool
Determines whether classification labels are appended to transformed data.
Returns:
-
bool
–
keys
property
¶
is_complete ¶
is_complete() -> bool
Whether the job has been completed successfully.
Returns:
-
bool
–True if the job has been completed successfully, otherwise False.
is_running ¶
is_running() -> bool
Whether the job has been scheduled or is running, finishing, or syncing.
Returns:
-
bool
–True if the job has not completed yet, otherwise False.
result ¶
result(silent: bool = False) -> Transformation
Waits for the job to complete, then returns the resulting Transformation.
Parameters:
-
silent
(bool
, default:False
) –If True, do not display status updates.
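Because `result` waits for the job, code that must not block can check `is_complete()` first. This is a minimal sketch under that assumption: `result_if_ready` and `FakeTransformationJob` are hypothetical names, and the fake returns a placeholder string instead of a real Transformation.

```python
def result_if_ready(job):
    """Return the job's result if it has completed, otherwise None."""
    if job.is_complete():
        # silent=True suppresses status updates, per the parameter description.
        return job.result(silent=True)
    return None


class FakeTransformationJob:
    """Hypothetical stand-in for a TransformationJob; no server is contacted."""

    def __init__(self, complete):
        self._complete = complete

    def is_complete(self):
        return self._complete

    def result(self, silent=False):
        # Placeholder value; the real method returns a Transformation.
        return "transformation-result"


ready = result_if_ready(FakeTransformationJob(complete=True))
pending = result_if_ready(FakeTransformationJob(complete=False))
```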
FitAndTransformation ¶
Interact with fit and transformed data from the Driverless AI server.
fold_column
property
¶
fold_column: str
Column that creates the stratified validation split.
Returns:
-
str
–
test_dataset
property
¶
test_dataset: Dataset | None
Test dataset used for this transformation.
Returns:
-
Dataset | None
–
training_dataset
property
¶
training_dataset: Dataset
Training dataset used for this transformation.
Returns:
-
Dataset
–
validation_dataset
property
¶
validation_dataset: Dataset | None
Validation dataset used for this transformation.
Returns:
-
Dataset | None
–
validation_split_fraction
property
¶
validation_split_fraction: float
Fraction of data used for validation.
Returns:
-
float
–
download_transformed_test_dataset ¶
download_transformed_test_dataset(
dst_dir: str = ".",
dst_file: str | None = None,
file_system: AbstractFileSystem | None = None,
overwrite: bool = False,
timeout: float = 30,
) -> str
Downloads the fit and transformed test dataset in CSV format.
Parameters:
-
dst_dir
(str
, default:'.'
) –The path to the directory where the CSV file will be saved.
-
dst_file
(str | None
, default:None
) –The name of the CSV file (overrides default file name).
-
file_system
(AbstractFileSystem | None
, default:None
) –fsspec-based file system to download to, instead of the local file system.
-
overwrite
(bool
, default:False
) –Overwrite the existing file.
-
timeout
(float
, default:30
) –Connection timeout in seconds.
download_transformed_training_dataset ¶
download_transformed_training_dataset(
dst_dir: str = ".",
dst_file: str | None = None,
file_system: AbstractFileSystem | None = None,
overwrite: bool = False,
timeout: float = 30,
) -> str
Downloads the fit and transformed training dataset in CSV format.
Parameters:
-
dst_dir
(str
, default:'.'
) –The path to the directory where the CSV file will be saved.
-
dst_file
(str | None
, default:None
) –The name of the CSV file (overrides default file name).
-
file_system
(AbstractFileSystem | None
, default:None
) –fsspec-based file system to download to, instead of the local file system.
-
overwrite
(bool
, default:False
) –Overwrite the existing file.
-
timeout
(float
, default:30
) –Connection timeout in seconds.
download_transformed_validation_dataset ¶
download_transformed_validation_dataset(
dst_dir: str = ".",
dst_file: str | None = None,
file_system: AbstractFileSystem | None = None,
overwrite: bool = False,
timeout: float = 30,
) -> str
Downloads the fit and transformed validation dataset in CSV format.
Parameters:
-
dst_dir
(str
, default:'.'
) –The path to the directory where the CSV file will be saved.
-
dst_file
(str | None
, default:None
) –The name of the CSV file (overrides default file name).
-
file_system
(AbstractFileSystem | None
, default:None
) –fsspec-based file system to download to, instead of the local file system.
-
overwrite
(bool
, default:False
) –Overwrite the existing file.
-
timeout
(float
, default:30
) –Connection timeout in seconds.
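Since `validation_dataset` and `test_dataset` are typed `Dataset | None`, a caller may want to download only the transformed datasets that exist. The sketch below assumes that skipping a download when the corresponding property is `None` is the desired behavior; the reference does not say what the download methods do in that case. `download_transformed_datasets` and `FakeFitAndTransformation` are hypothetical, and the fake returns made-up paths without writing files.

```python
def download_transformed_datasets(fit_and_transformation, dst_dir="."):
    """Download every available transformed dataset; return a dict of results."""
    fat = fit_and_transformation
    downloads = {
        "training": fat.download_transformed_training_dataset(dst_dir=dst_dir)
    }
    # Optional datasets: only request them when they were provided.
    if fat.validation_dataset is not None:
        downloads["validation"] = fat.download_transformed_validation_dataset(dst_dir=dst_dir)
    if fat.test_dataset is not None:
        downloads["test"] = fat.download_transformed_test_dataset(dst_dir=dst_dir)
    return downloads


class FakeFitAndTransformation:
    """Hypothetical stand-in exposing the documented properties and methods."""

    def __init__(self, validation_dataset=None, test_dataset=None):
        self.validation_dataset = validation_dataset
        self.test_dataset = test_dataset

    def download_transformed_training_dataset(self, dst_dir=".", **kwargs):
        return f"{dst_dir}/train.csv"

    def download_transformed_validation_dataset(self, dst_dir=".", **kwargs):
        return f"{dst_dir}/validation.csv"

    def download_transformed_test_dataset(self, dst_dir=".", **kwargs):
        return f"{dst_dir}/test.csv"


only_train = download_transformed_datasets(FakeFitAndTransformation(), dst_dir="out")
with_test = download_transformed_datasets(
    FakeFitAndTransformation(test_dataset="placeholder dataset"), dst_dir="out"
)
```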
FitAndTransformationJob ¶
is_complete ¶
is_complete() -> bool
Whether the job has been completed successfully.
Returns:
-
bool
–True if the job has been completed successfully, otherwise False.
is_running ¶
is_running() -> bool
Whether the job has been scheduled or is running, finishing, or syncing.
Returns:
-
bool
–True if the job has not completed yet, otherwise False.
result ¶
result(silent: bool = False) -> FitAndTransformation
Waits for the job to complete, then returns the resulting FitAndTransformation.
Parameters:
-
silent
(bool
, default:False
) –If True, do not display status updates.