Driverless AI provides a Project Workspace for managing datasets and experiments related to a specific business problem or use case. Whether you are trying to detect fraud or predict user retention, you can keep the relevant datasets and experiments together in individual projects. A Leaderboard on the Projects page lets you compare performance and results and identify the best solution for your problem.
The following sections describe how to create and manage projects.
For information on remote storage and importing datasets and experiments through the Projects page, see H2O Storage (remote storage) integration.
For information on how to export Driverless AI experiments to H2O MLOps from the Projects page, see the official H2O MLOps documentation.
Projects listing page options¶
The following is a list of options that are available from the Projects listing page.
Open: Open the Project page for the project.
Rename: Rename the project.
Edit description: Edit the description for the project.
Share: Share the project with other users. For more information, see Sharing With Other Users.
Go to MLOps: View the project in the MLOps Wave app.
Delete: Delete the project.
Creating a Project Workspace¶
To create a Project Workspace:
Click the Projects option on the top menu. The Projects listing page is displayed.
Click New Project.
Specify a name for the project and provide a description.
Click Create Project. This creates an empty Project page.
From the Project page, you can link datasets and/or experiments, run new experiments, and score experiments on a scoring dataset. When you link an existing experiment to a Project, the datasets used for the experiment are automatically linked to the project (if not already linked).
When attempting to solve a business problem, a typical workflow involves running multiple experiments, either with different or new data or with a variety of settings. The optimal solution can vary across users and business problems. For some users, the best model is the one with the highest accuracy on validation and test data. Other users might accept somewhat lower accuracy in exchange for better performance (faster prediction). For others, what matters is how quickly a model can be trained to an acceptable level of accuracy. The Experiments list lets you find the best solution for your business problem.
The list is organized by experiment name. To change the sort order, click the up/down arrows beside a column heading in the experiment menu.
Hover over the right menu of an experiment to view additional information about the experiment, including the problem type, datasets used, and the target column.
Finished experiments linked to the project show their validation and test scores. You can also score experiments on other datasets. To do this, first add a dataset by clicking the Link Dataset button and choosing Testing from the drop-down menu. After the test dataset has been added, click the Score on Scoring Data button, and then choose the experiment(s) that you want to score and the test dataset to score them on. This triggers a diagnostics job, and its results are available on the diagnostics page. (Refer to Diagnosing a Model for more information.) After the scoring process has completed, the result appears in the Score and Scoring Time columns. The Score column shows results for the scorer specified by the Show Results for Scorer picker.
If an experiment has already been scored on a dataset, Driverless AI cannot score it again. The scoring step is deterministic, so for a particular test dataset and experiment combination, the score is the same regardless of how many times you repeat it.
The test dataset must have all the columns that are expected by the various experiments you are scoring it on. However, the columns of the test dataset need not be exactly the same as input features expected by the experiment. There can be additional columns in the test dataset. If these columns were not used for training, they will be ignored. This feature gives you the ability to train experiments on different training datasets (i.e., having different features), and if you have an “uber test dataset” that includes all these feature columns, then you can use the same dataset to score these experiments.
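The column requirement above can be checked before you start a scoring job. The following is a minimal sketch in plain Python, not part of Driverless AI: the function name `missing_columns`, the experiment names, and the feature lists are all hypothetical and chosen only for illustration.

```python
def missing_columns(test_columns, experiment_features):
    """Return, per experiment, the expected features absent from the test dataset."""
    available = set(test_columns)
    return {
        name: sorted(set(features) - available)
        for name, features in experiment_features.items()
    }


# Hypothetical feature lists for two experiments trained on different datasets.
experiment_features = {
    "exp_a": ["age", "income"],
    "exp_b": ["age", "tenure", "region"],
}

# An "uber test dataset" may carry extra columns (here, customer_id);
# columns that an experiment did not train on do not count against it.
uber_test_columns = ["age", "income", "tenure", "region", "customer_id"]

report = missing_columns(uber_test_columns, experiment_features)
for name, missing in report.items():
    status = "ok" if not missing else f"missing {missing}"
    print(f"{name}: {status}")
```

An empty list for an experiment means the test dataset covers every feature that experiment expects, so the same dataset can be used to score it.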
A Test Time column is available in the Experiments Leaderboard. This value shows the total time (in seconds) that it took to calculate the experiment scores for all applicable scorers for the experiment type. This is valuable to users who need to estimate the runtime performance of an experiment.
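If you copy leaderboard values out of the UI, the trade-off between score and test time can be expressed as a simple selection rule. The following is a minimal sketch under stated assumptions: the record layout (`name`, `score`, `test_time`), the helper `fastest_within_tolerance`, and the numbers are made up for illustration and do not come from Driverless AI.

```python
def fastest_within_tolerance(experiments, tolerance, higher_is_better=True):
    """Pick the experiment with the lowest test time among those whose
    score is within `tolerance` of the best score."""
    scores = [e["score"] for e in experiments]
    best = max(scores) if higher_is_better else min(scores)
    candidates = [e for e in experiments if abs(e["score"] - best) <= tolerance]
    return min(candidates, key=lambda e: e["test_time"])


# Illustrative values only: score for a higher-is-better scorer,
# test_time in seconds.
leaderboard = [
    {"name": "exp_a", "score": 0.91, "test_time": 12.0},
    {"name": "exp_b", "score": 0.90, "test_time": 3.5},
    {"name": "exp_c", "score": 0.84, "test_time": 1.2},
]

choice = fastest_within_tolerance(leaderboard, tolerance=0.02)
print(choice["name"])
```

Setting `tolerance` to 0 reduces this to picking the top-scoring experiment. For scorers where lower is better (for example, an error metric), pass `higher_is_better=False`.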
You can compare two or three experiments and view side-by-side detailed information about each.
Select either two or three experiments that you want to compare. You cannot compare more than three experiments.
Click the Compare n Items button.
This opens the Compare Experiments page. This page includes the experiment summary and metric plots for each experiment. The metric plots vary depending on whether this is a classification or regression experiment.
For classification experiments, this page includes:
Variable Importance list
Precision Recall Curve
For regression experiments, this page includes:
Variable Importance list
Actual vs. Predicted Graph
The following steps describe how to add tags to experiments that have been linked to a Project.
To use the tagging functionality, the experiment needs to be linked to H2O Storage (remote storage).
An experiment can only be tagged after it has completed.
Right-click an experiment listed on the Project page, and then click Manage Tags. The Tags panel is displayed.
Click the Add New Tag button.
Enter the tag name and click the Save button. The new tag appears in the list of Added Tags in the Tags panel.
Note that tags that exist within the Project but have not been applied to a given experiment are listed in the Tags panel as Project Tags. This means that you can add existing tags to different experiments without having to add them multiple times.
To delete a project, click the Projects option on the top menu to open the main Projects page. Click the dotted menu in the right-most column, and then select Delete. You will be prompted to confirm the deletion.
Note that deleting projects does not delete datasets and experiments from Driverless AI. Any datasets and experiments from deleted projects will still be available on the Datasets and Experiments pages.
Leaderboard Wizard: Business value calculator¶
From the Project page, you can access a business value calculator wizard by clicking the Analyze Results button. This wizard makes it simple to perform a business value analysis for all models in a given project. Note that this feature is only supported for classification experiments.
The Leaderboard Wizard lets you assign the business value of correctly and incorrectly predicted outcomes. By default, a correct classification is worth 1, and an incorrect classification is worth -1 (in arbitrary units), but you can edit these values as needed.
In a leaderboard of experiments, the experiments are ranked by the net business value they provide on the test set.
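To make the ranking concrete, the following sketch computes a net business value from predicted and actual labels and sorts experiments by it. It is an illustration only, not the wizard's implementation: it assumes a single value for every correct prediction and a single value for every incorrect one (the defaults of 1 and -1 described above), and the function names, experiment names, and labels are hypothetical.

```python
def net_business_value(y_true, y_pred, correct_value=1.0, incorrect_value=-1.0):
    """Sum the value of correct and incorrect predictions on a test set."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    incorrect = len(y_true) - correct
    return correct * correct_value + incorrect * incorrect_value


def rank_by_value(y_true, predictions, **values):
    """Return (experiment, net value) pairs, highest net value first."""
    scored = {
        name: net_business_value(y_true, y_pred, **values)
        for name, y_pred in predictions.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Made-up test labels and predicted labels for two experiments.
y_true = [1, 0, 1, 1, 0]
predictions = {
    "exp_a": [1, 0, 1, 0, 0],  # 4 correct, 1 incorrect
    "exp_b": [1, 1, 0, 0, 0],  # 2 correct, 3 incorrect
}

ranking = rank_by_value(y_true, predictions)
for name, value in ranking:
    print(f"{name}: {value}")
```

Changing `correct_value` or `incorrect_value` (for example, `rank_by_value(y_true, predictions, incorrect_value=-5.0)`) shows how costly mistakes shift the net value of each experiment.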