Steam organizes machine learning work into projects. Whether you are trying to detect fraud or predict user retention, the datasets, models, and test results are stored in individual projects. All Steam users within your environment can access these projects and the files within them.
The Steam Projects page includes additional subnavigation items for Models, Deployment, Configuration, and Collaborators. Each of these pages is described in later sections.
Note: You can use the trashcan icon to delete a project, but you cannot delete projects that include a model. Delete the model first, then delete the project.
Creating a Project¶
Before you can create a project, be sure that H2O is running on an available cluster.
- To start a new project, click the Start A New Project button on the Welcome page.
- When you first log in to Steam, the list of clusters will be empty. Enter the IP address of the cluster that is running H2O, then click Connect.
- Once connected, the current list of clusters will immediately populate with the cluster’s information. Click Connect beside this cluster to continue.
- Select an available H2O frame from the Datasets dropdown, then select the Model Category. Note that these dropdowns are automatically populated with information from datasets that are available on the selected cluster. If no datasets are available, then the dropdown lists will be empty. For clusters that contain datasets, after a dataset is selected, a list of corresponding models will display.
- Select the checkbox beside the model(s) to import into the Steam project.
- Specify a name for the project.
- Click Create Project when you are done. Upon successful completion, the Models page will be populated with the model(s) that you added to your project, and the new project will be available on the Projects page. On the Projects page, click on the newly created project. This opens a submenu allowing you to view the imported models, deployed models, and configurations specific to that project. Information about these topics is available in the sections that follow.
The Models page shows a list of all models included in a selected project. This list also includes summary information for each model. This information varies based on whether the model is binomial or regression.
For binomial models, the following values will display on the Models page.
For regression models, the following values will display on the Models page.
You can perform the following actions directly from this page:
Import a new model
View model details and export the model as a .java, .jar, or .war file
Label a model (Refer to Configurations for information on how to create labels.)
Deploy the model
Delete a model. Note that all models in a project must be deleted before you can delete a project.
Note: The Models page lists models in alphabetical order and shows up to five models per page. If your project includes more than five models, use the forward and back arrows at the bottom of the page to view more models.
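The paging behavior described in the note above can be sketched as follows. This is an illustration only, not Steam's implementation: the function names, the sample model names, and the case-insensitive sort are assumptions made for the example.

```python
def paginate_models(model_names, page, page_size=5):
    """Return the models shown on a 1-based page number.

    Models are sorted alphabetically (case-insensitive here, which is an
    assumption) and split into pages of `page_size` entries.
    """
    ordered = sorted(model_names, key=str.lower)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]


def page_count(model_names, page_size=5):
    """Number of pages needed to list every model (at least one page)."""
    return max(1, -(-len(model_names) // page_size))  # ceiling division


# Hypothetical model names for a project with seven models.
models = ["gbm_churn", "glm_churn", "drf_fraud", "dl_fraud",
          "gbm_fraud", "glm_fraud", "nb_churn"]

first_page = paginate_models(models, 1)
second_page = paginate_models(models, 2)
print(first_page)
print(second_page)
print(page_count(models))
```

With seven models, the first page holds five entries and the remaining two appear on the second page.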
After models are added to an H2O cluster, they can be imported into an existing Steam project. In the upper-right corner of the Models page, click the Import Models button. This opens an Import Models popup form.
The Cluster dropdown automatically populates with a list of H2O clusters. Specify the H2O cluster that has the models you want to import, then select the additional model or models that you want to add to the project.
Click Import when you are done. The newly added models will then appear on the Models page.
Viewing Model Details¶
On the Models page, click the view model details link under the Action column for the model that you want to view.
This page provides information about when the model was created, the algorithm and dataset used to create the model, and the response column specified when the model was built. The Goodness of Fit section lists the metric values for the model, including the Mean Squared Error, LogLoss, R^2, AUC, and Gini score. An ROC curve is available for binomial models.
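To make the Goodness of Fit metrics more concrete, the sketch below computes them for a tiny binomial example using standard textbook definitions. It is not Steam's or H2O's code, and their computations may differ in detail; the function names and the sample labels and scores are assumptions for illustration. The Gini score is derived from AUC as 2 × AUC − 1.

```python
import math


def mse(actual, predicted):
    """Mean Squared Error: average squared difference."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """R^2: 1 minus residual sum of squares over total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def log_loss(actual, predicted, eps=1e-15):
    """Binary LogLoss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for a, p in zip(actual, predicted):
        p = min(max(p, eps), 1 - eps)
        total += a * math.log(p) + (1 - a) * math.log(1 - p)
    return -total / len(actual)


def auc(actual, predicted):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [p for a, p in zip(actual, predicted) if a == 1]
    neg = [p for a, p in zip(actual, predicted) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(actual, predicted):
    """Gini score derived from AUC."""
    return 2 * auc(actual, predicted) - 1


# Hypothetical labels and predicted probabilities of the positive class.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print("MSE:    ", mse(labels, scores))
print("R^2:    ", r_squared(labels, scores))
print("LogLoss:", log_loss(labels, scores))
print("AUC:    ", auc(labels, scores))
print("Gini:   ", gini(labels, scores))
```

In this example three of the four positive/negative pairs are ranked correctly, so the AUC is 0.75 and the Gini score is 0.5.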
From this page, you can perform the following actions:
- While viewing model details, click the Compared To field. This opens a popup showing all models available in the current project.
- Select any available model to compare with the current model. This example compares a GLM model with a GBM model. Once a model is selected, the Model Details page immediately populates with the comparison information. The current model values are displayed in blue, and the selected comparison model values are displayed in orange.
Deploying a Model¶
After comparing models, you might decide to deploy one or more of the best models. Perform the steps below to deploy a model.
- While viewing the model details, click the Deploy Model button. (Note that this can also be done directly from the Models page by selecting the deploy model link in the Action column.)
- Specify a service name for the deployment.
- To perform preprocessing on the model, specify a Preprocessing Script. Note that this dropdown is populated with scripts that are added to the project. Information about adding preprocessing scripts is available in the Deployment section.
- Click Deploy when you are done.
- Upon successful completion, a scoring service will be created for this deployed model. Click the Deployment menu option on the left navigation to go to the Deployment page. Refer to the Deployment section for more information.
Exporting a Model¶
Steam allows you to export models to your local machine.
- While viewing the model details, click the Export Model button.
- Specify whether to export the model as a .java, .jar, or .war file.
- To perform preprocessing on the model during the export, specify a Preprocessing Script. Note that this dropdown is populated with scripts that are added to the project. Information about adding preprocessing scripts is available in the Deployment section.
- Click Download when you are done.
The Deployment page lists all available deployed services. For each deployed service, this page shows the model name, model ID, and the status. You can stop a running service by clicking the Stop Service button.
In addition to showing deployed services, a Packaging tab is available showing the preprocessing packages used in the deployment.
Uploading a New Package¶
Preprocessing packages can be used to perform additional data munging on an existing model.
- To upload a new preprocessing package, click the Upload New Package button in the upper-right corner of the Deployment page.
- Specify the main Python file that will be used for preprocessing. Click on the folder link to browse for this file.
- Specify additional files that may be dependencies of the main Python preprocessing file.
- If you are running in a conda environment, you can select a .yaml file that defines the environment.
- Enter a name for this new package.
- Click Upload when you are finished.
Upon successful completion, the new preprocessing package will display on the Packages tab of the Deployment page. This file can then be specified when deploying or exporting models. (Refer to Deploying a Model or Exporting a Model.)
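As an illustration of the kind of data munging a main Python preprocessing file might perform, here is a minimal sketch. The `preprocess` function name, its dict-in/dict-out shape, and the sample column names are all hypothetical; this section does not describe the interface that Steam expects from a preprocessing file, so treat this purely as an example of cleaning logic.

```python
def preprocess(record):
    """Clean one raw input record (a dict of column name -> value).

    Hypothetical entry point, not Steam's documented interface.
    - Blank strings and None become None (missing value).
    - Strings that look numeric are converted to float.
    - Other strings are trimmed and lowercased.
    - Non-string values are passed through unchanged.
    """
    cleaned = {}
    for column, value in record.items():
        if value is None or (isinstance(value, str) and value.strip() == ""):
            cleaned[column] = None
            continue
        if isinstance(value, str):
            text = value.strip()
            try:
                cleaned[column] = float(text)
            except ValueError:
                cleaned[column] = text.lower()
        else:
            cleaned[column] = value
    return cleaned


# Hypothetical raw record with untidy values.
raw = {"age": " 42 ", "plan": " Premium", "region": "", "visits": 7}
result = preprocess(raw)
print(result)
```

Keeping the cleaning logic in one small function makes it straightforward to list any helper modules it imports as additional dependency files when uploading the package.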
- To reach the Steam Prediction Service Builder, click the IP address link listed under the Deployed Services for the model that you have deployed and want to score. Clicking this link opens the Steam Prediction Service Builder. (Refer to the Prediction Service Builder appendix for more information.) The fields that display on the Prediction Service Builder are automatically populated with field information from the deployed model, making it easy for you to make predictions based on any model that you deploy.
- Make predictions by specifying input values based on column data from the original dataset. This automatically populates the fields in the query string. (Note that you can optionally enter input parameters directly in the query string instead of filling in the individual fields.)
- Click Predict when you are done.
- Use the Clear button to clear all entries and begin a new prediction.
- You can optionally open a batch JSON file to perform batch predictions.
- Use the More Stats button to view additional statistics about the scoring service results.
- The Steam Prediction Service is available as a completely standalone utility. Refer to the Prediction Service Builder appendix for more information about using the Prediction Service Builder.
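The way input values map to a query string can be sketched as below. The base URL, the `/predict` path, and the column names are assumptions for illustration; check the link shown under Deployed Services and the Prediction Service Builder appendix for the actual address and endpoint of your scoring service.

```python
from urllib.parse import urlencode


def build_predict_url(base_url, inputs):
    """Build a prediction URL from a dict of column name -> input value.

    Empty inputs are skipped so that only supplied columns appear in the
    query string. The "/predict" path is an assumption for this sketch.
    """
    params = {k: v for k, v in inputs.items() if v is not None and v != ""}
    return f"{base_url}/predict?{urlencode(params)}"


# Hypothetical service address and input columns.
url = build_predict_url(
    "http://localhost:55001",
    {"age": 42, "plan": "premium plus", "region": ""},
)
print(url)
```

`urlencode` handles escaping, so values that contain spaces or other special characters are safe to pass in the query string.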
When maintaining and storing models in Steam, it is useful to know whether the version of a model that you’re viewing is used for testing, development, production, etc. Steam allows admins to set labels (or versioning) for models and to apply permissions for those models using the labels. The Steam admin is responsible for creating new Steam users and setting roles and workgroups for those users. When setting Steam project configurations, labels can be created that allow, for example, only users in a Production workgroup to label a model as a Deployment model.
When a label is applied to a model, the Project Configurations page will show all models associated with a label.
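The relationship between labels, workgroups, and models can be modeled roughly as follows. This is a conceptual sketch, not Steam's data model or API: the `Label` and `User` classes, the `allowed_workgroups` field, and the sample names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    """A project label and the workgroups allowed to apply it (hypothetical)."""
    name: str
    description: str
    allowed_workgroups: set = field(default_factory=set)


@dataclass
class User:
    """A Steam user and the workgroups they belong to (hypothetical)."""
    name: str
    workgroups: set = field(default_factory=set)


def can_apply_label(user, label):
    """A user may apply a label if they share a workgroup with it."""
    return bool(user.workgroups & label.allowed_workgroups)


def models_by_label(assignments):
    """Group models by label, given a dict of model name -> label name."""
    grouped = {}
    for model, label_name in assignments.items():
        grouped.setdefault(label_name, []).append(model)
    return {name: sorted(models) for name, models in grouped.items()}


deploy = Label("deploy", "Models approved for production", {"Production"})
alice = User("alice", {"Production"})
bob = User("bob", {"Research"})

print(can_apply_label(alice, deploy))
print(can_apply_label(bob, deploy))

grouped = models_by_label({"gbm_churn": "deploy", "glm_churn": "test",
                           "drf_fraud": "deploy"})
print(grouped)
```

Here only the user in the Production workgroup can apply the deploy label, and grouping the assignments by label mirrors the per-label model listing described above.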
Creating a New Label¶
- On the Configurations page, click the Create New Label button.
- Enter a unique name for the label, then provide a description.
- Click Save when you are done.
Upon successful completion, the new label will display on the Project Configurations page and can be edited or deleted. This label will also be available on the Models page in the "label as" dropdown. For example, a project with two labels, deploy and test, shows both in the "label as" dropdown.
The Collaborators page shows the users who have been added to the Steam database as well as the Labels Access (permissions) assigned to each user. Currently, users can only be added by the Steam admin using the CLI.