Version: v0.8.4

Tutorial 1C: Scikit-Learn

Overview

This tutorial walks through how to generate an AutoDoc for a model built in Scikit-Learn.

Prerequisites

  • Knowledge of Scikit-Learn

Step 1: Scikit-Learn model

To generate an AutoDoc for a supervised learning model built in Scikit-Learn, you first need to export the model to a file. H2O AutoDoc requires the Scikit-Learn model to be a .pkl file (preferably serialized with Joblib). To learn more, see Model persistence: Python specific serialization.
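As a minimal sketch of this step, the following shows one way to serialize a fitted model to a .pkl file with Joblib. The dataset, the estimator, and the file name `sklearn_model.pkl` are placeholders chosen for illustration and are not prescribed by H2O AutoDoc; substitute your own trained model.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a small supervised model (dataset and estimator are placeholders).
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Serialize the fitted model with Joblib, using a .pkl extension.
# The file name is hypothetical; a temporary directory keeps the example tidy.
model_path = Path(tempfile.mkdtemp()) / "sklearn_model.pkl"
joblib.dump(model, str(model_path))

# Optional sanity check: reload the file and compare predictions.
restored = joblib.load(str(model_path))
same_predictions = bool((restored.predict(X) == model.predict(X)).all())
print(f"Saved model to {model_path}; predictions match: {same_predictions}")
```

Reloading the file before uploading it is a quick way to confirm that the model was serialized correctly.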

Step 2: AutoDoc Settings

  1. In H2O AutoDoc, click Create new AutoDoc.

  2. In the New report list, select From Scikit model.

  3. In the Report name box, enter a name for the AutoDoc (e.g., Scikit-Learn AutoDoc).

  4. To upload your Scikit-Learn model, click Browse....

  5. After uploading your model, click Upload Scikit model.

  6. Click Next: Upload training & validation data.

  7. Click Upload training data.

    Note

    Whenever you prepare the settings for an AutoDoc for a model built in Scikit-Learn, you need to upload the training dataset that was used to build the model.

  8. Click Browse....

  9. After uploading the training dataset, click Upload training data.

  10. In the Select the target column used while training list, select the model's target column.

  11. Click Next: Upload test data.

  12. Click Skip test data.

    Note
    • For the purposes of this tutorial, we do not upload a test dataset for the model built in Scikit-Learn.
    • A test dataset is optional when you generate an AutoDoc for a model built in Scikit-Learn.
    • If you do not provide a test dataset, the AutoDoc (report) will not contain an overview of the validation dataset.
  13. Click Create AutoDoc.
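
The steps above ask for the training dataset and its target column. As a rough sketch, the training data can be written to a single file that keeps the target column alongside the features. This example assumes CSV is an accepted upload format, which this tutorial does not state; the column names and the file name `train.csv` are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy training data; "label" is a hypothetical target column name.
train = pd.DataFrame(
    {
        "feature_a": [0.10, 0.40, 0.35, 0.80],
        "feature_b": [1, 0, 1, 0],
        "label": [0, 1, 0, 1],
    }
)

# Write features and target together so the target column can be
# selected from the file after upload (CSV is an assumption here).
train_path = Path(tempfile.mkdtemp()) / "train.csv"
train.to_csv(train_path, index=False)

# Read the file back to confirm the target column survived the round trip.
reloaded = pd.read_csv(train_path)
print(list(reloaded.columns))
```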
