Driverless AI: Credit Card Demo
This notebook walks through an H2OAI Client workflow for model building and scoring that parallels the Driverless AI workflow.
Here is the Python Client Documentation.
Workflow Steps
Build an Experiment with Python API:
Sign in
Import train & test set/new data
Specify experiment parameters
Launch Experiment
Examine Experiment
Download Predictions
Build an Experiment in Web UI and Access Through Python:
Get pointer to experiment
Score on New Data:
Score on new data with H2OAI model
Model Diagnostics on New Data:
Run model diagnostics on new data with H2OAI model
Run Model Interpretation
Run model interpretation on the raw features
Run Model Interpretation on External Model Predictions
Build Scoring Pipelines
Build Python Scoring Pipeline
Build MOJO Scoring Pipeline
Build an Experiment with Python API
1. Sign In
Import the required modules and log in.
Pass in your credentials through the Client class which creates an authentication token to send to the Driverless AI Server. In plain English: to sign into the Driverless AI web page (which then sends requests to the Driverless Server), instantiate the Client class with your Driverless AI address and login credentials.
[1]:
import driverlessai
import matplotlib.pyplot as plt
import pandas as pd
[2]:
address = 'http://ip_where_driverless_is_running:12345'
username = 'username'
password = 'password'
# Use the same username and password that you use to sign in through the GUI
dai = driverlessai.Client(address=address, username=username, password=password)
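Hardcoding credentials in a notebook is easy to leak. As a minimal sketch, the connection settings can be read from environment variables instead. The variable names `DAI_ADDRESS`, `DAI_USERNAME`, and `DAI_PASSWORD` and the helper `load_credentials` are hypothetical, chosen only for this illustration; the fallback values are the placeholders used in the cell above.

```python
import os


def load_credentials(env=os.environ):
    """Collect connection settings from a mapping of environment variables.

    The variable names are hypothetical; the defaults are the placeholder
    values from the sign-in cell.
    """
    return {
        "address": env.get("DAI_ADDRESS", "http://ip_where_driverless_is_running:12345"),
        "username": env.get("DAI_USERNAME", "username"),
        "password": env.get("DAI_PASSWORD", "password"),
    }


# A plain dict stands in for os.environ so the sketch runs anywhere.
creds = load_credentials({"DAI_USERNAME": "alice"})
print(creds["username"])

# With a reachable server, the settings would be passed on unchanged:
# dai = driverlessai.Client(**creds)
```

The keys of the returned dictionary match the keyword arguments used in the sign-in cell, so the result can be unpacked directly into the `Client` call.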
Equivalent Steps in Driverless: Signing In
2. Upload Datasets
Upload the training and testing datasets. In this example, both files are read from a public S3 bucket.
You can provide a training, validation, and testing dataset for an experiment. The validation and testing datasets are optional. In this example, we will provide only training and testing.
[3]:
train_path = 's3://h2o-public-test-data/smalldata/kaggle/CreditCard/creditcard_train_cat.csv'
test_path = 's3://h2o-public-test-data/smalldata/kaggle/CreditCard/creditcard_test_cat.csv'
train = dai.datasets.create(data=train_path, data_source='s3')
test = dai.datasets.create(data=test_path, data_source='s3')
Complete 100.00% - [4/4] Computing column statistics
Complete 100.00% - [4/4] Computing column statistics
Equivalent Steps in Driverless: Uploading Train & Test CSV Files
3. Set Experiment Parameters
We will now set the parameters of our experiment. Some of the parameters include:
Target Column: The column we are trying to predict.
Dropped Columns: The columns we do not want to use as predictors such as ID columns, columns with data leakage, etc.
Weight Column: The column that indicates the per-row observation weights. If None, each row will have an observation weight of 1.
Fold Column: The column that indicates the fold. If None, the folds will be determined by Driverless AI.
Is Time Series: Whether or not the experiment is a time-series use case.
For information on the experiment settings, refer to the Experiment Settings.
For this example, we will be predicting default payment next month. The parameters that control the experiment process are: accuracy, time, and interpretability. We can use the experiments.preview function to get a sense of what will happen during the experiment.
We will start out by seeing what the experiment will look like with accuracy, time, and interpretability all set to 5.
[4]:
target = "DEFAULT_PAYMENT_NEXT_MONTH"
exp_preview = dai.experiments.preview(
    train_dataset=train,
    task='classification',
    target_column=target,
    enable_gpus=False,
    accuracy=5,
    time=5,
    interpretability=5,
    config_overrides=None,
)
exp_preview
ACCURACY [5/10]:
- Training data size: *23,999 rows, 25 cols*
- Feature evolution: *[Constant, LightGBM, XGBoostGBM]*, *1/4 validation split*
- Final pipeline: *Ensemble (8 models), 4-fold CV*
TIME [5/10]:
- Feature evolution: *4 individuals*, up to *66 iterations*
- Early stopping: After *10* iterations of no improvement
INTERPRETABILITY [5/10]:
- Feature pre-pruning strategy: None
- Monotonicity constraints: disabled
- Feature engineering search space: [CVCatNumEncode, CVTargetEncode, CatOriginal, Cat, ClusterDist, ClusterTE, Frequent, Interactions, NumCatTE, NumToCatTE, NumToCatWoE, Original, TextLinModel, Text, TruncSVDNum, WeightOfEvidence]
[Constant, LightGBM, XGBoostGBM] models to train:
- Model and feature tuning: *16*
- Feature evolution: *104*
- Final pipeline: *8*
Estimated runtime: *minutes*
Auto-click Finish/Abort if not done in: *1 day*/*7 days*
With these settings, the Driverless AI experiment will train about 128 models:
16 for model and feature tuning
104 for feature evolution
8 for the final pipeline
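The total follows directly from the three counts reported in the preview output. A short sketch of the arithmetic, using only the numbers printed above (the dictionary name `preview_counts` is just an illustration):

```python
# Model counts copied from the preview output above.
preview_counts = {
    "model and feature tuning": 16,
    "feature evolution": 104,
    "final pipeline": 8,
}

total_models = sum(preview_counts.values())

# Show how much of the training budget each stage accounts for.
for stage, count in preview_counts.items():
    print(f"{stage}: {count} models ({count / total_models:.0%})")
print(f"total: {total_models} models")
```

The counts sum to 128, and feature evolution accounts for most of the models trained.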
When we start the experiment, we can either:
specify parameters
use Driverless AI to suggest parameters (occurs automatically behind the scenes)
For this dataset, Driverless AI suggests accuracy = 5, time = 5, and interpretability = 5, with AUC as the scorer (the default scorer for binary classification problems).
Equivalent Steps in Driverless: Set the Knobs, Configuration & Launch
4. Launch Experiment: Feature Engineering + Final Model Training
We can launch the experiment with the suggested parameters or create our own.
[5]:
ex = dai.experiments.create(
    train_dataset=train,
    test_dataset=test,
    target_column=target,
    task='classification',
    accuracy=5,
    time=5,
    interpretability=5,
    scorer="AUC",
    enable_gpus=True,
    seed=1234,
    cols_to_drop=['ID'],
)
Experiment launched at: http://localhost:12345/#experiment?key=5fa1d9b2-f2a2-11ea-8ad5-0242ac110002
Complete 100.00% - Status: Complete
Equivalent Steps in Driverless: Launch Experiment
5. Examine Experiment
View the final model score for the validation and test datasets. When feature engineering is complete, an ensemble model can be built depending on the accuracy setting. The experiment object also contains the score on the validation and test data for this ensemble model. In this case, the validation score is the score on the training cross-validation predictions.
[6]:
metrics = ex.metrics()
print("Final model Score on Validation Data: " + str(round(metrics['val_score'], 3)))
Final model Score on Validation Data: 0.779
6. Download Results
Once an experiment is complete, the UI offers the following downloads:
predictions
on the (holdout) train data
on the test data
experiment summary - summary of the experiment including feature importance
We will show an example of downloading the test predictions below. Note that equivalent commands can also be run for downloading the train (holdout) predictions.
[7]:
ex.artifacts.download(only=['test_predictions'], dst_dir='', overwrite=True)
Downloaded 'test_preds.csv'
[7]:
{'test_predictions': 'test_preds.csv'}
[8]:
test_preds = pd.read_csv("./test_preds.csv")
test_preds.head()
[8]:
| | DEFAULT_PAYMENT_NEXT_MONTH.0 | DEFAULT_PAYMENT_NEXT_MONTH.1 |
---|---|---|
0 | 0.719911 | 0.280089 |
1 | 0.785363 | 0.214637 |
2 | 0.797846 | 0.202154 |
3 | 0.751900 | 0.248100 |
4 | 0.784983 | 0.215017 |
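The two columns are the predicted probabilities of class 0 and class 1, so turning them into class labels requires a cutoff. The sketch below applies one to the five rows shown above. The threshold of 0.25 and the helper `label_rows` are hypothetical and would need to be chosen for the use case; the standard library is used instead of pandas only so that the sketch runs without any extra dependencies.

```python
import csv
import io

# First five rows of the downloaded predictions, copied from the output above.
sample_csv = """DEFAULT_PAYMENT_NEXT_MONTH.0,DEFAULT_PAYMENT_NEXT_MONTH.1
0.719911,0.280089
0.785363,0.214637
0.797846,0.202154
0.751900,0.248100
0.784983,0.215017
"""

THRESHOLD = 0.25  # hypothetical cutoff, not a value recommended by Driverless AI


def label_rows(csv_text, positive_col, threshold):
    """Return 1 where the positive-class probability reaches the threshold, else 0."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [int(float(row[positive_col]) >= threshold) for row in reader]


labels = label_rows(sample_csv, "DEFAULT_PAYMENT_NEXT_MONTH.1", THRESHOLD)
print(labels)
```

With this cutoff only the first record is labeled as a predicted default; lowering the threshold would flag more records.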
Build an Experiment in Web UI and Access Through Python
It is also possible to use the Python API to examine an experiment that was started through the Web UI using the experiment key.
1. Get pointer to experiment
You can get a pointer to the experiment by referencing the experiment key from the Web UI.
[9]:
# Get list of experiments
experiment_list = dai.experiments.list()
experiment_list
[9]:
[<class 'driverlessai._experiments.Experiment'> 5fa1d9b2-f2a2-11ea-8ad5-0242ac110002 kupedapa]
[10]:
# Get pointer to experiment
exp = experiment_list[0]
exp
[10]:
<class 'driverlessai._experiments.Experiment'> 5fa1d9b2-f2a2-11ea-8ad5-0242ac110002 kupedapa
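Indexing the list with `[0]` only works when the experiment of interest happens to be first. As a sketch, the key can be taken from the experiment URL shown in the Web UI (the same form as the URL printed at launch) and matched against the list. The helpers `key_from_experiment_url` and `find_by_key` are hypothetical, and the stand-in objects below assume that real experiment objects expose their key as a `.key` attribute.

```python
from types import SimpleNamespace
from urllib.parse import parse_qs, urlsplit


def key_from_experiment_url(url):
    """Extract the experiment key from a URL like '.../#experiment?key=<key>'."""
    fragment = urlsplit(url).fragment  # 'experiment?key=<key>'
    _, _, query = fragment.partition("?")
    return parse_qs(query)["key"][0]


def find_by_key(experiments, key):
    """Return the first object whose .key matches, or None if there is no match."""
    return next((e for e in experiments if e.key == key), None)


url = "http://localhost:12345/#experiment?key=5fa1d9b2-f2a2-11ea-8ad5-0242ac110002"
key = key_from_experiment_url(url)

# Stand-ins for the objects returned by dai.experiments.list(); the .key
# attribute on real experiment objects is an assumption of this sketch.
experiment_list = [
    SimpleNamespace(key="00000000-0000-0000-0000-000000000000"),
    SimpleNamespace(key=key),
]
match = find_by_key(experiment_list, key)
print(key)
```

Returning `None` for a missing key lets the caller decide whether that is an error, for example when the experiment belongs to a different user.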
Score on New Data
You can use the Python API to score on new data. This is equivalent to the SCORE ON ANOTHER DATASET button in the Web UI. The example below scores on the test data and then downloads the predictions.
Pass in any dataset that has the same columns as the original training set. If you passed a test set during the H2OAI model building step, the predictions already exist.
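Before uploading new data, it can help to confirm locally that it has the columns the model was trained on. The following is a minimal sketch that compares CSV header rows; the helper `missing_columns` is hypothetical, and the headers list only a few of the training columns for brevity.

```python
import csv
import io


def missing_columns(train_csv, new_csv, target):
    """List training columns (other than the target) that are absent from the new data."""
    train_cols = next(csv.reader(io.StringIO(train_csv)))
    new_cols = set(next(csv.reader(io.StringIO(new_csv))))
    return [col for col in train_cols if col != target and col not in new_cols]


# Abbreviated headers for illustration; the real files contain more columns.
train_header = "ID,LIMIT_BAL,AGE,PAY_1,DEFAULT_PAYMENT_NEXT_MONTH\n"
new_header = "ID,LIMIT_BAL,PAY_1\n"

missing = missing_columns(train_header, new_header, "DEFAULT_PAYMENT_NEXT_MONTH")
print(missing)
```

The target column is excluded from the check because new data to be scored is typically unlabeled.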
1. Score Using the H2OAI Model
The following shows the predicted probability of default for each record in the test set.
[11]:
# Get unlabeled data to make predictions on
data = dai.datasets.create(
    data='s3://h2o-public-test-data/smalldata/kaggle/CreditCard/creditcard_test_cat.csv',
    data_source='s3',
    force=True,
)
target = "DEFAULT_PAYMENT_NEXT_MONTH"
prediction = ex.predict(data, [target])
pred_path = prediction.download('.')
pred_table = pd.read_csv(pred_path)
pred_table.head()
Complete 100.00% - [4/4] Computing column statistics
Complete
Downloaded './5fa1d9b2-f2a2-11ea-8ad5-0242ac110002_preds_42bcaa4d.csv'
[11]:
| | DEFAULT_PAYMENT_NEXT_MONTH.0 | DEFAULT_PAYMENT_NEXT_MONTH.1 |
---|---|---|
0 | 0.719911 | 0.280089 |
1 | 0.785363 | 0.214637 |
2 | 0.797846 | 0.202154 |
3 | 0.751900 | 0.248100 |
4 | 0.784983 | 0.215017 |
We can also get the contribution each feature made to the final prediction by downloading the prediction data with Shapley values included. This gives us an idea of how each feature affects the predictions.
[12]:
prediction = ex.predict(data, [target], include_shap_values=True)
pred_contributions_path = prediction.download('.')
pred_contributions_table = pd.read_csv(pred_contributions_path)
pred_contributions_table.head()
Complete
Downloaded '5fa1d9b2-f2a2-11ea-8ad5-0242ac110002_preds_e454e06b.csv'
[12]:
| | DEFAULT_PAYMENT_NEXT_MONTH.0 | DEFAULT_PAYMENT_NEXT_MONTH.1 | contrib_0_AGE | contrib_10_PAY_3 | contrib_11_PAY_4 | contrib_12_PAY_5 | contrib_13_PAY_6 | contrib_14_PAY_AMT1 | contrib_15_PAY_AMT2 | contrib_16_PAY_AMT3 | ... | contrib_45_Freq:EDUCATION:MARRIAGE:SEX | contrib_46_Freq:MARRIAGE:SEX | contrib_47_ClusterTE:ClusterID10:LIMIT_BAL:PAY_AMT3.0 | contrib_4_BILL_AMT4 | contrib_5_BILL_AMT5 | contrib_6_BILL_AMT6 | contrib_7_LIMIT_BAL | contrib_8_PAY_1 | contrib_9_PAY_2 | contrib_bias |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.719911 | 0.280089 | -0.007820 | 0.001811 | -0.027411 | -0.035804 | -0.016744 | -0.002835 | -0.002294 | 0.000895 | ... | 0.003506 | -0.008401 | -0.017656 | -0.001394 | -0.004778 | -0.016793 | 0.000000 | 0.310759 | 0.000000 | -1.329072 |
1 | 0.785363 | 0.214637 | 0.006818 | -0.009067 | -0.014888 | -0.015980 | -0.010554 | 0.032912 | -0.012962 | -0.061771 | ... | 0.000750 | -0.018531 | -0.007953 | -0.001594 | 0.002134 | 0.005670 | 0.007705 | -0.035739 | -0.012201 | -1.329072 |
2 | 0.797846 | 0.202154 | -0.014775 | -0.007831 | -0.016271 | -0.011541 | -0.003509 | 0.007780 | -0.060321 | -0.037985 | ... | -0.002463 | -0.011457 | 0.009479 | -0.005542 | -0.000527 | -0.010674 | -0.007385 | -0.034064 | -0.010538 | -1.329072 |
3 | 0.751900 | 0.248100 | -0.048815 | 0.003496 | 0.008609 | 0.044685 | 0.038594 | -0.004914 | -0.047627 | 0.074311 | ... | -0.228055 | -0.015196 | 0.076722 | -0.007128 | -0.007387 | 0.042457 | 0.000000 | 0.329534 | 0.000000 | -1.329072 |
4 | 0.784983 | 0.215017 | -0.002933 | -0.010469 | -0.059411 | -0.015114 | -0.008622 | 0.045974 | 0.041436 | 0.123106 | ... | 0.024956 | -0.015611 | -0.001108 | -0.002593 | 0.023487 | 0.000126 | 0.000000 | -0.035263 | -0.014483 | -1.329072 |
5 rows × 70 columns
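The `contrib_bias` value is the same for every row, and the contribution values are not bounded like probabilities. One plausible reading, which is an assumption here and not something this notebook states, is that contributions are expressed on the log-odds (logit) scale. Under that assumption, the sketch below converts the bias into a baseline probability and works out how much the features of the first record add in total.

```python
import math


def sigmoid(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Map a probability to its log-odds."""
    return math.log(p / (1.0 - p))


# Values copied from the first row of the table above.
bias = -1.329072
p_default = 0.280089

# ASSUMPTION: contributions are in log-odds space, so the bias alone
# corresponds to a baseline probability of default.
baseline_probability = sigmoid(bias)

# Under the same assumption, the feature contributions of this record
# would sum to the gap between its log-odds and the bias.
row_log_odds = logit(p_default)
implied_feature_total = row_log_odds - bias

print(f"baseline probability: {baseline_probability:.3f}")
print(f"implied sum of feature contributions: {implied_feature_total:.3f}")
```

If the assumption holds, summing all `contrib_*` columns of a row (including `contrib_bias`) and applying `sigmoid` should reproduce that row's `DEFAULT_PAYMENT_NEXT_MONTH.1` value, which is a quick way to check it against the full downloaded file.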
We will examine the contributions for our first record more closely.
[13]:
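# Take the first record and drop its first column. Note that [1:] keeps
# DEFAULT_PAYMENT_NEXT_MONTH.1, which is a predicted probability, not a contribution.
contrib = pd.DataFrame(pred_contributions_table.iloc[0][1:])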
contrib.columns = ["contribution"]
contrib["abs_contribution"] = contrib.contribution.abs()
contrib.sort_values(by="abs_contribution", ascending=False)[["contribution"]].head()
[13]:
| | contribution |
---|---|
contrib_bias | -1.329072 |
contrib_34_ClusterTE:ClusterID10:PAY_1:PAY_3:PAY_5:PAY_AMT1:PAY_AMT5.0 | 0.912805 |
contrib_8_PAY_1 | 0.310759 |
contrib_36_ClusterTE:ClusterID20:LIMIT_BAL:PAY_1.0 | 0.289717 |
DEFAULT_PAYMENT_NEXT_MONTH.1 | 0.280089 |
The clusters built from this customer’s PAY_1, PAY_3, PAY_5, and LIMIT_BAL values had the greatest impact on their prediction. Since these contributions are positive, we know that they increase the probability that the customer will default.
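The ranking above also lists `contrib_bias`, which is identical for every record, and `DEFAULT_PAYMENT_NEXT_MONTH.1`, which is a probability. A sketch of a ranking restricted to per-feature contributions follows. The helper `rank_contributions` is hypothetical, and `row` holds only a handful of the 70 columns, with values copied from the outputs above.

```python
# A few values from the first record, copied from the outputs above.
row = {
    "DEFAULT_PAYMENT_NEXT_MONTH.0": 0.719911,
    "DEFAULT_PAYMENT_NEXT_MONTH.1": 0.280089,
    "contrib_0_AGE": -0.007820,
    "contrib_8_PAY_1": 0.310759,
    "contrib_34_ClusterTE:ClusterID10:PAY_1:PAY_3:PAY_5:PAY_AMT1:PAY_AMT5.0": 0.912805,
    "contrib_36_ClusterTE:ClusterID20:LIMIT_BAL:PAY_1.0": 0.289717,
    "contrib_bias": -1.329072,
}


def rank_contributions(record, top_n=3):
    """Rank per-feature contributions by absolute value.

    Probability columns and the constant bias term are excluded.
    """
    features = {
        name: value
        for name, value in record.items()
        if name.startswith("contrib_") and name != "contrib_bias"
    }
    ranked = sorted(features.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:top_n]


top = rank_contributions(row)
for name, value in top:
    print(f"{name}: {value:+.6f}")
```

Sorting by absolute value keeps strongly negative contributions near the top as well, which matters for records where a feature lowers the predicted probability.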
Build Scoring Pipelines
In our last section, we will build the scoring pipelines from our experiment. There are two scoring pipeline options:
Python Scoring Pipeline: requires Python runtime
MOJO Scoring Pipeline: requires Java runtime
Documentation on the scoring pipelines is provided here: http://docs.h2o.ai/driverless-ai/latest-stable/docs/userguide/python-mojo-pipelines.html.
The experiment screen shows two scoring pipeline buttons: Download Python Scoring Pipeline or Build MOJO Scoring Pipeline. Driverless AI determines if any scoring pipeline should be automatically built based on the config.toml file.
1. Build Python Scoring Pipeline
We can build the Python Scoring Pipeline using ExperimentArtifacts. The experiment artifacts can be accessed after the successful completion of an experiment.
[14]:
# Build the Python Scoring Pipeline
ex.artifacts.create('python_pipeline')
Building Python scoring pipeline...
Now we will download the scoring pipeline zip file.
[15]:
ex.artifacts.download(
    only='python_pipeline',
    dst_dir='.',
    overwrite=True,
)
Downloaded './scorer.zip'
[15]:
{'python_pipeline': './scorer.zip'}
2. Build MOJO Scoring Pipeline
We can build the MOJO Scoring Pipeline using the Python client. This is equivalent to selecting Build MOJO Scoring Pipeline on the experiment screen.
[16]:
# Build the MOJO Scoring Pipeline
ex.artifacts.create('mojo_pipeline')
Building MOJO pipeline...
Now we can download the scoring pipeline zip file.
[17]:
ex.artifacts.download(only='mojo_pipeline', dst_dir='.', overwrite=True)
Downloaded './mojo.zip'
[17]:
{'mojo_pipeline': './mojo.zip'}
Once the MOJO Scoring Pipeline is built, the Build MOJO Scoring Pipeline button on the experiment screen changes to Download MOJO Scoring Pipeline.
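After downloading a pipeline archive, a quick integrity check and listing can confirm that the file arrived intact. The sketch below uses Python's standard `zipfile` module. The helper `list_archive` is hypothetical, and because the real contents of `mojo.zip` are not shown in this notebook, a small stand-in archive with a placeholder file is created for the demonstration.

```python
import tempfile
import zipfile
from pathlib import Path


def list_archive(path):
    """Verify the archive's checksums and return the names of its members."""
    with zipfile.ZipFile(path) as archive:
        corrupt_member = archive.testzip()  # None when every member passes its CRC check
        if corrupt_member is not None:
            raise ValueError(f"corrupt member in archive: {corrupt_member}")
        return archive.namelist()


with tempfile.TemporaryDirectory() as tmp_dir:
    # Stand-in for the downloaded file; the member name is a placeholder,
    # not the actual layout of a MOJO scoring pipeline.
    stand_in = Path(tmp_dir) / "mojo.zip"
    with zipfile.ZipFile(stand_in, "w") as archive:
        archive.writestr("placeholder.txt", "stand-in content")

    names = list_archive(stand_in)

print(names)
```

For a real download, the path returned by `ex.artifacts.download` (for example `'./mojo.zip'` in the output above) would be passed to the helper instead of the stand-in.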