Use a Custom Transformer
First, we'll initialize a client with our server credentials and store it in the variable dai.
In [1]:
import driverlessai
dai = driverlessai.Client(address='http://localhost:12345', username='py', password='py')
Here we grab a custom recipe from our recipe repo (https://github.com/h2oai/driverlessai-recipes) and upload it to the Driverless AI server.
In [23]:
dai.recipes.create('https://github.com/h2oai/driverlessai-recipes/blob/master/transformers/numeric/boxcox_transformer.py')
Complete 100%
It's also possible to use the same dai.recipes.create() function to upload recipes that we have written locally.
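For reference, a local recipe like sum.py could look roughly like the sketch below. This is illustrative only, not the exact file uploaded here; it assumes the CustomTransformer recipe API used by the examples in the driverlessai-recipes repository and simply sums the selected numeric columns row-wise.

# sum.py -- illustrative sketch of a local custom transformer recipe
from h2oaicore.transformer_utils import CustomTransformer  # available inside Driverless AI
import datatable as dt

class SumTransformer(CustomTransformer):
    @staticmethod
    def get_default_properties():
        # Operate on two or more numeric columns
        return dict(col_type="numeric", min_cols=2, max_cols="all", relative_importance=1)

    def fit_transform(self, X: dt.Frame, y=None):
        return self.transform(X)

    def transform(self, X: dt.Frame):
        # Row-wise sum across all selected numeric columns
        return X[:, dt.rowsum(*[dt.f[i] for i in range(X.ncols)])]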
In [24]:
dai.recipes.create('sum.py')
Complete 100%
We can create a list of custom transformer recipe objects.
In [25]:
custom_transformers = [t for t in dai.recipes.transformers.list() if t.is_custom]
display(custom_transformers)
[<class 'driverlessai.recipes.TransformerRecipe'> BoxCoxTransformer, <class 'driverlessai.recipes.TransformerRecipe'> SumTransformer]
For demonstration purposes, we'll grab the first dataset available on the server. Then, we'll use it to get an experiment preview. Note that BoxCox and Sum are now the only transformers in the 'Feature engineering search space'.
In [26]:
ds = dai.datasets.list()[0]
dai.experiments.preview(
train_dataset=ds,
target_column=ds.columns[-1],
task='classification',
transformers=custom_transformers
)
ACCURACY [7/10]:
- Training data size: *150 rows, 5 cols*
- Feature evolution: *[Constant, DecisionTree, LightGBM, XGBoostGBM]*, *3-fold CV**, 2 reps*
- Final pipeline: *Ensemble (6 models), 3-fold CV*

TIME [2/10]:
- Feature evolution: *8 individuals*, up to *42 iterations*
- Early stopping: After *5* iterations of no improvement

INTERPRETABILITY [8/10]:
- Feature pre-pruning strategy: Permutation Importance FS
- Monotonicity constraints: enabled
- Feature engineering search space: [BoxCox, Sum]

[Constant, DecisionTree, LightGBM, XGBoostGBM] models to train:
- Model and feature tuning: *192*
- Feature evolution: *288*
- Final pipeline: *6*

Estimated runtime: *minutes*
Auto-click Finish/Abort if not done in: *1 day*/*7 days*
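If the preview looks as expected, the same arguments can be passed to dai.experiments.create() to run an experiment restricted to these transformers. The sketch below is a minimal example; the experiment name is illustrative and not part of this walkthrough.

# Launch an experiment that only searches over the custom transformers (sketch)
ex = dai.experiments.create(
    train_dataset=ds,
    target_column=ds.columns[-1],
    task='classification',
    transformers=custom_transformers,
    name='custom-transformer-demo',  # illustrative name
)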