Driverless AI MOJO Scoring Pipeline - C++ Runtime with Python and R Wrappers¶
The C++ Scoring Pipeline is provided as R and Python packages for the protobuf-based MOJO2 protocol. The packages are self-contained, so no additional software is required. Build the MOJO Scoring Pipeline, then score with whichever of the two wrappers you prefer.
These scoring pipelines are currently not available for RuleFit models.
The Download MOJO Scoring Pipeline button appears as Build MOJO Scoring Pipeline if the MOJO Scoring Pipeline is disabled.
To have Driverless AI try to reduce the size of the MOJO Scoring Pipeline at build time, enable the Attempt to Reduce the Size of the MOJO expert setting.
Downloading the Scoring Pipeline Runtimes¶
The R and Python packages can be downloaded from within the Driverless AI application. Click Resources, then click MOJO2 R Runtime or MOJO2 Py Runtime in the drop-down menu. In the pop-up menu that appears, click the button that corresponds to your OS: Linux, Mac OS X, or IBM PowerPC.
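The OS you choose in the download dialog determines which package file you install later. As a minimal sketch, and not part of Driverless AI itself, the helper below picks the Python wheel for the current machine. The three file names are the 2.5.10 wheels listed in the Python prerequisites on this page; the function name `pick_wheel` and the platform strings it matches are assumptions made for illustration.

```python
import platform

# Wheel names as listed in the Python prerequisites on this page (version 2.5.10).
# The (system, machine) keys are assumed values of platform.system() and
# platform.machine() on each target; check them on your own hardware.
WHEELS = {
    ("Linux", "ppc64le"): "daimojo-2.5.10-cp36-cp36m-linux_ppc64le.whl",
    ("Linux", "x86_64"): "daimojo-2.5.10-cp36-cp36m-linux_x86_64.whl",
    ("Darwin", "x86_64"): "daimojo-2.5.10-cp36-cp36m-macosx_10_7_x86_64.whl",
}


def pick_wheel(system=None, machine=None):
    """Return the wheel file name for an OS/architecture pair (hypothetical helper)."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        return WHEELS[(system, machine)]
    except KeyError:
        raise RuntimeError(f"No MOJO2 runtime listed for {system}/{machine}") from None


# Explicit arguments keep the example independent of the machine it runs on.
print(pick_wheel("Linux", "x86_64"))
```

Calling `pick_wheel()` with no arguments uses the values reported for the local machine and raises `RuntimeError` on any platform that is not in the table.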
The following examples show how to use the R and Python APIs of the C++ MOJO runtime.
R Example¶
Prerequisites:
Linux OS (x86 or PPC) or Mac OS X (10.9 or newer)
Driverless AI License (either file or environment variable)
Running the MOJO2 R Runtime¶
# Install the R MOJO runtime using one of the methods below
# Install the R MOJO runtime on PPC Linux
# Install the R MOJO runtime on x86 Linux
# Install the R MOJO runtime on Mac OS X
# Load the MOJO
library(daimojo)
m <- load.mojo("./mojo-pipeline/pipeline.mojo")
# retrieve the creation time of the MOJO
create.time(m)
## [1] "2019-11-18 22:00:24 UTC"
# retrieve the UUID of the experiment
uuid(m)
## [1] "65875c15-943a-4bc0-a162-b8984fe8e50d"
# Load data and make predictions
col_class <- setNames(feature.types(m), feature.names(m)) # column names and types
library(data.table) # provides fread
d <- fread("./mojo-pipeline/example.csv", colClasses=col_class, header=TRUE, sep=",")
predict(m, d)
## label.B label.M
## 1 0.08287659 0.91712341
## 2 0.77655075 0.22344925
## 3 0.58438434 0.41561566
## 4 0.10570505 0.89429495
## 5 0.01685609 0.98314391
## 6 0.23656610 0.76343390
## 7 0.17410333 0.82589667
## 8 0.10157948 0.89842052
## 9 0.13546191 0.86453809
## 10 0.94778244 0.05221756
Python Example¶
Prerequisites:
Linux OS (x86 or PPC) or Mac OS X (10.9 or newer)
Driverless AI License (either file or environment variable)
Python 3.6
datatable. Run the following to install:
# Install on Linux PPC, Linux x86, or Mac OS X
pip install datatable
Non-binary version of protobuf:
pip install --no-binary=protobuf protobuf
Python MOJO runtime. Run one of the following commands after downloading from the GUI:
# Install the MOJO runtime on Linux PPC
pip install daimojo-2.5.10-cp36-cp36m-linux_ppc64le.whl
# Install the MOJO runtime on Linux x86
pip install daimojo-2.5.10-cp36-cp36m-linux_x86_64.whl
# Install the MOJO runtime on Mac OS X
pip install daimojo-2.5.10-cp36-cp36m-macosx_10_7_x86_64.whl
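Because the license can be supplied either as a file or through an environment variable, it can help to confirm that one of them is in place before loading a MOJO. The following is a minimal sketch of such a check. The variable names `DRIVERLESS_AI_LICENSE_FILE` and `DRIVERLESS_AI_LICENSE_KEY` are assumptions and are not stated on this page, so confirm them against the documentation for your release; `find_license` is a hypothetical helper, not part of the `daimojo` package.

```python
import os

# Assumed environment variable names; confirm them for your Driverless AI release.
LICENSE_FILE_VAR = "DRIVERLESS_AI_LICENSE_FILE"
LICENSE_KEY_VAR = "DRIVERLESS_AI_LICENSE_KEY"


def find_license(env=None):
    """Return ("file", path) or ("key", value), or None if no license is configured.

    Hypothetical helper: it only inspects the environment and the file system
    and does not validate the license itself.
    """
    env = os.environ if env is None else env
    path = env.get(LICENSE_FILE_VAR)
    if path and os.path.isfile(path):
        return ("file", path)
    key = env.get(LICENSE_KEY_VAR)
    if key:
        return ("key", key)
    return None


if find_license() is None:
    print("No Driverless AI license found in the environment.")
```

A file path that is set but does not point to an existing file is ignored, so the check falls through to the key variable instead of reporting a license that cannot be read.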
Running the MOJO2 Python Runtime¶
# import the daimojo model package
import daimojo.model
# specify the location of the MOJO
m = daimojo.model("./mojo-pipeline/pipeline.mojo")
# retrieve the creation time of the MOJO
m.created_time
# 'Mon November 18 14:00:24 2019'
# retrieve the UUID of the experiment
m.uuid
# retrieve a list of missing values
m.missing_values
# ['',
# '?',
# 'None',
# 'nan',
# 'NA',
# 'N/A',
# 'unknown',
# 'inf',
# '-inf',
# '1.7976931348623157e+308',
# '-1.7976931348623157e+308']
# retrieve the feature names
m.feature_names
# ['clump_thickness',
# 'uniformity_cell_size',
# 'uniformity_cell_shape',
# 'marginal_adhesion',
# 'single_epithelial_cell_size',
# 'bare_nuclei',
# 'bland_chromatin',
# 'normal_nucleoli',
# 'mitoses']
# retrieve the feature types
m.feature_types
# ['float32',
# 'float32',
# 'float32',
# 'float32',
# 'float32',
# 'float32',
# 'float32',
# 'float32',
# 'float32']
# retrieve the output names
m.output_names
# ['label.B', 'label.M']
# retrieve the output types
m.output_types
# ['float64', 'float64']
# import the datatable module
import datatable as dt
# parse the example.csv file
pydt = dt.fread("./mojo-pipeline/example.csv", na_strings=m.missing_values, header=True, separator=',')
# display the parsed frame
pydt
# clump_thickness uniformity_cell_size uniformity_cell_shape marginal_adhesion single_epithelial_cell_size bare_nuclei bland_chromatin normal_nucleoli mitoses
# 0 8 1 3 10 6 6 9 1 1
# 1 2 1 2 2 5 3 4 8 8
# 2 1 1 1 9 4 10 3 5 4
# 3 2 6 9 10 4 8 1 1 3
# 4 10 10 8 1 8 3 6 3 4
# 5 1 8 4 5 10 1 2 5 3
# 6 2 10 2 9 1 2 9 3 8
# 7 2 8 9 2 10 10 3 5 4
# 8 6 3 8 5 2 3 5 3 4
# 9 4 2 2 8 1 2 8 9 1
# [10 rows × 9 columns]
# retrieve the column types
pydt.stypes
# (stype.float64,
# stype.float64,
# stype.float64,
# stype.float64,
# stype.float64,
# stype.float64,
# stype.float64,
# stype.float64,
# stype.float64)
# make predictions on the example.csv file
res = m.predict(pydt)
# retrieve the predictions
res
# label.B label.M
# 0 0.0828766 0.917123
# 1 0.776551 0.223449
# 2 0.584384 0.415616
# 3 0.105705 0.894295
# 4 0.0168561 0.983144
# 5 0.236566 0.763434
# 6 0.174103 0.825897
# 7 0.101579 0.898421
# 8 0.135462 0.864538
# 9 0.947782 0.0522176
# [10 rows × 2 columns]
# retrieve the prediction column names
res.names
# ('label.B', 'label.M')
# retrieve the prediction column types
res.stypes
# (stype.float64, stype.float64)
# convert datatable results to common data types
# res.to_pandas() # requires pandas
# res.to_numpy() # requires numpy
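The predictions are per-class probabilities, so a common follow-up step is to reduce each row to its most likely label. The sketch below illustrates that step with plain Python lists, using the output names and the first three prediction rows copied from the results above. `top_label` is a hypothetical helper and not part of `daimojo` or `datatable`. With a real result frame, `res.to_tuples()` should yield rows of this shape, though that is not verified against the runtime here.

```python
# Output column names and the first three prediction rows, copied from the results above.
output_names = ["label.B", "label.M"]
rows = [
    [0.0828766, 0.917123],
    [0.776551, 0.223449],
    [0.584384, 0.415616],
]


def top_label(names, probabilities):
    """Return the output name with the highest predicted probability (hypothetical helper)."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return names[best]


labels = [top_label(output_names, row) for row in rows]
print(labels)  # ['label.M', 'label.B', 'label.B']
```

Taking the maximum corresponds to a 0.5 threshold in this two-class case. If the two kinds of error carry different costs, a different threshold on `label.M` may be more appropriate.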