Driverless AI MOJO Scoring Pipeline - C++ Runtime with Python and R Wrappers

The C++ Scoring Pipeline is provided as R and Python packages for the protobuf-based MOJO2 protocol. The packages are self-contained, so no additional software is required. Simply build the MOJO Scoring Pipeline and begin scoring with your preferred runtime.

Notes:

  • These scoring pipelines are currently not available for RuleFit models.
  • The Download MOJO Scoring Pipeline button appears as Build MOJO Scoring Pipeline if the MOJO Scoring Pipeline is disabled.

Downloading the Scoring Pipeline Runtimes

Linux OS

The R and Python packages can be downloaded from within the Driverless AI application. To do this, click Resources, then select MOJO2 R Runtime or MOJO2 Py Runtime from the drop-down menu.

Mac OS X

Download the R and Python packages from the following links:

Examples

The following examples show how to use the R and Python APIs of the C++ MOJO runtime.

R Example

Prerequisites

  • Linux OS (x86 or PPC) or Mac OS X (10.9 or newer)
  • Driverless AI License (either file or environment variable)
  • Rcpp (>=1.0.0)
  • data.table

Running the MOJO2 R Runtime

# Install the R MOJO runtime using one of the methods below

# Install the R MOJO runtime on PPC Linux
install.packages("./daimojo_2.1.12_ppc64le-linux.tar.gz")

# Install the R MOJO runtime on x86 Linux
install.packages("./daimojo_2.1.12_x86_64-linux.tar.gz")

# Install the R MOJO runtime on Mac OS X
install.packages("./daimojo_2.1.12_x86_64-darwin.tar.gz")


# Load the MOJO
library(daimojo)
m <- load.mojo("./mojo-pipeline/pipeline.mojo")

# retrieve the creation time of the MOJO
create.time(m)
## [1] "2019-11-18 22:00:24 UTC"

# retrieve the UUID of the experiment
uuid(m)
## [1] "65875c15-943a-4bc0-a162-b8984fe8e50d"

# Load data and make predictions
col_class <- setNames(feature.types(m), feature.names(m))  # column names and types

library(data.table)
d <- fread("./mojo-pipeline/example.csv", colClasses=col_class)

predict(m, d)
##       label.B    label.M
## 1  0.08287659 0.91712341
## 2  0.77655075 0.22344925
## 3  0.58438434 0.41561566
## 4  0.10570505 0.89429495
## 5  0.01685609 0.98314391
## 6  0.23656610 0.76343390
## 7  0.17410333 0.82589667
## 8  0.10157948 0.89842052
## 9  0.13546191 0.86453809
## 10 0.94778244 0.05221756

Python Example

Prerequisites

  • Linux OS (x86 or PPC) or Mac OS X (10.9 or newer)

  • Driverless AI License (either file or environment variable; see the note after this list)

  • Python 3.6

  • datatable. Run the following to install:

    # Install on Linux PPC, Linux x86, or Mac OS X
    pip install datatable
    
  • Python MOJO runtime. Run one of the following commands after downloading from the GUI:

    # Install the MOJO runtime on Linux PPC
    pip install daimojo-2.1.12+master.106-cp36-cp36m-linux_ppc64le.whl
    
    # Install the MOJO runtime on Linux x86
    pip install daimojo-2.1.12+master.106-cp36-cp36m-linux_x86_64.whl
    
    # Install the MOJO runtime on Mac OS X
    pip install daimojo-2.1.12+master.106-cp36-cp36m-macosx_10_7_x86_64.whl
    
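As noted in the prerequisites, the Driverless AI license can be supplied either as a file or through an environment variable. The following is a minimal sketch that assumes the standard Driverless AI license environment variables (DRIVERLESS_AI_LICENSE_FILE for the path to a license file, DRIVERLESS_AI_LICENSE_KEY for the key itself); the values shown are placeholders. Set the variable before importing daimojo.

# Point the runtime at a license file (placeholder path)
import os
os.environ["DRIVERLESS_AI_LICENSE_FILE"] = "/path/to/license.sig"

# Alternatively, supply the license key string directly (placeholder value)
# os.environ["DRIVERLESS_AI_LICENSE_KEY"] = "<your license key>"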

Running the MOJO2 Python Runtime

# import the daimojo model package
import daimojo.model

# specify the location of the MOJO
m = daimojo.model("./mojo-pipeline/pipeline.mojo")

# retrieve the creation time of the MOJO
m.created_time
# 'Mon Nov 18 14:00:24 2019'

# retrieve the UUID of the experiment
m.uuid
# '65875c15-943a-4bc0-a162-b8984fe8e50d'

# retrieve a list of missing values
m.missing_values
# ['',
#  '?',
#  'None',
#  'nan',
#  'NA',
#  'N/A',
#  'unknown',
#  'inf',
#  '-inf',
#  '1.7976931348623157e+308',
#  '-1.7976931348623157e+308']

# retrieve the feature names
m.feature_names
# ['clump_thickness',
#  'uniformity_cell_size',
#  'uniformity_cell_shape',
#  'marginal_adhesion',
#  'single_epithelial_cell_size',
#  'bare_nuclei',
#  'bland_chromatin',
#  'normal_nucleoli',
#  'mitoses']

# retrieve the feature types
m.feature_types
# ['float32',
#  'float32',
#  'float32',
#  'float32',
#  'float32',
#  'float32',
#  'float32',
#  'float32',
#  'float32']

# retrieve the output names
m.output_names
# ['label.B', 'label.M']

# retrieve the output types
m.output_types
# ['float64', 'float64']

# import the datatable module
import datatable as dt

# parse the example.csv file
pydt = dt.fread("./mojo-pipeline/example.csv", na_strings=m.missing_values)
pydt
#     clump_thickness  uniformity_cell_size  uniformity_cell_shape  marginal_adhesion  single_epithelial_cell_size  bare_nuclei  bland_chromatin  normal_nucleoli  mitoses
# 0                 8                     1                      3                 10                            6            6                9                1        1
# 1                 2                     1                      2                  2                            5            3                4                8        8
# 2                 1                     1                      1                  9                            4           10                3                5        4
# 3                 2                     6                      9                 10                            4            8                1                1        3
# 4                10                    10                      8                  1                            8            3                6                3        4
# 5                 1                     8                      4                  5                           10            1                2                5        3
# 6                 2                    10                      2                  9                            1            2                9                3        8
# 7                 2                     8                      9                  2                           10           10                3                5        4
# 8                 6                     3                      8                  5                            2            3                5                3        4
# 9                 4                     2                      2                  8                            1            2                8                9        1

# [10 rows × 9 columns]

# retrieve the column types
pydt.stypes
# (stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64,
#  stype.float64)

# make predictions on the example.csv file
res = m.predict(pydt)

# retrieve the predictions
res
#           label.B     label.M
# 0     0.0828766       0.917123
# 1     0.776551        0.223449
# 2     0.584384        0.415616
# 3     0.105705        0.894295
# 4     0.0168561       0.983144
# 5     0.236566        0.763434
# 6     0.174103        0.825897
# 7     0.101579        0.898421
# 8     0.135462        0.864538
# 9     0.947782        0.0522176

# [10 rows × 2 columns]

# retrieve the prediction column names
res.names
#     ('label.B', 'label.M')

# retrieve the prediction column types
res.stypes
# (stype.float64, stype.float64)

# convert datatable results to common data types
# res.to_pandas()  # requires pandas
# res.to_numpy()   # requires numpy
res.to_list()
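
The datatable Frame returned by predict() can also be written straight to disk. Below is a minimal sketch that binds the input columns to the prediction columns and saves the result as a CSV file; the output file name is illustrative.

# combine the input features with the predictions
scored = dt.cbind(pydt, res)

# write the scored frame to disk (illustrative file name)
scored.to_csv("./example_scored.csv")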