Version: 1.2.0

Feature Store & Sparkling Water integration

Python Sparkling Water

  1. In a Python environment, install the Feature Store client with pip.
  2. Download Spark and PySparkling by following the instructions in the Sparkling Water documentation.
  3. Start the PySparkling session with the Spark dependency jar:
./bin/pysparkling --jars <spark dependency jar file>
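
Spark's `--jars` option expects a single comma-separated value, which matters as soon as more than one jar is passed. The following is a minimal sketch of assembling the launch command from Python; the helper name `build_pysparkling_command` and the jar path are hypothetical and used only for illustration.

```python
import os
import shlex


def build_pysparkling_command(jars, launcher="./bin/pysparkling"):
    """Return the argument list for starting pysparkling with extra jars.

    Spark's --jars option takes one comma-separated value, so the jar
    paths are joined with commas and no surrounding spaces.
    """
    jar_paths = [os.fspath(jar) for jar in jars]
    if not jar_paths:
        raise ValueError("at least one jar path is required")
    for path in jar_paths:
        if "," in path:
            raise ValueError(f"jar path must not contain a comma: {path}")
    return [launcher, "--jars", ",".join(jar_paths)]


# Hypothetical jar location, for illustration only.
command = build_pysparkling_command(["/opt/jars/spark-dependencies.jar"])
print(shlex.join(command))
```

The returned list can be passed to `subprocess.run` as is, which avoids shell quoting problems when a jar path contains spaces.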

Example:

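from featurestore import Client

# `fs` is assumed to be a feature set already obtained through the Client;
# `spark` is the SparkSession provided by the pysparkling shell.
ref = fs.retrieve()
data_frame = ref.as_spark_frame(spark)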

# Sparkling Water: start the H2O context, then train a GLM on the retrieved frame
from pysparkling import *
from pysparkling.ml import H2OGLM

hc = H2OContext.getOrCreate()
estimator = H2OGLM(labelCol="RainTomorrow")
model = estimator.fit(data_frame)
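
Before fitting, it may be worth confirming that the label column is actually present in the retrieved frame, since a missing column would otherwise only surface once training starts. The following is a minimal sketch under that assumption; `require_columns` is a hypothetical helper, and the column list is a stand-in rather than the schema of any real feature set.

```python
def require_columns(available, required):
    """Raise ValueError if any required column name is missing."""
    missing = [name for name in required if name not in available]
    if missing:
        raise ValueError(
            f"missing columns: {missing}; available: {sorted(available)}"
        )
    return True


# With a Spark DataFrame this would be called as
#   require_columns(data_frame.columns, ["RainTomorrow"])
# Stand-in column names for illustration only.
columns = ["MinTemp", "MaxTemp", "RainTomorrow"]
require_columns(columns, ["RainTomorrow"])
```

The helper works on plain lists of names, so it can be checked without a running Spark session.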

Scala Sparkling Water

  1. Download the Spark dependency jar and Scala client jar.
  2. Start the Sparkling shell with both jars, passed as a comma-separated list without spaces:
./bin/sparkling-shell --jars <spark dependency jar file>,<scala client jar file>

Example:

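import ai.h2o.featurestore.Client

// `fs` is assumed to be a feature set already obtained through the Client;
// `spark` is the SparkSession provided by the Sparkling shell.
val ref = fs.retrieve()
val dataFrame = ref.asSparkFrame(spark)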

// Sparkling Water: start the H2O context, then train a GLM on the retrieved frame
import ai.h2o.sparkling._
import ai.h2o.sparkling.ml.algos.H2OGLM

val hc = H2OContext.getOrCreate()
val estimator = new H2OGLM().setLabelCol("RainTomorrow")
val model = estimator.fit(dataFrame)
