
Spark dependencies

If you want to interact with the Feature Store from a Spark session, several dependencies need to be added to the Spark classpath. Supported Spark versions are 3.2.x.

Using S3 as the Feature Store storage:

  • io.delta:delta-core_2.12:2.4.0
  • org.apache.hadoop:hadoop-aws:${HADOOP_VERSION}
note

HADOOP_VERSION is the Hadoop version your Spark is built for.

The version of the delta-core library needs to match your Spark version; version 2.4.0 can be used with Spark 3.4.
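
These coordinates can be passed to Spark when the session is created, for example through the spark.jars.packages option. The sketch below is illustrative and assumes PySpark with a Spark build based on Hadoop 3.3.4; substitute the Hadoop version your Spark is built for.

```python
# Minimal sketch: start a PySpark session with the S3 dependencies resolved
# from Maven at launch time. The Hadoop version (3.3.4) is an assumption;
# replace it with the version your Spark distribution is built for.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("feature-store-s3")
    # delta-core must match your Spark version; 2.4.0 pairs with Spark 3.4
    .config(
        "spark.jars.packages",
        "io.delta:delta-core_2.12:2.4.0,org.apache.hadoop:hadoop-aws:3.3.4",
    )
    .getOrCreate()
)
```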

Using Azure Gen2 as the Feature Store storage:

  • io.delta:delta-core_2.12:2.4.0
  • featurestore-azure-gen2-spark-dependencies.jar
  • org.apache.hadoop:hadoop-azure:${HADOOP_VERSION}
note

HADOOP_VERSION is the Hadoop version your Spark is built for.

The version of the delta-core library needs to match your Spark version; version 2.4.0 can be used with Spark 3.4.

The Spark dependencies jar (featurestore-azure-gen2-spark-dependencies.jar) can be downloaded from the Downloads page.
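
One way to wire this up is to add the downloaded jar through spark.jars and the Maven coordinates through spark.jars.packages. The sketch below assumes PySpark; the local path to the downloaded jar is a placeholder.

```python
# Minimal sketch: PySpark session with the Azure Gen2 dependencies.
# The jar path is hypothetical; point it at wherever you downloaded
# featurestore-azure-gen2-spark-dependencies.jar.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("feature-store-azure-gen2")
    .config("spark.jars", "/path/to/featurestore-azure-gen2-spark-dependencies.jar")
    # Match the Hadoop version your Spark is built for (3.3.4 is an assumption).
    .config(
        "spark.jars.packages",
        "io.delta:delta-core_2.12:2.4.0,org.apache.hadoop:hadoop-azure:3.3.4",
    )
    .getOrCreate()
)
```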

General configuration

Spark needs to be started with the following configuration to ensure that time travel queries behave correctly:

  • spark.sql.session.timeZone=UTC
  • spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension
  • spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog

If Apache Spark is not already running, start it first.
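
For example, a PySpark session with this configuration applied could be created as sketched below; combine these options with the dependency settings shown above.

```python
# Minimal sketch: apply the required Delta Lake and time-zone configuration
# when creating the session.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("feature-store")
    .config("spark.sql.session.timeZone", "UTC")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config(
        "spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    )
    .getOrCreate()
)
```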

