Version: 1.2.0

Feature view API

Creating a feature view

To create a feature view, you first build a query. A query selects features from feature sets, joins feature sets together, and applies filters. A feature view query can also apply transformations to the selected features. The following transformations are supported:

  • min_max_scaler
  • standard_scaler
  • robust_scaler
  • string_indexer
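The transformation names above correspond to common preprocessing techniques. As a rough illustration only, the following plain-Python sketch shows what each one conventionally computes. It assumes the textbook definitions (min-max scaling to [0, 1], z-scores with the population standard deviation, median/IQR scaling, and string indices ordered by frequency) and does not use or reproduce the Feature Store implementation. All function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev, quantiles


def min_max_scale(values):
    """Map values linearly so the minimum becomes 0 and the maximum 1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


def standard_scale(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


def robust_scale(values):
    """Subtract the median and divide by the interquartile range."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [0.0 if iqr == 0 else (v - q2) / iqr for v in values]


def string_index(values):
    """Assign integer indices to strings, most frequent first (ties: alphabetical)."""
    counts = Counter(values)
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    mapping = {s: i for i, s in enumerate(ordered)}
    return [mapping[v] for v in values]


scaled = min_max_scale([2.0, 4.0, 6.0])
standardized = standard_scale([1.0, 3.0])
robust = robust_scale([1.0, 2.0, 3.0, 4.0, 5.0])
indexed = string_index(["b", "a", "b", "c"])
```

The robust variant is less sensitive to outliers than the other two scalers because the median and quartiles barely move when a single extreme value is added.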
note

When joining feature sets, Feature Store performs point-in-time inner or left joins.
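To make the idea of a point-in-time join concrete, here is a minimal plain-Python sketch. It assumes the usual meaning of the term: each row on the left side is matched with the newest row on the right side that has the same key and a timestamp not later than the left row's timestamp. The function, the `ts` column, and the sample rows are hypothetical, and the exact matching rules used by Feature Store may differ.

```python
def point_in_time_join(left_rows, right_rows, key, time, how="inner"):
    """Attach to each left row the newest right row with the same key
    whose timestamp is not later than the left row's timestamp."""
    if how not in ("inner", "left"):
        raise ValueError("how must be 'inner' or 'left'")
    joined = []
    for left in left_rows:
        candidates = [
            r for r in right_rows
            if r[key] == left[key] and r[time] <= left[time]
        ]
        match = max(candidates, key=lambda r: r[time], default=None)
        if match is None and how == "inner":
            continue  # an inner join drops left rows without a match
        joined.append({"left": left, "right": match})
    return joined


labels = [
    {"UserId": 1, "ts": 10, "Label": 0},
    {"UserId": 2, "ts": 10, "Label": 1},  # no measurements for this user
]
measurements = [
    {"UserId": 1, "ts": 5, "X": 0.2},
    {"UserId": 1, "ts": 9, "X": 0.7},
    {"UserId": 1, "ts": 12, "X": 0.9},  # newer than the label row, so ignored
]

inner_result = point_in_time_join(labels, measurements, "UserId", "ts", how="inner")
left_result = point_in_time_join(labels, measurements, "UserId", "ts", how="left")
```

In this sketch the inner join returns only the row for user 1, paired with the measurement taken at `ts=9`, while the left join also keeps user 2 with an empty right side. Ignoring rows from the future is what prevents label leakage in training data.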

To create a query with an inner join, execute:

from featurestore.core.entities.query import Query

min_max = client.transformation_functions.get("min_max_scaler")

query = Query.select([feature_set1.features["UserId"], feature_set1.features["Label"], min_max.apply(feature_set2.features["X"])]) \
.from_feature_set(feature_set1, "alias1") \
.join(feature_set2, "alias2").on(feature_set1.features["UserId"], feature_set2.features["UserId"]) \
.end()

To create a query with a left join, execute:

from featurestore.core.entities.query import Query

min_max = client.transformation_functions.get("min_max_scaler")

query = Query.select([feature_set1.features["UserId"], feature_set1.features["Label"], min_max.apply(feature_set2.features["X"])]) \
.from_feature_set(feature_set1, "alias1") \
.left_join(feature_set2, "alias2").on(feature_set1.features["UserId"], feature_set2.features["UserId"]) \
.end()

To create a feature view, execute:

feature_view = project.feature_views.create(name="test", description="", query=query)

Listing feature views within a project

project.feature_views.list()

Obtaining a feature view

feature_view = project.feature_views.get("feature_view_name", version=None)

If the version is not specified, the latest version of the feature view is returned.

Deleting feature views

fv = project.feature_views.get("name")
fv.delete()

Updating feature view fields

To update a field, assign a new value to it through its setter:

fv = project.feature_views.get("name")
fv.description = "description"

Creating a new feature view version

The query for a feature view cannot be updated directly. To change the query, you need to create a new version of the feature view with the updated query.

To create a new version of the feature view, call the create_new_version method of the feature view object and pass the updated query as its argument. The new version then uses the updated query to retrieve its data.

fv = project.feature_views.get("name")
# Define the updated query for the new feature view version
query = Query.select([fs_1.features["abc"], fs_1.features["xyz"]]) \
.from_feature_set(fs_1, "alias1") \
.join(fs_2, "alias2").on(fs_1.features["pqr"], fs_2.features["mno"]) \
.end()
fv.create_new_version(query)

Obtaining data as a Spark Frame

You can read the data directly as a Spark Frame:

data_frame = my_feature_view.as_spark_frame(spark_session, start_at=None, end_at=None)

Read more about Spark dependencies.

Parameters Explanation:

If start_at and end_at are empty, all ingested data are fetched. Otherwise, these parameters restrict the result to a specific range of ingested data. For example, when the ingested data cover the time range from T1 to T2 (T1 <= T2), start_at can have any value T3 and end_at can have any value T4, where T1 <= T3 <= T4 <= T2.
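The effect of the two time-range parameters can be pictured with a small filter over timestamped records. This is only a sketch: it assumes that both bounds are inclusive and that timestamps are `datetime` objects, neither of which is stated on this page, and `select_range` is a hypothetical helper rather than a Feature Store function.

```python
from datetime import datetime


def select_range(records, start_at=None, end_at=None):
    """Keep records with start_at <= ts <= end_at; a bound of None is open."""
    if start_at is not None and end_at is not None and start_at > end_at:
        raise ValueError("start_at must not be later than end_at")
    return [
        r for r in records
        if (start_at is None or r["ts"] >= start_at)
        and (end_at is None or r["ts"] <= end_at)
    ]


records = [
    {"ts": datetime(2023, 1, 1), "X": 1},
    {"ts": datetime(2023, 2, 1), "X": 2},
    {"ts": datetime(2023, 3, 1), "X": 3},
]

# No bounds: every ingested record is returned.
everything = select_range(records)

# Bounds inside the ingested range: only the February record remains.
february = select_range(records, datetime(2023, 1, 15), datetime(2023, 2, 15))
```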

Downloading the files from Feature Store

You can download the data to your local machine by:

download_dir = my_feature_view.download(start_at=None, end_at=None)

Parameters Explanation:

If start_at and end_at are empty, all ingested data are fetched. Otherwise, these parameters restrict the result to a specific range of ingested data. For example, when the ingested data cover the time range from T1 to T2 (T1 <= T2), start_at can have any value T3 and end_at can have any value T4, where T1 <= T3 <= T4 <= T2.

Creating a machine learning dataset

Creating a machine learning (ML) dataset materializes a feature view into the Feature Store. To create an ML dataset, call the create method of the feature view's ml_datasets object. You need to provide a name for the ML dataset, and you can optionally specify the time period for which data is included in it.

ml_dataset = my_feature_view.ml_datasets.create("name", start_date_time=None, end_date_time=None)

Parameters Explanation:

If start_date_time and end_date_time are empty, all ingested data are fetched. Otherwise, these parameters restrict the result to a specific range of ingested data. For example, when the ingested data cover the time range from T1 to T2 (T1 <= T2), start_date_time can have any value T3 and end_date_time can have any value T4, where T1 <= T3 <= T4 <= T2.

Obtaining data as a Spark Frame from the ML dataset

ml_dataset = my_feature_view.ml_datasets.get("name")
data_frame = ml_dataset.as_spark_frame(spark_session)

Downloading the ML dataset files from Feature Store

You can download the data to your local machine by:

ml_dataset = my_feature_view.ml_datasets.get("name")
download_dir = ml_dataset.download()

Retrieving data from online feature store

Once the ML dataset is created and its job has finished, you can retrieve the latest feature values from the online store. To retrieve these values, you have to provide the primary key values of all feature sets used in the query. All transformations defined in the query are applied during retrieval by a pipeline that was built when the ML dataset was created.

ml_dataset = my_feature_view.ml_datasets.get("name")
ml_dataset.retrieve_online(1)
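One way to picture why the transformation pipeline is built at ML dataset creation time: a scaler's parameters are fixed once, from the data available at that moment, and the same parameters are then reused for every online lookup. The sketch below is a toy in-memory model of that idea. `MinMaxStep` and `ToyOnlineStore` are hypothetical stand-ins, not Feature Store classes, and the real pipeline and online store are likely to work differently in detail.

```python
class MinMaxStep:
    """Min-max scaling whose bounds are fixed when the pipeline is built."""

    def __init__(self, training_values):
        self.lo = min(training_values)
        self.hi = max(training_values)

    def apply(self, value):
        span = self.hi - self.lo
        return 0.0 if span == 0 else (value - self.lo) / span


class ToyOnlineStore:
    """Hypothetical stand-in that keeps only the latest row per primary key."""

    def __init__(self, pipeline):
        self.pipeline = pipeline  # maps feature name to a fitted step
        self.latest = {}

    def ingest(self, key, row):
        self.latest[key] = row  # a newer row replaces the older one

    def retrieve_online(self, key):
        row = dict(self.latest[key])  # copy so stored raw values stay untouched
        for feature, step in self.pipeline.items():
            row[feature] = step.apply(row[feature])
        return row


# "Dataset creation": fit the scaler on historical values of feature X.
pipeline = {"X": MinMaxStep([0.0, 5.0, 10.0])}

store = ToyOnlineStore(pipeline)
store.ingest(1, {"UserId": 1, "X": 2.0})
store.ingest(1, {"UserId": 1, "X": 7.5})  # latest value for primary key 1

result = store.retrieve_online(1)
```

Here the lookup for primary key 1 returns the most recent row with `X` scaled by the bounds learned earlier, so online values stay consistent with the values the model saw in the ML dataset.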

Feature view and ML dataset permissions

Feature views and ML datasets inherit the permission model of the project and feature sets they are created from. In other words, any permissions that apply to a project and its feature sets also apply to the feature views and ML datasets created within them. For more information, see Permissions.
