Delta Table Setup

Use the Driverless AI Delta Table connector to explore data stored in Delta Lake. This page shows how to configure the connector and run queries.

Note

When you use the Delta Table connector, Driverless AI itself acts as the SQL query execution engine. When querying large Delta Tables, allocate enough memory to avoid out-of-memory (OOM) errors.

Performance

  • For best performance, use the Delta Table connector only when other data connectors aren’t viable. If your Delta Tables are hosted on a platform with a SQL query engine (for example, Databricks), use an appropriate connector (such as the Databricks connector or JDBC connector) to offload query execution to that engine.

  • By default, the connector's JVM runs with a maximum heap size of 4 GB ("-Xmx4g"). For large tables, increase it with the delta_table_app_jvm_args setting. For example:

    delta_table_app_jvm_args = "-Xmx10g"
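When sizing the heap, it can help to convert the -Xmx value into bytes and compare it with the approximate size of the data a query is expected to return. The following is a minimal sketch, not part of Driverless AI: the helper name max_heap_bytes is hypothetical, it handles only the k, m, and g suffixes, and it assumes the last -Xmx flag in the string is the one that matters.

```python
import re

# Multipliers for the size suffixes handled by this sketch (k, m, g).
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def max_heap_bytes(jvm_args):
    """Return the -Xmx value in bytes, or None if no -Xmx flag is found.

    Hypothetical helper for illustration; not a Driverless AI API.
    """
    # Match flags such as -Xmx4g or -Xmx512m that end at whitespace or end of string.
    matches = re.findall(r"-Xmx(\d+)([kKmMgG]?)(?!\S)", jvm_args)
    if not matches:
        return None
    # Assumption: when the flag is repeated, use the last occurrence.
    number, unit = matches[-1]
    return int(number) * _UNITS[unit.lower()]


print(max_heap_bytes("-Xmx10g"))  # 10737418240
```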
    

Configuration

Enable the connector

Enable the Delta Table connector by adding delta_table to the list of enabled file systems:

enabled_file_systems = "upload, file, hdfs, s3, recipe_file, recipe_url, delta_table"

Note

Once enabled, the connector uses no authentication (No Auth) by default.
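If enabled_file_systems already lists other connectors, delta_table should be appended rather than replacing them. The sketch below shows one way to build the new value from an existing one. The function name enable_connector is hypothetical and the starting list is only an example; the script prints the line to place in the configuration and does not modify any file.

```python
def enable_connector(current, connector="delta_table"):
    """Return a comma-separated connector list that includes `connector`.

    Hypothetical helper for illustration; not a Driverless AI API.
    """
    # Split on commas, trim whitespace, and drop empty entries.
    entries = [e.strip() for e in current.split(",") if e.strip()]
    # Append only if missing, so running this twice gives the same result.
    if connector not in entries:
        entries.append(connector)
    return ", ".join(entries)


existing = "upload, file, hdfs, s3, recipe_file, recipe_url"
print(f'enabled_file_systems = "{enable_connector(existing)}"')
```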

Azure Workload Identity (optional)

If your Delta Tables are stored in Azure and you authenticate using Microsoft Entra Workload Identity, set the following values:

azure_workload_identity_tenant_id = "12345678-1234-1234-1234-123456789012"
azure_workload_identity_client_id = "87654321-4321-4321-4321-210987654321"
azure_workload_identity_token_file_path = "/var/run/secrets/azure/tokens/azure-identity-token"
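Before restarting with these values, a quick pre-flight check can catch common mistakes such as a mistyped ID or a token file that is not mounted. The following is a minimal sketch under stated assumptions: check_workload_identity is a hypothetical helper, it assumes both IDs are GUIDs (as in the example values above), and it only checks that the token file exists and is readable, not that the token is valid. Note that uuid.UUID is lenient and also accepts forms without hyphens.

```python
import os
import uuid


def check_workload_identity(tenant_id, client_id, token_file):
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper for illustration; not a Driverless AI API.
    """
    problems = []
    # Assumption: tenant and client IDs are GUIDs.
    for name, value in (("tenant_id", tenant_id), ("client_id", client_id)):
        try:
            uuid.UUID(value)
        except ValueError:
            problems.append(f"{name} is not a valid UUID: {value!r}")
    # The token file must exist and be readable by the current user.
    if not os.path.isfile(token_file):
        problems.append(f"token file not found: {token_file}")
    elif not os.access(token_file, os.R_OK):
        problems.append(f"token file is not readable: {token_file}")
    return problems


# Example values copied from the settings above.
for problem in check_workload_identity(
    "12345678-1234-1234-1234-123456789012",
    "87654321-4321-4321-4321-210987654321",
    "/var/run/secrets/azure/tokens/azure-identity-token",
):
    print(problem)
```

Run the check in the same environment (for example, the same container) as Driverless AI, since the token file path is resolved locally.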

Use the connector

  1. To add the Delta Table connector, complete the following steps:

    1. Go to the DATASETS page.

    2. Click + ADD DATASET (OR DRAG & DROP).

    3. Select DELTA TABLE.

    [Figure: Select Delta Table from the data source options]

  2. Enter the Delta Table path and the SQL query.

    [Figure: Configure the Delta Table path and SQL query]

    Note

    You can also specify the Delta Table path directly in the SQL query.

    [Figure: Specify the path to your Delta Table location in the SQL query]

  3. After you set the configuration parameters and SQL query, click CLICK TO MAKE QUERY. The query executes, and the results are displayed.

    [Figure: Query results displayed after successful execution]