Schema Metadata

Define tables, dimensions, metrics, and relationships for better SQL.

Basic Operations

Create table schema

schema = client.schema_metadata.create(
    project_id=project.id,
    name="Users Table",
    schema_data={
        "table": {
            "name": "users",
            "columns": [
                {"name": "id", "type": "INTEGER"},
                {"name": "email", "type": "VARCHAR(255)"}
            ]
        }
    },
)

Create dimension:

dimension = client.schema_metadata.create(
    project_id=project.id,
    name="User Status",
    schema_data={
        "table": {
            "dimension": {
                "name": "status",
                "content": {"type": "categorical", "values": ["active","inactive"]}
            }
        }
    },
)

Create metric:

metric = client.schema_metadata.create(
    project_id=project.id,
    name="Total Revenue",
    schema_data={
        "table": {
            "metric": {
                "name": "total_revenue",
                "content": {"aggregation": "sum", "column": "amount"}
            }
        }
    },
)

Create relationship:

relationship = client.schema_metadata.create(
    project_id=project.id,
    name="User Orders Relationship",
    schema_data={
        "relationship": {
            "from_table": "users",
            "to_table": "orders",
            "from_column": "id",
            "to_column": "user_id",
            "type": "one_to_many"
        }
    },
)

List/get/update/delete:

schemas = client.schema_metadata.list(project_id=project.id)
one = client.schema_metadata.get(project_id=project.id, schema_metadata_id=schema.id)
updated = client.schema_metadata.update(project_id=project.id, schema_metadata_id=schema.id, description="Updated")
client.schema_metadata.delete(project_id=project.id, schema_metadata_id=schema.id)

Filter by type

tables = client.schema_metadata.list_by_type(project.id, "table")

Validation helpers

errors = client.schema_metadata.validate_schema({"table": {"name": "users", "columns": []}}, "table")
stype = client.schema_metadata.get_schema_type({"table": {"name": "users", "columns": []}})

Working with Large Tables (Schema Splitting)

What is Schema Splitting?

When you create a table schema with more than 8 columns, the API automatically splits it into multiple parts. This improves RAG performance by creating smaller, more focused chunks for retrieval.

Understanding the Return Type

The create() method returns:

Single object (SchemaMetadataResponse) for schemas with ≤8 columns
List (List[SchemaMetadataResponse]) for schemas with >8 columns

Handling Split Results

Always check the return type:

result = client.schema_metadata.create(
    project_id=project.id,
    name="Large Customer Table",
    schema_data={
        "table": {
            "name": "customers",
            "columns": [
                {"name": "id", "type": "INTEGER"},
                {"name": "email", "type": "VARCHAR(255)"},
                {"name": "first_name", "type": "VARCHAR(100)"},
                {"name": "last_name", "type": "VARCHAR(100)"},
                {"name": "phone", "type": "VARCHAR(20)"},
                {"name": "address", "type": "VARCHAR(255)"},
                {"name": "city", "type": "VARCHAR(100)"},
                {"name": "state", "type": "VARCHAR(50)"},
                {"name": "zip", "type": "VARCHAR(10)"},
                {"name": "country", "type": "VARCHAR(50)"},
            ]
        }
    }
)

# Handle both cases
if isinstance(result, list):
    print(f"Schema split into {len(result)} parts")
    split_group_id = result[0].split_group_id
    for part in result:
        print(f"Part {part.split_index}/{part.total_splits}: {part.id}")
else:
    print(f"Single schema created: {result.id}")

Retrieving Split Groups

Get all parts of a split schema:

# Get complete split group information
split_group = client.schema_metadata.get_split_group(
    project_id=project.id,
    split_group_id=split_group_id
)

print(f"Total parts: {split_group['total_parts']}")
for part in split_group['parts']:
    print(f"Part {part.split_index}: {part.name}")

Deletion Behavior

Deleting any part of a split schema automatically deletes all parts in the group:

# Delete just one part - all parts are removed
client.schema_metadata.delete(
    project_id=project.id,
    schema_metadata_id=result[0].id  # Deletes entire group
)

Best Practices

Always use isinstance() to check if result is a list
Store split_group_id if you need to retrieve all parts later
Delete any single part - no need to delete each part individually
Consider column organization - group related columns in first 8 for better relevance

Basic Operations​

Create table schema​

Filter by type​

Validation helpers​

Working with Large Tables (Schema Splitting)​

What is Schema Splitting?​

Understanding the Return Type​

Handling Split Results​

Retrieving Split Groups​

Deletion Behavior​

Best Practices​

See Also​