Version: v1.7.0

Use a knowledge graph for collection retrieval

Graph RAG builds a knowledge graph from entities and relationships in collection documents.
You can use Graph RAG to retrieve context across documents when standard retrieval returns isolated chunks.

Overview

Graph RAG uses two retrieval signals:

Standard hybrid search (vector + lexical).
Graph-based traversal across connected entities.

Use Graph RAG when you need reasoning across documents:

Your answer depends on facts from multiple documents.
Your data has indirect relationships across projects, vendors, and customers.
Your collection is large and standard RAG misses connected context.

How Graph RAG works

Graph RAG runs in two phases.

Phase 1: Graph building

When you build a knowledge graph, Enterprise h2oGPTe:

Sends each document chunk to an LLM for entity extraction.
Merges entities and relationships into one graph.
Embeds entity descriptions for retrieval.
Stores the graph in object storage (MinIO/S3).
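
The merge step can be pictured with a toy example. The sketch below is not the Enterprise h2oGPTe implementation. The `Extraction` record, the lowercase name normalization, and the in-memory dictionaries are assumptions made for illustration, and the LLM extraction output is hard-coded.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Extraction:
    """Hypothetical per-chunk output of the entity-extraction LLM."""
    chunk_id: str
    entities: dict[str, str]  # entity name -> description
    relations: list[tuple[str, str, str]] = field(default_factory=list)  # (source, relation, target)


def merge_extractions(extractions):
    """Merge per-chunk entities and relationships into one graph."""
    nodes = defaultdict(lambda: {"descriptions": [], "chunks": set()})
    edges = defaultdict(set)  # (source, target) -> set of relation labels
    for ex in extractions:
        for name, description in ex.entities.items():
            key = name.strip().lower()  # assumed normalization: same name, same node
            nodes[key]["descriptions"].append(description)
            nodes[key]["chunks"].add(ex.chunk_id)
        for source, relation, target in ex.relations:
            edges[(source.strip().lower(), target.strip().lower())].add(relation)
    return dict(nodes), dict(edges)


extractions = [
    Extraction("doc1-c0",
               {"Project X": "Firmware project", "Vendor Z": "Chip supplier"},
               [("Vendor Z", "supplies", "Project X")]),
    Extraction("doc2-c3",
               {"project x": "Delayed project", "Customer Y": "Enterprise customer"},
               [("Customer Y", "depends_on", "Project X")]),
]
nodes, edges = merge_extractions(extractions)
print(sorted(nodes))                         # ['customer y', 'project x', 'vendor z']
print(sorted(nodes["project x"]["chunks"]))  # ['doc1-c0', 'doc2-c3']
```

Because this sketch keys nodes on the normalized name alone, two unrelated entities that share a name collapse into one node. A real merge step would need more than a name match to tell them apart.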

Phase 2: Graph-augmented retrieval

When you query with Graph RAG, the system:

Runs standard hybrid search to retrieve top chunks.
Runs a graph query to find related entities.
Expands retrieval with chunks linked to those entities.
Adds graph analysis to the LLM context.
Generates a response from chunk context and graph context.
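
The expansion in the steps above can be sketched with plain dictionaries. This is an illustration only: the function name, the one-hop traversal, the `max_extra` cap, and the lookup tables are assumptions, and the hybrid search results are hard-coded instead of coming from a real index.

```python
def graph_augmented_retrieve(hybrid_hits, chunk_entities, entity_links, entity_chunks, max_extra=3):
    """Expand hybrid-search hits with chunks reached through related entities."""
    # Entities mentioned in the chunks that hybrid search returned.
    seed_entities = {e for chunk in hybrid_hits for e in chunk_entities.get(chunk, ())}

    # One-hop graph traversal to find related entities (assumed depth).
    related = set()
    for entity in seed_entities:
        related.update(entity_links.get(entity, ()))
    related -= seed_entities

    # Chunks linked to the related entities, skipping ones already retrieved.
    extra = []
    for entity in sorted(related):
        for chunk in entity_chunks.get(entity, ()):
            if chunk not in hybrid_hits and chunk not in extra:
                extra.append(chunk)

    return list(hybrid_hits) + extra[:max_extra], sorted(related)


hybrid_hits = ["doc1-c0"]                                    # top chunk from hybrid search
chunk_entities = {"doc1-c0": ["project x"]}                  # chunk -> entities it mentions
entity_links = {"project x": ["vendor z", "customer y"]}     # entity -> connected entities
entity_chunks = {"vendor z": ["doc3-c1"], "customer y": ["doc2-c3"]}

chunks, related = graph_augmented_retrieve(hybrid_hits, chunk_entities, entity_links, entity_chunks)
print(chunks)   # ['doc1-c0', 'doc2-c3', 'doc3-c1']
print(related)  # ['customer y', 'vendor z']
```

In this toy data, hybrid search alone returns one chunk about Project X. The graph step adds the vendor and customer chunks from two other documents, which is the cross-document context the final response draws on.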

Build a knowledge graph

Prerequisites

You have a Collection with ingested documents.
Your deployment has at least one LLM for entity extraction.

Steps

Open the target Collection page.
Open the action menu (three-dot icon) next to Add Documents in the Documents section.
Click Build Knowledge Graph.
Select the LLM for entity extraction.
Start the build.

You can also build the graph from the Chat side panel. Open the action menu next to Add Documents in the Documents section of the side panel.

If the action is not available

If you do not see Build Knowledge Graph in the action menu, verify your role permissions. Ensure that at least one extraction LLM is configured and available. Ask your administrator to enable Knowledge Graph for your deployment version.

After the build starts, track progress in the notification tray:

Number of chunks processed.
Selected LLM.
Elapsed time.

Tip

Entity extraction is the most time-consuming step. Build times increase linearly with the number of chunks. A collection with about 80 chunks typically takes 1-2 minutes.
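
For a rough planning estimate, you can extrapolate from the reference figure above. The helper below is a back-of-the-envelope sketch that assumes strictly linear scaling. Actual build times depend on the selected LLM, concurrency settings, and chunk size, and the function name is hypothetical.

```python
def estimate_build_minutes(num_chunks, ref_chunks=80, ref_minutes=(1.0, 2.0)):
    """Linearly extrapolate a (low, high) build-time range in minutes."""
    scale = num_chunks / ref_chunks
    return ref_minutes[0] * scale, ref_minutes[1] * scale


low, high = estimate_build_minutes(2000)
print(f"about {low:.0f}-{high:.0f} minutes")  # about 25-50 minutes
```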

Graph status

After the build, the Collection page and the Chat side panel show the graph status next to the Documents heading:

Graph Ready (green): The graph is built and current.
Graph Outdated (yellow): New documents were added after the last build.
Graph Building (blue): A build is in progress.
Graph Failed (red): The build failed.

Rebuild vs. Update

Update Knowledge Graph: Processes only new documents since the last build.
Rebuild Knowledge Graph: Rebuilds the graph from scratch.

Use Update Knowledge Graph after you add documents.
Use Rebuild Knowledge Graph after you delete or heavily modify documents.
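
If you automate builds with the Python client, the same rule can be expressed as a small helper. This is a sketch: the function names and the change counters are hypothetical, and it treats any modification as a reason to rebuild, which is stricter than the guidance above. Only `collection_id` and `force_rebuild` come from the `build_collection_graph` call documented in the Python client section.

```python
def needs_full_rebuild(added=0, modified=0, deleted=0):
    """Return True when the graph should be rebuilt from scratch."""
    # Updates only process new documents, so edits and deletions need a rebuild.
    return modified > 0 or deleted > 0


def build_kwargs(collection_id, added=0, modified=0, deleted=0):
    """Keyword arguments to pass to client.build_collection_graph."""
    return {
        "collection_id": collection_id,
        "force_rebuild": needs_full_rebuild(added, modified, deleted),
    }


print(build_kwargs("<collection-id>", added=3))
print(build_kwargs("<collection-id>", added=1, deleted=2))
```

The first call maps to Update Knowledge Graph (`force_rebuild=False`), and the second maps to Rebuild Knowledge Graph (`force_rebuild=True`).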

Query with Graph RAG

Prerequisites

The selected Collection has a completed knowledge graph.
You can open a chat session with that collection.

Steps

Open a Chat session with a Collection that has a built knowledge graph.
In the Chat settings, set Generation Approach to Graph RAG.
Ask your question.

Graph RAG only appears in the Generation Approach dropdown after a knowledge graph has been built for the collection. If you do not see it, build the graph first from the action menu on the Collection page or the Chat side panel.

If a query returns No knowledge graph found for this collection, the graph may have been deleted or the status is not ready. Return to the collection and rebuild the graph.

Use Graph RAG for questions that require reasoning across documents:

"What is the connection between Project X and customer Y?"
"What incidents were caused by vendor Z?"
"How much revenue is at risk due to the firmware delay?"

For factual questions that a single document can answer, standard RAG usually performs similarly.

Admin configuration

Deployment settings

Administrators can configure Graph RAG with environment variables:

H2OGPTE_CORE_GRAPH_RAG_LLM (default: auto): Selects the extraction LLM. auto picks the lowest-cost non-reasoning model.
H2OGPTE_CORE_GRAPH_RAG_COST_CONTROLS (default: {"max_cost_per_million_tokens": 5, "model": null}): Sets automatic model limits and allowlists.
H2OGPTE_CORE_GRAPH_RAG_MAX_PARALLEL_INSERT (default: 40): Controls chunk parallelism during graph build.
H2OGPTE_CORE_GRAPH_RAG_LLM_MAX_ASYNC (default: 80): Sets the maximum number of concurrent LLM calls.
H2OGPTE_CORE_GRAPH_RAG_ENTITY_EXTRACT_MAX_GLEANING (default: 0): Adds extra extraction passes per chunk. 0 disables extra passes.
H2OGPTE_CORE_GRAPH_RAG_FORCE_LLM_SUMMARY_ON_MERGE (default: 999999): Controls when merge-time LLM summarization runs. Use a very high value to disable it.
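
To sanity-check values before you deploy them, you can parse the variables with the documented defaults. The sketch below is a standalone validation helper, not how the server reads its configuration. The function name and the shortened setting keys are assumptions.

```python
import json
import os

# Defaults copied from the list above, as the strings an environment would hold.
DEFAULTS = {
    "H2OGPTE_CORE_GRAPH_RAG_LLM": "auto",
    "H2OGPTE_CORE_GRAPH_RAG_COST_CONTROLS": '{"max_cost_per_million_tokens": 5, "model": null}',
    "H2OGPTE_CORE_GRAPH_RAG_MAX_PARALLEL_INSERT": "40",
    "H2OGPTE_CORE_GRAPH_RAG_LLM_MAX_ASYNC": "80",
    "H2OGPTE_CORE_GRAPH_RAG_ENTITY_EXTRACT_MAX_GLEANING": "0",
    "H2OGPTE_CORE_GRAPH_RAG_FORCE_LLM_SUMMARY_ON_MERGE": "999999",
}
PREFIX = "H2OGPTE_CORE_GRAPH_RAG_"


def load_graph_rag_settings(env=None):
    """Parse Graph RAG variables from env, falling back to the documented defaults."""
    env = os.environ if env is None else env
    settings = {}
    for name, default in DEFAULTS.items():
        value = env.get(name, default)
        key = name.removeprefix(PREFIX).lower()
        if key == "llm":
            settings[key] = value               # model name or "auto"
        elif key == "cost_controls":
            settings[key] = json.loads(value)   # raises ValueError on malformed JSON
        else:
            settings[key] = int(value)          # raises ValueError on non-integers
    return settings


settings = load_graph_rag_settings({"H2OGPTE_CORE_GRAPH_RAG_ENTITY_EXTRACT_MAX_GLEANING": "1"})
print(settings["entity_extract_max_gleaning"], settings["llm_max_async"])  # 1 80
```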

Restricting available models

To limit model choices for graph building, set model in H2OGPTE_CORE_GRAPH_RAG_COST_CONTROLS:

H2OGPTE_CORE_GRAPH_RAG_COST_CONTROLS: '{"max_cost_per_million_tokens": 5, "model": ["meta-llama/Llama-3.1-8B-Instruct", "gpt-4.1-nano"]}'

When you set model to a list, users only see those models in the graph build dropdown.
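
Because the value is a JSON string, hand-editing it invites quoting mistakes. One option is to generate it with `json.dumps`, as in the sketch below. The helper name is hypothetical. The keys and model names are the ones shown above.

```python
import json


def cost_controls_value(models=None, max_cost_per_million_tokens=5):
    """Serialize cost controls to a JSON string for the environment variable."""
    return json.dumps({
        "max_cost_per_million_tokens": max_cost_per_million_tokens,
        "model": models,  # None serializes to null, which means no allowlist
    })


value = cost_controls_value(["meta-llama/Llama-3.1-8B-Instruct", "gpt-4.1-nano"])
print(f"H2OGPTE_CORE_GRAPH_RAG_COST_CONTROLS: '{value}'")
```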

Use Graph RAG with the Python client

Build a knowledge graph

Prerequisites

You have a valid address and api_key.
You know the target collection_id.

Steps

from h2ogpte import H2OGPTE

client = H2OGPTE(address="https://...", api_key="...")

# Build the graph
job = client.build_collection_graph(
    collection_id="<collection-id>",
    llm="meta-llama/Llama-3.1-8B-Instruct",  # optional, defaults to auto
    force_rebuild=False,  # True to rebuild from scratch
    timeout=600,
)
print(f"Graph built in {job.duration}")

Parameter guidance:

Set llm to choose a specific extraction model.
Set force_rebuild=True to rebuild from scratch.
Increase timeout for large collections.

Check graph status

Use this call to confirm readiness before querying:

status = client.get_collection_graph_status(collection_id="<collection-id>")
print(f"Status: {status.status}")       # 'none', 'building', 'ready', 'failed'
print(f"Built at: {status.built_at}")
print(f"Outdated: {status.outdated}")   # True if new docs added since build
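
If a build was started elsewhere, for example from the Collection page, you may want to wait for it before querying. The helper below is a sketch that polls the status call shown above. The function name, the polling interval, and the error handling are assumptions and are not part of the h2ogpte client.

```python
import time


def wait_for_graph(client, collection_id, timeout=600.0, poll_interval=5.0, sleep=time.sleep):
    """Poll get_collection_graph_status until the graph is ready or has failed."""
    deadline = time.monotonic() + timeout
    while True:
        status = client.get_collection_graph_status(collection_id=collection_id)
        if status.status == "ready":
            return status
        if status.status == "failed":
            raise RuntimeError(f"Graph build failed for collection {collection_id}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Graph not ready after {timeout} seconds")
        sleep(poll_interval)  # still 'none' or 'building': wait and check again
```

Call it as `status = wait_for_graph(client, "<collection-id>")`. A returned status can still have `outdated` set to `True`, so check that field if you need the graph to cover the newest documents.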

Query with Graph RAG

Use rag_config={"rag_type": "graph_rag"} to force graph-based retrieval:

chat_session_id = client.create_chat_session(collection_id="<collection-id>")

with client.connect(chat_session_id) as session:
    reply = session.query(
        "What is the connection between Project Atlas and customer retention?",
        rag_config={"rag_type": "graph_rag"},
    )
    print(reply.content)
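
To avoid the No knowledge graph found error in scripts, you can request Graph RAG only when the status is ready. The helper below is a sketch. Its name is hypothetical, and omitting `rag_config` so that the session uses its default generation approach is an assumed fallback.

```python
from types import SimpleNamespace


def graph_rag_query_kwargs(status):
    """Return extra session.query kwargs: request Graph RAG only when the graph is ready."""
    if status.status == "ready":
        return {"rag_config": {"rag_type": "graph_rag"}}
    return {}  # assumed fallback: no rag_config, so the default approach applies


# Stand-ins for the object returned by get_collection_graph_status.
print(graph_rag_query_kwargs(SimpleNamespace(status="ready")))
print(graph_rag_query_kwargs(SimpleNamespace(status="building")))
```

With a live client, pass the result through: `session.query(question, **graph_rag_query_kwargs(status))`.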

Known limitations

Build time scales with collection size: Build time grows with the number of chunks, so large collections take longer to build.
Entity merge ambiguity: Entities from unrelated topics that share a generic name, such as "GPU" or "API", can be merged incorrectly.
Model quality: Smaller extraction models can miss entities or relationships.
Incremental update scope: Updates process new documents only; modifications and deletions require a rebuild.

Feedback

Send feedback about Enterprise h2oGPTe to cloud-feedback@h2o.ai
