Use a knowledge graph for collection retrieval
Graph RAG builds a knowledge graph from entities and relationships in collection documents.
You can use Graph RAG to retrieve context across documents when standard retrieval returns isolated chunks.
Overview
Graph RAG uses two retrieval signals:
- Standard hybrid search (vector + lexical).
- Graph-based traversal across connected entities.
Use Graph RAG when you need reasoning across documents:
- Your answer depends on facts from multiple documents.
- Your data has indirect relationships across projects, vendors, and customers.
- Your collection is large and standard RAG misses connected context.
How Graph RAG works
Graph RAG runs in two phases.
Phase 1: Graph building
When you build a knowledge graph, Enterprise h2oGPTe:
- Sends each document chunk to an LLM for entity extraction.
- Merges entities and relationships into one graph.
- Embeds entity descriptions for retrieval.
- Stores the graph in object storage (MinIO/S3).
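The merge step above can be pictured with a toy sketch. It assumes the extraction LLM has already returned entities and relationships for each chunk. The `extractions` data, the `merge_extractions` helper, and the rule of merging on a normalized name are illustrative assumptions, not the product's internal implementation.

```python
from collections import defaultdict


def merge_extractions(extractions):
    """Merge per-chunk extraction results into one graph (toy version).

    `extractions` maps chunk_id -> {"entities": {name: description},
    "relations": [(source, target, label), ...]}.
    """
    entities = defaultdict(lambda: {"descriptions": [], "chunks": set()})
    edges = defaultdict(set)  # (source, target) -> set of relation labels
    for chunk_id, result in extractions.items():
        for name, description in result["entities"].items():
            key = name.strip().lower()  # assumed merge key: normalized name
            entities[key]["descriptions"].append(description)
            entities[key]["chunks"].add(chunk_id)
        for source, target, label in result["relations"]:
            edges[(source.strip().lower(), target.strip().lower())].add(label)
    return dict(entities), dict(edges)


# Hypothetical extraction output for two chunks.
extractions = {
    "chunk-1": {
        "entities": {"Project Atlas": "Firmware project", "Vendor Z": "Chip supplier"},
        "relations": [("Vendor Z", "Project Atlas", "supplies")],
    },
    "chunk-2": {
        "entities": {"project atlas": "Delayed by two weeks"},
        "relations": [],
    },
}
entities, edges = merge_extractions(extractions)
print(sorted(entities))
print(sorted(entities["project atlas"]["chunks"]))
```

Both mentions of Project Atlas collapse into one node that remembers which chunks it came from. That link from entity to chunks is what lets retrieval later pull in chunks from other documents.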
Phase 2: Graph-augmented retrieval
When you query with Graph RAG, the system:
- Runs standard hybrid search to retrieve top chunks.
- Runs a graph query to find related entities.
- Expands retrieval with chunks linked to those entities.
- Adds graph analysis to the LLM context.
- Generates a response from chunk context and graph context.
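As a rough illustration of the expansion step, the sketch below starts from hybrid-search hits, collects the entities those chunks mention, follows graph edges one hop, and adds the chunks linked to the related entities. The function name, the dictionary-based graph, and the one-hop traversal are all assumptions made for the example. The actual graph query used by the product is not described here.

```python
def graph_augmented_retrieve(hybrid_hits, chunk_entities, entity_neighbors, entity_chunks):
    """Expand hybrid-search hits with chunks linked to related entities (toy version)."""
    # Entities mentioned in the chunks that hybrid search returned.
    seed_entities = set()
    for chunk_id in hybrid_hits:
        seed_entities.update(chunk_entities.get(chunk_id, ()))

    # Assumed traversal: one hop from each seed entity.
    related = set(seed_entities)
    for entity in seed_entities:
        related.update(entity_neighbors.get(entity, ()))

    # Keep hybrid hits first, then append chunks linked to related entities.
    expanded = list(hybrid_hits)
    for entity in sorted(related):
        for chunk_id in entity_chunks.get(entity, ()):
            if chunk_id not in expanded:
                expanded.append(chunk_id)
    return expanded, sorted(related)


# Hypothetical data: hybrid search found only chunk c1.
hybrid_hits = ["c1"]
chunk_entities = {"c1": ["project atlas"]}
entity_neighbors = {"project atlas": ["vendor z"]}
entity_chunks = {"project atlas": ["c1"], "vendor z": ["c7"]}

chunks, related = graph_augmented_retrieve(
    hybrid_hits, chunk_entities, entity_neighbors, entity_chunks
)
print(chunks)   # ['c1', 'c7']
print(related)  # ['project atlas', 'vendor z']
```

Chunk c7 never matched the query directly. It is retrieved only because it is linked to an entity connected to one found in c1.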
Build a knowledge graph
Prerequisites
- You have a Collection with ingested documents.
- Your deployment has at least one LLM for entity extraction.
Steps
- Open the target Collection page.
- Open the action menu (three-dot icon) next to Add Documents in the Documents section.
- Click Build Knowledge Graph.
- Select the LLM for entity extraction.
- Start the build.
You can also build the graph from the Chat side panel. Open the action menu next to Add Documents in the Documents section of the side panel.
If you do not see Build Knowledge Graph in the action menu, verify your role permissions and ensure that at least one extraction LLM is configured and available. If the option is still missing, ask your administrator to enable Knowledge Graph for your deployment version.
Track progress in the notification tray:
- Number of chunks processed.
- Selected LLM.
- Elapsed time.
Entity extraction is the most time-consuming step. Build times increase linearly with the number of chunks. A collection with about 80 chunks typically takes 1-2 minutes.
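For planning purposes, the linear scaling above can be turned into a back-of-the-envelope estimate. The helper below simply extrapolates from the reference point of about 80 chunks in 1-2 minutes. It is a hypothetical helper, and real build times depend on the selected LLM, concurrency settings, and deployment load.

```python
def estimate_build_minutes(num_chunks, ref_chunks=80, ref_minutes=(1.0, 2.0)):
    """Linearly extrapolate a (low, high) build-time range in minutes."""
    scale = num_chunks / ref_chunks
    return ref_minutes[0] * scale, ref_minutes[1] * scale


low, high = estimate_build_minutes(2000)
print(f"Estimated build time: {low:.0f}-{high:.0f} minutes")  # 25-50 minutes
```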
Graph status
After the build, the Collection page and the Chat side panel show the graph status next to the Documents heading:
- Graph Ready (green): The graph is built and current.
- Graph Outdated (yellow): New documents were added after the last build.
- Graph Building (blue): A build is in progress.
- Graph Failed (red): The build failed.
Rebuild vs. Update
- Update Knowledge Graph: Processes only new documents since the last build.
- Rebuild Knowledge Graph: Rebuilds the graph from scratch.
Use Update Knowledge Graph after you add documents.
Use Rebuild Knowledge Graph after you delete or heavily modify documents.
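The guidance above can be written as a small decision helper. The sketch assumes the status strings reported by the Python client ('none', 'building', 'ready', 'failed'). The function name, the `docs_deleted_or_modified` flag, and the choice to rebuild after a failed build are assumptions for illustration.

```python
def choose_graph_action(status, outdated, docs_deleted_or_modified):
    """Return one of 'wait', 'build', 'rebuild', 'update', or 'none'."""
    if status == "building":
        return "wait"      # a build is already in progress
    if status == "none":
        return "build"     # no graph exists yet
    if status == "failed":
        return "rebuild"   # assumption: start over after a failed build
    if docs_deleted_or_modified:
        return "rebuild"   # updates only process new documents
    if outdated:
        return "update"    # new documents were added since the last build
    return "none"          # graph is ready and current


print(choose_graph_action("ready", outdated=True, docs_deleted_or_modified=False))  # update
print(choose_graph_action("ready", outdated=True, docs_deleted_or_modified=True))   # rebuild
```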
Query with Graph RAG
Prerequisites
- The selected Collection has a completed knowledge graph.
- You can open a chat session with that collection.
Steps
- Open a Chat session with a Collection that has a built knowledge graph.
- In the Chat settings, set Generation Approach to Graph RAG.
- Ask your question.
Graph RAG only appears in the Generation Approach dropdown after a knowledge graph has been built for the collection. If you do not see it, build the graph first from the action menu on the Collection page or the Chat side panel.
If a query returns No knowledge graph found for this collection, the graph may have been deleted or may not be ready yet. Return to the collection, check the graph status, and rebuild the graph if needed.
Use Graph RAG for questions that require reasoning across documents:
- "What is the connection between Project X and customer Y?"
- "What incidents were caused by vendor Z?"
- "How much revenue is at risk due to the firmware delay?"
For factual questions from one document, standard RAG usually performs similarly.
Admin configuration
Deployment settings
Administrators can configure Graph RAG with environment variables:
- H2OGPTE_CORE_GRAPH_RAG_LLM (default: auto): Selects the extraction LLM. auto picks the lowest-cost non-reasoning model.
- H2OGPTE_CORE_GRAPH_RAG_COST_CONTROLS (default: {"max_cost_per_million_tokens": 5, "model": null}): Sets automatic model limits and allowlists.
- H2OGPTE_CORE_GRAPH_RAG_MAX_PARALLEL_INSERT (default: 40): Controls chunk parallelism during graph build.
- H2OGPTE_CORE_GRAPH_RAG_LLM_MAX_ASYNC (default: 80): Sets the maximum number of concurrent LLM calls.
- H2OGPTE_CORE_GRAPH_RAG_ENTITY_EXTRACT_MAX_GLEANING (default: 0): Adds extra extraction passes per chunk. 0 disables extra passes.
- H2OGPTE_CORE_GRAPH_RAG_FORCE_LLM_SUMMARY_ON_MERGE (default: 999999): Controls when merge-time LLM summarization runs. Use a very high value to disable it.
Restricting available models
To limit model choices for graph building, set model in H2OGPTE_CORE_GRAPH_RAG_COST_CONTROLS:
H2OGPTE_CORE_GRAPH_RAG_COST_CONTROLS: '{"max_cost_per_million_tokens": 5, "model": ["meta-llama/Llama-3.1-8B-Instruct", "gpt-4.1-nano"]}'
When you set model to a list, users only see those models in the graph build dropdown.
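Because the value is a JSON string, quoting mistakes are easy to make when editing it by hand. One option, shown here as a sketch, is to generate the string with Python's standard json module and paste the result into your deployment configuration. How the variable is actually set depends on your deployment tooling, which is not covered here.

```python
import json

# Same settings as the example above, expressed as a Python dict.
cost_controls = {
    "max_cost_per_million_tokens": 5,
    "model": ["meta-llama/Llama-3.1-8B-Instruct", "gpt-4.1-nano"],
}

value = json.dumps(cost_controls)

# Round-trip check: the string parses back to the same settings.
assert json.loads(value) == cost_controls

print(f"H2OGPTE_CORE_GRAPH_RAG_COST_CONTROLS: '{value}'")
```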
Use Graph RAG with the Python client
Build a knowledge graph
Prerequisites
- You have a valid address and api_key.
- You know the target collection_id.
Steps
from h2ogpte import H2OGPTE

client = H2OGPTE(address="https://...", api_key="...")

# Build the graph
job = client.build_collection_graph(
    collection_id="<collection-id>",
    llm="meta-llama/Llama-3.1-8B-Instruct",  # optional, defaults to auto
    force_rebuild=False,  # True to rebuild from scratch
    timeout=600,
)
print(f"Graph built in {job.duration}")
Parameter guidance:
- Set llm to choose a specific extraction model.
- Set force_rebuild=True to rebuild from scratch.
- Increase timeout for large collections.
Check graph status
Use this call to confirm readiness before querying:
status = client.get_collection_graph_status(collection_id="<collection-id>")
print(f"Status: {status.status}") # 'none', 'building', 'ready', 'failed'
print(f"Built at: {status.built_at}")
print(f"Outdated: {status.outdated}") # True if new docs added since build
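If a build was started elsewhere, for example from the UI, a script may need to wait until the graph is ready. The helper below is a minimal polling sketch built on the status call shown above. The function name `wait_for_graph`, the polling interval, and the decision to raise on 'failed' or 'none' are assumptions, not part of the client library.

```python
import time


def wait_for_graph(client, collection_id, poll_seconds=5.0, max_wait_seconds=600.0):
    """Poll the graph status until it is 'ready'; raise on failure or timeout."""
    deadline = time.monotonic() + max_wait_seconds
    while True:
        status = client.get_collection_graph_status(collection_id=collection_id)
        if status.status == "ready":
            return status
        if status.status == "failed":
            raise RuntimeError(f"Graph build failed for collection {collection_id}")
        if status.status == "none":
            raise RuntimeError(f"No graph exists for collection {collection_id}")
        # Remaining case is 'building': keep waiting until the deadline.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Graph not ready after {max_wait_seconds} seconds")
        time.sleep(poll_seconds)


# Usage, with the client created as in the build example:
# status = wait_for_graph(client, "<collection-id>")
# print(f"Outdated: {status.outdated}")
```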
Query with Graph RAG
Use rag_config={"rag_type": "graph_rag"} to force graph-based retrieval:
chat_session_id = client.create_chat_session(collection_id="<collection-id>")

with client.connect(chat_session_id) as session:
    reply = session.query(
        "What is the connection between Project Atlas and customer retention?",
        rag_config={"rag_type": "graph_rag"},
    )
    print(reply.content)
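A script that runs against many collections may want to request Graph RAG only when the graph is ready, rather than hitting a "No knowledge graph found" error. The sketch below picks the extra keyword arguments for session.query from a status object. The helper name and the fallback to the session's default approach are assumptions. It also assumes that an outdated graph still reports a status of 'ready' with outdated set to True.

```python
def graph_rag_query_kwargs(status):
    """Return (extra kwargs for session.query, optional note) for a graph status."""
    if status.status != "ready":
        # Fall back to whatever generation approach the session uses by default.
        return {}, f"graph status is '{status.status}'; not requesting Graph RAG"
    note = "graph is outdated; new documents are not in the graph" if status.outdated else ""
    return {"rag_config": {"rag_type": "graph_rag"}}, note


# Usage, with client and session created as in the example above:
# status = client.get_collection_graph_status(collection_id="<collection-id>")
# kwargs, note = graph_rag_query_kwargs(status)
# reply = session.query("What incidents were caused by vendor Z?", **kwargs)
```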
Known limitations
- Build time scales with collection size: Build time grows with the number of chunks, so large collections take longer to build.
- Entity merge ambiguity: Unrelated topics can produce incorrect merges for shared names such as "GPU" or "API."
- Model quality: Smaller extraction models can miss entities or relationships.
- Incremental update scope: Updates process new documents only; modifications and deletions require a rebuild.