Version: v1.6.37-dev1 🚧

Key terms

Enterprise h2oGPTe uses several key terms across its documentation. Each of them is explained on this page.

Collection

A Collection is a group of related Documents that lets you aggregate material in one location. You can use Collections to group particular sets of content and later explore each set through Chats (asking questions of a Collection).

Enterprise h2oGPTe supports Retrieval-Augmented Generation (RAG) when getting responses from an LLM: the question sent to the LLM is contextualized with information from documents, audio transcriptions, and other data. Users can create one or more Collections from which they want to get answers or generate new content. When a user interacts with an LLM, the user’s prompt is compared with the documents in the Collection to find similar chunks of information. These chunks are then sent to the LLM.
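The retrieval step described above can be illustrated with a minimal sketch. It is not how Enterprise h2oGPTe is implemented: it assumes a toy bag-of-words cosine similarity in place of a real embedding model, and the function names (`retrieve`, `build_llm_input`) and sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(prompt, chunks, top_k=2):
    """Return up to top_k chunks most similar to the prompt (hypothetical helper)."""
    query = Counter(tokenize(prompt))
    scored = [(cosine_similarity(query, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no tokens with the prompt.
    return [chunk for score, chunk in scored[:top_k] if score > 0]


def build_llm_input(prompt, context_chunks):
    """Combine the retrieved chunks and the prompt into one text for the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Context:\n{context}\n\nQuestion: {prompt}"


# Hypothetical chunks standing in for the contents of a Collection.
chunks = [
    "Invoices must be paid within 30 days of receipt.",
    "The office is closed on public holidays.",
    "Late invoices incur a 2 percent fee.",
]
prompt = "When must invoices be paid?"
top = retrieve(prompt, chunks, top_k=2)
print(build_llm_input(prompt, top))
```

A production system would typically compare embedding vectors rather than raw token counts, but the flow is the same: score chunks against the prompt, keep the closest ones, and pass them to the LLM as context.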

There are many strategies for importing and creating Collections so that you get the best responses for your use case. For more information, see Collections usage overview.

Job

A Job is a single crawling or indexing task, or a batch of such tasks. In particular, the following tasks are referred to as a Job:

Ingest plain text
Ingest a Document from the file system
Ingest from cloud storage
Ingest (add) a Document from upload
Ingest (crawl) a website
Convert files to PDF
Index Document(s)
Update a Collection's statistics
Delete Document(s)
Delete Document(s) from a Collection
Delete Collection(s)
Import a stored Document to a Collection
Import all Documents from one Collection to another Collection
Summarize a Document
Process Document(s)

Document

A Document is a file that you have imported into Enterprise h2oGPTe (for example, a PDF or web page).

Chat

A Chat session is an interaction between you and Enterprise h2oGPTe that consists of a series of prompts and answers.

API Key

An application programming interface (API) key is a unique identifier used to authenticate to the h2oGPTe API.

Extractors

Extractors, defined by JSON schemas, transform unstructured document content into structured, actionable data. With Extractors, you can retrieve information from any document (for example, a CV, invoice, Form 10-K, or scanned image) without complex setup or annotations. Specify the data you need with the JSON schema builder (UI), upload your documents, and receive structured data.
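To make the idea of a schema-defined Extractor concrete, here is a minimal sketch of what such a JSON schema might look like for an invoice, together with a small check of an extracted record against it. The field names, the `missing_or_mistyped` helper, and the sample record are all hypothetical; the exact schema features that Enterprise h2oGPTe accepts are not specified on this page.

```python
import json

# Hypothetical schema describing the fields to pull out of an invoice.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total_amount": {"type": "number"},
        "due_date": {"type": "string"},
    },
    "required": ["invoice_number", "total_amount"],
}

# Map the JSON schema type names used above to Python types.
_JSON_TYPES = {"string": str, "number": (int, float)}


def missing_or_mistyped(record, schema):
    """List required fields that are absent and fields whose type is wrong."""
    problems = []
    for name in schema.get("required", []):
        if name not in record:
            problems.append(f"missing: {name}")
    for name, value in record.items():
        spec = schema["properties"].get(name)
        if spec is None:
            continue  # Field is not described by the schema; ignore it here.
        expected = _JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type: {name}")
    return problems


# Hypothetical structured output for one uploaded invoice.
extracted = {"invoice_number": "INV-001", "total_amount": 1250.0}

print(json.dumps(invoice_schema, indent=2))
print(missing_or_mistyped(extracted, invoice_schema))
```

The schema only declares which fields are wanted and their types; it says nothing about where in the document they appear, which is why no annotations are needed.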

PII Detection

Personally Identifiable Information (PII) detection is the process of recognizing and classifying data within a dataset that can be used to identify a specific individual. Sensitive PII includes information like social security numbers, credit card numbers, bank account numbers, and passport numbers. Non-sensitive PII includes information like names, addresses, and phone numbers.
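As a rough illustration of recognizing and classifying PII, the sketch below flags two of the sensitive types mentioned above using regular expressions plus a Luhn checksum. It is a simplified, pattern-based example and does not represent the detection method used by Enterprise h2oGPTe; the patterns assume US-style social security numbers and 13 to 19 digit card numbers, and `detect_pii` is a hypothetical helper.

```python
import re

# Assumed formats: US-style SSN (###-##-####) and 13 to 19 digit card numbers
# whose digits may be separated by single spaces or hyphens.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # Double every second digit, counting from the right.
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_pii(text):
    """Return a list of (label, matched_text) pairs found in the text."""
    findings = []
    for match in SSN_PATTERN.finditer(text):
        findings.append(("ssn", match.group()))
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        # The checksum filters out digit runs that are not card numbers.
        if luhn_valid(digits):
            findings.append(("credit_card", match.group()))
    return findings


sample = "SSN 123-45-6789, card 4111 1111 1111 1111."
print(detect_pii(sample))
```

Pattern matching alone misses context-dependent PII such as names and addresses, which generally calls for model-based detection.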

Evaluators

Evaluators are tools and metrics used to assess the performance and quality of large language models (LLMs) and Retrieval-Augmented Generation (RAG) models. They also evaluate a Collection's performance, reliability, security, fairness, and effectiveness.

