Version: Next

Key terms

h2oGPTe uses several key terms across its documentation. This page explains each of them.

Collection

A Collection refers to a group of related documents.

Enterprise h2oGPTe supports Retrieval-Augmented Generation (RAG): when you request a response from an LLM, the question is contextualized with information from documents, audio transcriptions, and other data. Users can create one or more Collections of data that they want to get answers about or generate new content from. When a user interacts with an LLM, the user’s prompt is compared with the documents in the Collection to find similar chunks of information. These chunks are then sent to the LLM together with the prompt.
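
The retrieval step described above can be sketched in a few lines of Python. This is a minimal illustration only, not how Enterprise h2oGPTe is implemented: it assumes a simple word-count cosine similarity in place of whatever similarity search the product uses, and the names tokenize, cosine_similarity, retrieve, and build_llm_input are hypothetical helpers introduced for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(prompt, chunks, top_k=2):
    """Return up to top_k chunks that are most similar to the prompt."""
    prompt_vec = Counter(tokenize(prompt))
    scored = [(cosine_similarity(prompt_vec, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no words with the prompt at all.
    return [chunk for score, chunk in scored[:top_k] if score > 0]


def build_llm_input(prompt, context_chunks):
    """Combine the retrieved chunks and the prompt into one text for the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Context:\n{context}\n\nQuestion: {prompt}"


# Hypothetical chunks standing in for a Collection's indexed documents.
collection_chunks = [
    "The refund policy allows returns within 30 days.",
    "Our office is closed on public holidays.",
    "Shipping takes five business days.",
]

question = "What is the refund policy?"
best_chunks = retrieve(question, collection_chunks, top_k=1)
llm_input = build_llm_input(question, best_chunks)
print(llm_input)
```

A production system would typically replace the word-count vectors with learned embeddings, but the flow is the same: score chunks against the prompt, keep the closest ones, and pass them to the LLM as context.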

There are many strategies for importing and creating collections so that you get the best responses for your use case. For more information, see Collections usage overview.

Job

A Job in Enterprise h2oGPTe represents a single crawling or indexing task, or a batch of such tasks. In particular, the following tasks are referred to as a Job:

  • Ingest plain text
  • Ingest a Document from the file system
  • Ingest from cloud storage
  • Ingest (add) a Document from upload
  • Ingest (crawl) a website
  • Convert files to PDF
  • Index Document(s)
  • Update a Collection's statistics
  • Delete Document(s)
  • Delete Document(s) from a Collection
  • Delete Collection(s)
  • Import a stored Document to a Collection
  • Import all Document(s) from a Collection to another Collection
  • Summarize a Document
  • Process Document(s)

Document

A Document refers to a file that you have imported into Enterprise h2oGPTe (for example, a PDF or web page).

Chat

A Chat session is an interaction between you and Enterprise h2oGPTe that consists of a series of prompts and answers.

API Key

An application programming interface (API) key is a unique identifier used to authenticate requests to the h2oGPTe API.

PII Detection

Personally Identifiable Information (PII) detection is the process of recognizing and classifying data within a dataset that can be used to identify a specific individual. Sensitive PII includes information like social security numbers, credit card numbers, bank account numbers, and passport numbers. Non-sensitive PII includes information like names, addresses, and phone numbers.
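
As a rough illustration of recognizing and classifying PII, the sketch below scans text with regular expressions and labels each match as sensitive or non-sensitive. It is not the detection method used by Enterprise h2oGPTe, which this page does not describe. The patterns assume US-style formats, the Luhn checksum is an added filter for credit card candidates, and the names PATTERNS, luhn_valid, and detect_pii are hypothetical.

```python
import re

# Hypothetical patterns: (category, sensitivity, compiled regex).
# They assume US-style formatting and will miss many real-world variants.
PATTERNS = [
    ("us_ssn", "sensitive", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("credit_card", "sensitive", re.compile(r"\b\d(?:[ -]?\d){12,15}\b")),
    ("phone_number", "non-sensitive", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),
]


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_pii(text):
    """Return (category, sensitivity, matched_text) tuples in order of appearance."""
    found = []
    for category, sensitivity, pattern in PATTERNS:
        for match in pattern.finditer(text):
            value = match.group()
            if category == "credit_card":
                digits = re.sub(r"\D", "", value)
                if not luhn_valid(digits):
                    continue  # digit run that is not a plausible card number
            found.append((match.start(), category, sensitivity, value))
    found.sort(key=lambda item: item[0])
    return [(category, sensitivity, value) for _, category, sensitivity, value in found]


sample = "Call 555-123-4567 about card 4111 1111 1111 1111 or SSN 123-45-6789."
results = detect_pii(sample)
for category, sensitivity, value in results:
    print(f"{category:<13} {sensitivity:<14} {value}")
```

Pattern matching of this kind only covers identifiers with a fixed shape. Names and addresses, which the definition above also counts as PII, generally need other techniques such as named-entity recognition.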

