
Access a Collection's settings


The settings of a Collection let you modify its name, description, prompt settings, and access permissions. Updating the name and description gives a more accurate and detailed overview of the Collection's content, and adjusting the prompt settings customizes the prompts the Collection uses.

To invite other authenticated users to access the Collection, select their email addresses from the list of available users. Selected users are granted access to the Collection and can view and interact with its content. Only authenticated users can be invited; unauthenticated users cannot access the Collection even if they have the link.


To update the settings of a Collection, follow these steps:


Note: You can only update the settings of your own Collections.

  1. In the Enterprise h2oGPTe navigation menu, click Collections.
  2. In the My Collections tab, select the Collection whose settings you want to edit.
  3. Click Settings.
  4. In the Settings list, select Settings.
  5. Make the changes you want.

    For detailed information about each setting, see Collection settings.

  6. Click Update.

Collection settings

The Collection settings section includes the following settings:

Collection name

This setting defines the name of the Collection.


Description

This setting defines the description of the Collection.

Embedding model

This setting displays the embedding model you selected for the Collection when it was created.

Share with

This setting defines which users can access the Collection. You can choose the email addresses of the other authenticated users with whom you want to share the Collection.

Prompt template to use

This setting lets you choose a prompt template to customize the prompts used within the Collection. You can create your own prompt template on the Prompts page and apply it to your Collection.


Reset

The Reset setting restores the default prompt settings. To restore them, follow these steps:

  1. Click Reset.
  2. Click Update.

Generation approach (RAG type to use)

This setting defines the generation approach for responses. Enterprise h2oGPTe offers the following methods for generating responses to users' queries (Chats):

  • LLM Only (no RAG)

    This option generates a response based solely on the Large Language Model (LLM), without considering supporting Document contexts from the Collection.

  • RAG (Retrieval Augmented Generation)

    This option uses a neural/lexical hybrid search to find contexts in the Collection that are relevant to the user's query, and generates a response from them. It is applicable when the prompt is easily understood and the context contains enough information to produce a correct answer.

    RAG first performs a vector search for similar chunks, limited to a fixed number of chunks sorted by a distance metric. By default, Enterprise h2oGPTe chooses the top 25 chunks by lexical distance and the top 25 chunks by neural distance. The distance metric is calculated from the cross-entropy loss of the BAAI/bge-reranker-large model. These chunks are passed to the selected LLM to answer the user's query. Note that Enterprise h2oGPTe lets you view the exact prompt passed to the LLM.
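
The retrieval step described above can be sketched in a few lines of Python. This is a minimal illustration, not Enterprise h2oGPTe's implementation: `lexical_score` and `neural_score` are hypothetical toy stand-ins (token overlap and bag-of-words cosine similarity) for the real lexical search and embedding model, the reranker is omitted, and merging the two top-k lists by union is an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_score(query, chunk):
    """Toy lexical relevance: how many query tokens occur in the chunk."""
    chunk_tokens = set(tokenize(chunk))
    return sum(1 for tok in tokenize(query) if tok in chunk_tokens)


def neural_score(query, chunk):
    """Toy stand-in for embedding similarity: cosine of token-count vectors."""
    q, c = Counter(tokenize(query)), Counter(tokenize(chunk))
    dot = sum(q[t] * c[t] for t in q)
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    c_norm = math.sqrt(sum(v * v for v in c.values()))
    return dot / (q_norm * c_norm) if q_norm and c_norm else 0.0


def hybrid_retrieve(query, chunks, k_lexical=25, k_neural=25):
    """Return the indices of chunks in either top-k list, in document order."""
    indices = range(len(chunks))
    by_lexical = sorted(indices, key=lambda i: lexical_score(query, chunks[i]), reverse=True)
    by_neural = sorted(indices, key=lambda i: neural_score(query, chunks[i]), reverse=True)
    # Assumption: the two candidate lists are merged by set union.
    return sorted(set(by_lexical[:k_lexical]) | set(by_neural[:k_neural]))
```

With the default values of 25 and 25, this sketch passes at most 50 distinct chunks on to the LLM, and fewer when the two lists overlap.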

  • HyDE RAG (Hypothetical Document Embeddings)

    This option extends RAG with neural/lexical hybrid search by using both the user's query and an initial LLM response to find relevant contexts in the Collection and generate a response. It requires two LLM calls. It is applicable when the prompt is somewhat ambiguous or the context does not contain enough information to produce a correct answer.

    HyDE (Hypothetical Document Embeddings) is essentially the same as RAG except that it does not simply search for the embeddings with the smallest distance to the query. Instead, it first asks an LLM to try to answer the question. It then uses the question and the hypothetical answer to search for the nearest chunks.

    Example question: What are the implications of high interest rate?

    • RAG: Searches for chunks in the document with a small distance to the embedding of the question: "What are the implications of high interest rate?"

    • HyDE RAG:

      1. Asks an LLM: "What are the implications of high interest rate?"
      2. LLM answers: "High interest rates can have several implications, including: higher borrowing cost, slower economic growth, increased savings rate, higher returns on investment, exchange rate fluctuation, ..."
      3. RAG searches for chunks in the document with a small distance to the embedding of the question AND the answer from Step 2. This effectively increases the number of potentially relevant chunks.
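
The difference between the two searches in the example above can be sketched as follows. This is a toy illustration under stated assumptions: token overlap stands in for embedding distance, `fake_llm` is a hypothetical stub that returns a canned hypothetical answer, and concatenating the question with the answer is only one possible way to search with both.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(search_text, chunk):
    """Toy relevance score: number of distinct tokens shared with the chunk."""
    return len(tokens(search_text) & tokens(chunk))


def retrieve(search_text, chunks, k):
    """Return up to k chunks with the highest non-zero overlap score."""
    ranked = sorted(chunks, key=lambda c: overlap(search_text, c), reverse=True)
    return [c for c in ranked[:k] if overlap(search_text, c) > 0]


def hyde_retrieve(question, chunks, k, llm):
    """Search with the question plus a hypothetical answer from the LLM."""
    hypothetical_answer = llm(question)  # first LLM call, no context supplied
    # Assumption: question and answer are combined by simple concatenation.
    return retrieve(question + " " + hypothetical_answer, chunks, k)


def fake_llm(prompt):
    """Hypothetical stub that stands in for a real LLM call."""
    return "Higher borrowing cost, slower economic growth, increased savings rate."
```

In this sketch, a chunk such as "Borrowing cost rose for households." shares no words with the question itself, so plain retrieval misses it, while the hypothetical answer supplies the vocabulary needed to find it.
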
  • HyDE RAG+ (Combined HyDE+RAG)

    This option uses RAG with neural/lexical hybrid search, searching with both the user's query and the HyDE RAG response to find relevant contexts in the Collection and generate a response. It requires three LLM calls. It is applicable when the prompt is very ambiguous, or when the context contains conflicting information that makes a correct answer very difficult to produce.
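
One way to read the three LLM calls is as a chain: a hypothetical answer, a HyDE RAG response, and a final response. The sketch below follows that reading, which is an interpretation of the description above rather than a documented call sequence. `ScriptedLLM`, `build_prompt`, and the token-overlap `retrieve` function are hypothetical helpers introduced only for illustration.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(search_text, chunks, k):
    """Toy search: up to k chunks sharing the most tokens with search_text."""
    def score(chunk):
        return len(tokens(search_text) & tokens(chunk))
    ranked = sorted(chunks, key=score, reverse=True)
    return [c for c in ranked[:k] if score(c) > 0]


def build_prompt(question, contexts):
    """Hypothetical prompt layout: retrieved contexts, then the question."""
    return "Context:\n" + "\n".join(contexts) + "\n\nQuestion: " + question


class ScriptedLLM:
    """Stub LLM that replays canned replies and records every prompt."""

    def __init__(self, replies):
        self.replies = list(replies)
        self.prompts = []

    def __call__(self, prompt):
        self.prompts.append(prompt)
        return self.replies[len(self.prompts) - 1]


def hyde_rag_plus(question, chunks, k, llm):
    # Call 1: hypothetical answer, produced without any context.
    hypothetical = llm(question)
    contexts = retrieve(question + " " + hypothetical, chunks, k)
    # Call 2: HyDE RAG response, grounded in the first retrieval.
    hyde_response = llm(build_prompt(question, contexts))
    # Second retrieval searches with the query and the HyDE RAG response.
    contexts = retrieve(question + " " + hyde_response, chunks, k)
    # Call 3: final response from the refined contexts.
    return llm(build_prompt(question, contexts))
```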

  • RAG+ (RAG without LLM context limit)

    This option uses RAG (Retrieval Augmented Generation) with neural/lexical hybrid search, using the user's query to find relevant contexts in the Collection and generate a response. It uses recursive summarization to overcome the LLM's context limitations, so the process requires multiple LLM calls. It is applicable when the prompt asks for a summary of the context or for a lengthy answer, such as a procedure, that might require processing multiple large pieces of information.

    The vector search is repeated as in RAG, but this time k neighboring chunks are also added to the retrieved chunks. The returned chunks are then sorted in the order they appear in the document so that neighboring chunks stay together. The expanded set of chunks is essentially a filtered sub-document of the original document that is more pertinent to the user's question. Enterprise h2oGPTe then summarizes this sub-document while trying to answer the user's question. This step uses the summary API, which applies the prompt to each context-filling chunk of the sub-document. It then joins two or more of the resulting answers and applies the same prompt again, reducing recursively until only one answer remains.

    The benefit of this additional complexity is that, if the answer is spread throughout the document, this mode can include more information from the original document, as well as neighboring chunks for additional context.
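
The neighbor expansion and the recursive reduction described for RAG+ can be sketched as below. This is a simplified illustration, not the summary API itself. The function names are hypothetical, the context limit is approximated by a character `budget`, k neighbors are taken on each side of every hit, and `llm` is any callable that takes a question and a context string.

```python
def expand_with_neighbors(hit_indices, num_chunks, k):
    """Add up to k neighbors on each side of every hit, in document order."""
    expanded = set()
    for i in hit_indices:
        for j in range(i - k, i + k + 1):
            if 0 <= j < num_chunks:
                expanded.add(j)
    return sorted(expanded)


def pack(texts, budget):
    """Group consecutive texts into batches of roughly `budget` characters.

    Simplification: a batch is only closed once it holds at least two texts,
    so every reduction round shrinks the list and the recursion terminates.
    """
    batches, current, size = [], [], 0
    for text in texts:
        if len(current) >= 2 and size + len(text) > budget:
            batches.append(current)
            current, size = [], 0
        current.append(text)
        size += len(text)
    if current:
        batches.append(current)
    return batches


def recursive_answer(question, texts, llm, budget):
    """Apply the same prompt to each batch, then to the joined answers."""
    if not texts:
        return ""
    while True:
        answers = [llm(question, "\n".join(batch)) for batch in pack(texts, budget)]
        if len(answers) == 1:
            return answers[0]
        texts = answers  # reduce again until a single answer remains
```

In use, the indices returned by the vector search would be passed to `expand_with_neighbors`, the corresponding chunk texts would be collected in that order, and the resulting list would be handed to `recursive_answer`.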