Version: Next

Access a Collection's settings

Overview

A Collection's settings let you modify various aspects of the Collection, including its name, description, prompt settings, and access permissions.

Instructions

To update the settings of a Collection, follow these steps:

note

You can only update the settings of your Collections.

  1. In the Enterprise h2oGPTe navigation menu, click Collections.
  2. Click the My Collections tab.
  3. Select the relevant Collection to edit its settings.
  4. Click Edit collection.
  5. In the Edit collection list, select Settings.
  6. In the Collection Settings section, perform any configurations you want.
    note

    For detailed information about each setting, see Collection settings.

  7. Click Update.

Collection settings

The Collection settings section includes the following settings:

Collection name

This setting defines the name of the Collection.

Description

This setting defines the description of the Collection.

Embedding model

This setting displays the embedding model you selected for the Collection when it was created.

Tokens Per Chunk

This setting displays the approximate size of the text chunks (in tokens) for the Collection when it was created.

PII Detection

This setting displays the PII Detection you selected for the Collection when it was created.

Share with

This setting defines the authenticated users that can access the Collection. You can choose the email addresses of other authenticated users with whom you wish to share the Collection.

note

Only authenticated users can be invited to access a Collection, and unauthenticated users cannot access the Collection even if they have the URL to the Collection.

Default prompt template

This setting enables you to choose a prompt template to customize the prompts utilized within the Collection. You can create your prompt template on the Prompts page and apply it to your Collection.

Reset

The Reset setting restores the Default prompt template setting. To restore the default prompt settings, follow these steps:

  1. Click Reset.
  2. Click Update.

Default generation approach (RAG type)

This setting defines the generation approach for responses. Enterprise h2oGPTe offers the following methods to generate responses to a user's queries (Chats):

  • Automatic

    This option automatically selects the generation approach. The LLM Only (no RAG) type is not considered for Chats with Collections.

  • LLM Only

    This option generates a response to answer the user's query solely based on the Large Language Model (LLM) without considering supporting Document contexts from the Collection.

  • RAG (Retrieval Augmented Generation)

    This option uses a neural/lexical hybrid search approach to find relevant contexts from the Collection based on the user's query for generating a response. It is applicable when the prompt is easily understood and the context contains enough information to come up with a correct answer.

    RAG first performs a vector search for similar chunks, sorted by a distance metric and limited to a fixed number of chunks. By default, Enterprise h2oGPTe chooses the top 25 chunks using lexical distance and the top 25 using neural distance. The distance metric is calculated by the cross entropy loss from the BAAI/bge-reranker-large model. These chunks are passed to the selected LLM to answer the user's query. Note that Enterprise h2oGPTe lets you view the exact prompt passed to the LLM.

  • LLM Only + RAG composite

    This option extends RAG with neural/lexical hybrid search by using the user's query and the LLM response to find relevant contexts from the Collection to generate a response. It requires two LLM calls. It is applicable when the prompt is somewhat ambiguous or the context does not contain enough information to come up with a correct answer.

    HyDE (Hypothetical Document Embeddings) is essentially the same as RAG except that it does not simply search for the embeddings with the smallest distance to the query. Instead, it first asks an LLM to try to answer the question. It then uses the question and the hypothetical answer to search for the nearest chunks.

    Example question: What are the implications of high interest rate?

    • RAG: Searches for chunks in the document with a small distance to the embedding of the question: "What are the implications of high interest rate?"

    • LLM Only + RAG composite:

      1. Asks an LLM: "What are the implications of high interest rate?"
      2. LLM answers: "High interest rates can have several implications, including: higher borrowing cost, slower economic growth, increased savings rate, higher returns on investment, exchange rate fluctuation, ..."
      3. RAG searches for chunks in the document with a small distance to the embedding of the question AND the answer from step 2. This effectively increases the number of potentially relevant chunks.
  • HyDE + RAG composite

    This option uses RAG with neural/lexical hybrid search by using both the user's query and the HyDE RAG response to find relevant contexts from the Collection to generate a response. It requires three LLM calls. It is applicable when the prompt is very ambiguous or the context contains conflicting information, making it very difficult to come up with a correct answer.

  • Summary RAG

    This option uses RAG (Retrieval Augmented Generation) with neural/lexical hybrid search using the user's query to find relevant contexts from the Collection to generate a response. It uses the recursive summarization technique to overcome the LLM's context limitations. The process requires multiple LLM calls. It is applicable when the prompt asks for a summary of the context or a lengthy answer, such as a procedure that might require multiple large pieces of information to process.

    The vector search is repeated as in RAG, but this time k neighboring chunks are added to the retrieved chunks. These returned chunks are then sorted in the order they appear in the document so that neighboring chunks stay together. The expanded set of chunks is essentially a filtered sub-document of the original document, but more pertinent to the user's question. Enterprise h2oGPTe then summarizes this sub-document while trying to answer the user's question. This step uses the summary API, which applies the prompt to each context-filling chunk of the sub-document. It then joins two or more of the resulting answers and applies the same prompt again, recursively reducing until only one answer remains.

    The benefit of this additional complexity is that if the answer is spread throughout the document, this mode can include more information from the original document as well as neighboring chunks for additional context.

  • All Data RAG

    This option is similar to Summary RAG, but includes all document chunks, no matter how large the Collection. It uses the recursive summarization technique to overcome the LLM's context limitations. The process requires multiple LLM calls and can be very computationally expensive, but it guarantees that no part of the document is excluded.
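
The difference between RAG and LLM Only + RAG composite described above can be illustrated with a minimal sketch. This is not Enterprise h2oGPTe's implementation: the scoring functions are toy stand-ins (word overlap for lexical search, character-trigram cosine for neural search), and the names `hybrid_retrieve`, `hyde_retrieve`, and the `llm` callable are hypothetical. Only the top-25-plus-top-25 defaults and the "search with the question and the hypothetical answer" idea come from the text.

```python
from collections import Counter
from math import sqrt
from typing import Callable, List


def lexical_score(query: str, chunk: str) -> float:
    """Toy lexical similarity: fraction of query words found in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & chunk_words) / len(query_words)


def neural_score(query: str, chunk: str) -> float:
    """Stand-in for embedding similarity: cosine over character trigram counts."""
    def trigrams(text: str) -> Counter:
        text = text.lower()
        return Counter(text[i:i + 3] for i in range(len(text) - 2))

    a, b = trigrams(query), trigrams(chunk)
    dot = sum(a[g] * b[g] for g in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: List[str], scorer: Callable[[str, str], float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring chunks."""
    ranked = sorted(range(len(chunks)), key=lambda i: scorer(query, chunks[i]), reverse=True)
    return ranked[:k]


def hybrid_retrieve(query: str, chunks: List[str], k_lexical: int = 25, k_neural: int = 25) -> List[int]:
    """Union of the lexical and neural top-k results, without duplicates."""
    merged: List[int] = []
    candidates = top_k(query, chunks, lexical_score, k_lexical) + top_k(query, chunks, neural_score, k_neural)
    for idx in candidates:
        if idx not in merged:
            merged.append(idx)
    return merged


def hyde_retrieve(query: str, chunks: List[str], llm: Callable[[str], str],
                  k_lexical: int = 25, k_neural: int = 25) -> List[int]:
    """Ask the LLM first, then search with the question AND its hypothetical answer."""
    hypothetical_answer = llm(query)
    expanded_query = f"{query} {hypothetical_answer}"
    return hybrid_retrieve(expanded_query, chunks, k_lexical, k_neural)
```

In this sketch, the hypothetical answer widens the search query, so chunks that match the answer's vocabulary but not the question's wording can still be retrieved. A real system would replace both scoring functions with an actual lexical index and embedding model.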

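The neighbor expansion and recursive reduction used by Summary RAG and All Data RAG can be sketched as follows. This is an illustration under stated assumptions, not the summary API itself: `expand_with_neighbors`, `recursive_summarize`, the `summarize` callable, and the `group_size` parameter are hypothetical names, and taking k neighbors on each side of a retrieved chunk is one possible reading of "k neighboring chunks".

```python
from typing import Callable, List


def expand_with_neighbors(hits: List[int], k: int, n_chunks: int) -> List[int]:
    """Add up to k neighbors on each side of every retrieved chunk index.

    The result is sorted in document order so neighboring chunks stay together.
    """
    expanded = set()
    for idx in hits:
        first = max(0, idx - k)
        last = min(n_chunks - 1, idx + k)
        expanded.update(range(first, last + 1))
    return sorted(expanded)


def recursive_summarize(texts: List[str], summarize: Callable[[str], str], group_size: int = 2) -> str:
    """Apply `summarize` to every text, then join and re-summarize until one answer remains."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2 so each pass reduces the answer count")
    if not texts:
        return ""
    # First pass: one answer per context-filling chunk.
    answers = [summarize(text) for text in texts]
    # Reduction passes: join neighboring answers and apply the same prompt again.
    while len(answers) > 1:
        grouped = ["\n".join(answers[i:i + group_size]) for i in range(0, len(answers), group_size)]
        answers = [summarize(group) for group in grouped]
    return answers[0]
```

Under these assumptions, Summary RAG would pass only the expanded sub-document to `recursive_summarize`, while All Data RAG would pass every chunk, which explains why the latter needs many more LLM calls.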
