Version: v1.1.7


h2oGPTe uses several key terms across its documentation; each is explained in the following sections.


GPT (Generative Pre-Trained Transformer)

GPT, short for Generative Pre-Trained Transformer, is a language model that uses the transformer architecture to generate human-like text. It is trained on vast amounts of unlabeled text data from the internet, enabling it to produce coherent and contextually relevant text. Unlike rule-based systems, GPT learns patterns and structures from text data to generate its responses.
For more information, see GPT (Generative Pre-Trained Transformer).


Retrieval-Augmented Generation (RAG)

Retrieval-augmented generation (RAG) is an AI framework for improving the quality of responses generated by Large Language Models (LLMs) by grounding the model in external sources of knowledge. RAG-equipped chatbots draw their information from a variety of sources, including databases, documents, and the internet, to provide accurate and contextually relevant responses. This is particularly useful when users have complex or multi-step queries, and it helps keep customer-facing chatbots grounded in up-to-date company knowledge.
For more information, see Boosting LLMs to New Heights with Retrieval Augmented Generation.
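The retrieve-then-generate pattern can be sketched in a few lines. This is a minimal illustration only: retrieval here is naive keyword overlap, whereas real RAG systems use vector embeddings, and the function names are hypothetical, not h2oGPTe APIs.

```python
# Minimal RAG sketch: retrieve the most relevant passages, then
# ground the prompt in them before sending it to an LLM.
def retrieve(query, documents, top_k=2):
    """Rank documents by how many query words they share (toy scorer)."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query, documents):
    """Ground the question in the retrieved context."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "The support line is open 9am-5pm on weekdays.",
    "Refunds are processed within 14 days.",
    "Our headquarters are in Mountain View.",
]
prompt = build_prompt("When is the support line open?", docs)
```

The resulting prompt contains the retrieved passages, so the LLM answers from the supplied knowledge rather than from its training data alone.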


Large Language Model (LLM)

A Large Language Model (LLM) is a type of AI model that uses deep learning techniques and massive training datasets to analyze and generate human-like language. For example, many AI chatbots and AI search engines are powered by LLMs.

Generally speaking, LLMs can be characterized by the following parameters:

  • Size of the training dataset
  • Cost of training (computational power)
  • Size of the model (parameters)
  • Performance after training (or how well the model is able to respond to a particular question)

LLM Prompt

A Large Language Model (LLM) Prompt is a question or request you send to an LLM to generate a desired response. This can be a question you want the LLM to answer or a task you want it to complete. The goal of using an LLM Prompt is to elicit a specific response from the model, whether it be a piece of information, a summary, or a creative work.
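In practice, prompts are often built from templates so the intent is explicit and reusable. The template below is illustrative only, not an h2oGPTe prompt format:

```python
# A prompt is just text sent to the model; a template makes the
# instruction and the variable input explicit.
SUMMARY_PROMPT = (
    "Summarize the following document in two sentences, "
    "focusing on the main findings:\n\n{document}"
)

document = "RAG grounds LLM answers in external documents to reduce errors."
prompt = SUMMARY_PROMPT.format(document=document)
```

The same template can then be reused across many documents by substituting a different `document` value each time.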

Transformer Neural Networks

Neural networks solve a wide range of machine learning problems, and choosing the right architecture for a task matters. Recurrent neural networks (RNNs) process a sequence one step at a time, using what they have already seen to inform later predictions. Unlike RNNs, Transformer Neural Networks have no built-in concept of timesteps: they can process all inputs of a sequence at once, making them a more efficient, more parallelizable way to process data.
For more information, see Transformer Architecture.
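The mechanism that lets a transformer look at every position at once is scaled dot-product self-attention. The sketch below implements it in pure Python for tiny vectors; in a real transformer the queries, keys, and values come from learned projections, whereas here the inputs stand in for all three:

```python
import math

# Scaled dot-product self-attention over a tiny sequence: every
# position attends to every other position simultaneously, which is
# why transformers parallelize where step-by-step RNNs cannot.
def softmax(scores):
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """seq: list of equal-length vectors; each output is a weighted
    mix of all inputs, weighted by query-key similarity."""
    d = len(seq[0])
    outputs = []
    for query in seq:
        # Dot-product similarity to every key, scaled by sqrt(d).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in seq]
        weights = softmax(scores)  # attention weights sum to 1
        outputs.append([sum(w * v[i] for w, v in zip(weights, seq))
                        for i in range(d)])
    return outputs

out = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Because each output row depends only on dot products computed against the whole sequence, all rows can be computed in parallel.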


Fine-Tuning

Fine-Tuning refers to the process of taking a pre-trained language model and further training it on a specific task or domain to improve its performance on that task. It is an important technique used to adapt Large Language Models (LLMs) to specific tasks and domains.
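The core idea, continuing training from existing weights rather than starting from scratch, can be shown on a toy model. This is a deliberately tiny illustration: a one-parameter-pair linear model trained by gradient descent, standing in for an LLM with billions of parameters:

```python
# Toy fine-tuning: start from "pre-trained" weights and continue
# gradient descent on a small task-specific dataset.
def fine_tune(weight, bias, examples, lr=0.1, epochs=50):
    """Continue training a 1-D linear model y = weight*x + bias."""
    for _ in range(epochs):
        for x, y in examples:
            error = (weight * x + bias) - y
            weight -= lr * error * x  # gradient of squared error w.r.t. weight
            bias -= lr * error        # gradient w.r.t. bias
    return weight, bias

# "Pre-trained" parameters (think: learned on a broad corpus)...
w0, b0 = 1.0, 0.0
# ...adapted to a narrow task where the true relation is y = 2x + 1.
task_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = fine_tune(w0, b0, task_data)
```

Starting from pre-trained values means far fewer task-specific examples and updates are needed than training from random initialization, which is exactly the appeal of fine-tuning LLMs.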


Self-Reflection

In h2oGPTe, Self-Reflection sends the question, the provided context, and the generated answer to another Large Language Model (LLM), which reflects on the quality of that answer. It can be used to evaluate the first LLM's performance.
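The pattern can be sketched as a second model grading the first model's answer against the question and context. `call_llm` below is a hypothetical stand-in for any LLM client, not an h2oGPTe API, and it returns a canned critique so the sketch runs on its own:

```python
# Self-reflection sketch: a second LLM critiques the first LLM's
# answer, given the same question and context.
def call_llm(prompt):
    """Placeholder for a real LLM call; returns a canned critique."""
    return "The answer is supported by the context. Score: 9/10."

def self_reflect(question, context, answer):
    prompt = (
        "You are grading another model's answer.\n"
        f"Question: {question}\n"
        f"Context: {context}\n"
        f"Answer: {answer}\n"
        "Is the answer correct and grounded in the context? "
        "Reply with a critique and a score out of 10."
    )
    return call_llm(prompt)

critique = self_reflect(
    "When are refunds processed?",
    "Refunds are processed within 14 days.",
    "Within 14 days.",
)
```

In a production system, the returned critique or score could be logged as an automatic quality signal alongside each answer.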