Run an Evaluator on a Chat
Overview
Enterprise h2oGPTe offers several Evaluators to assess the quality and safety of large language model (LLM) responses during Chat sessions, leveraging the same robust Evaluators in H2O Eval Studio. To learn about the available Evaluators for a Chat response, see Chat Evaluators.
Instructions
To run an Evaluator on a Chat (LLM) response, follow these steps:
- In the Enterprise h2oGPTe navigation menu, click Chats.
- In the Recent Chats table, click Preview for the Chat you want to evaluate.
- Locate the Chat (LLM) response you want to evaluate, then click Evaluate.
- In the Evaluator list, select an Evaluator.
Note: To learn about each available Evaluator for a Chat response, see Chat Evaluators.
- Click Evaluate.
Note: Enterprise h2oGPTe displays the evaluation result in the Eval tab, but it is not saved; if you navigate away, the result is lost.
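The steps above cover the UI flow. For scripted workflows, the minimal sketch below shows how a Chat (LLM) response might be fetched with the h2ogpte Python client so its text can be evaluated. It assumes a reachable Enterprise h2oGPTe server and a valid API key; verify the method names against your installed client version, and note that the evaluator call itself is hypothetical, since the UI flow above is the documented path.

```python
# Minimal sketch: fetching a Chat (LLM) response with the h2ogpte Python
# client. Assumes a reachable Enterprise h2oGPTe server and a valid API key;
# verify method names against your installed client version.
from h2ogpte import H2OGPTE

client = H2OGPTE(
    address="https://h2ogpte.example.com",  # hypothetical server URL
    api_key="sk-...",                       # your API key
)

# List recent Chat sessions, then pull the messages from one of them.
sessions = client.list_recent_chat_sessions(offset=0, limit=10)
messages = client.list_chat_messages(sessions[0].id, offset=0, limit=100)
llm_response = messages[-1]  # the reply you would click Evaluate on in the UI
print(llm_response.content)

# Hypothetical evaluator call -- not a documented client method; shown only
# to indicate where an evaluation would fit in a scripted workflow:
# result = client.run_evaluator(llm_response.id, evaluator="toxicity")
```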
Chat Evaluators
This section lists all available Evaluators for a Chat (LLM) response.
Toxicity
This Evaluator determines whether the LLM's response contains harmful, offensive, or abusive language that could negatively impact users or violate platform guidelines. To learn more about this Evaluator, see Toxicity Evaluator.
Hallucination
This Evaluator identifies whether the LLM's response includes fabricated or inaccurate information that doesn't align with the provided context or factual data. To learn more about this Evaluator, see Hallucination Evaluator.
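As rough intuition for what this check targets, the toy function below scores how much of a response's vocabulary appears in the source context; low overlap can hint at fabricated content. This is a generic illustration only, not the Hallucination Evaluator's actual method, which is considerably more sophisticated.

```python
def grounding_ratio(response: str, context: str) -> float:
    """Toy grounding score: fraction of response words found in the context.

    Illustrative only -- the Hallucination Evaluator uses far more
    sophisticated techniques than word overlap.
    """
    response_words = set(response.lower().split())
    context_words = set(context.lower().split())
    if not response_words:
        return 1.0
    return len(response_words & context_words) / len(response_words)

context = "the quarterly report shows revenue grew 12 percent in Q3"
print(grounding_ratio("revenue grew 12 percent in Q3", context))          # 1.0
print(grounding_ratio("the company filed for bankruptcy in Q3", context)) # ~0.43
```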
Personally Identifiable Information (PII) leakage
This Evaluator checks if the LLM's response inadvertently reveals sensitive personal data, such as names, addresses, phone numbers, or other details that could be used to identify an individual. To learn more about this Evaluator, see PII Leakage Evaluator.
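For intuition, the toy check below flags a few common PII shapes with regular expressions. This is a generic illustration only, not how the PII Leakage Evaluator works internally; production-grade PII detection relies on far more robust techniques.

```python
import re

# Toy PII patterns -- illustrative only; a real evaluator uses much more
# robust detection than simple regular expressions.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def flag_pii(text: str) -> dict[str, list[str]]:
    """Return any toy-pattern matches found in an LLM response."""
    return {
        label: matches
        for label, pattern in PII_PATTERNS.items()
        if (matches := pattern.findall(text))
    }

print(flag_pii("Contact Jane at jane.doe@example.com or 555-123-4567."))
# -> {'email': ['jane.doe@example.com'], 'us_phone': ['555-123-4567']}
```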
Sensitive data leakage
This Evaluator detects if the LLM discloses confidential or protected information, such as proprietary business data, medical records, or classified content, which could result in security or privacy breaches. To learn more about this Evaluator, see Sensitive Data Leakage Evaluator.
Fairness bias
This Evaluator assesses whether the LLM's responses exhibit bias or unfair treatment based on gender, race, ethnicity, or other demographic factors, ensuring that the model's output is impartial and equitable. To learn more about this Evaluator, see Fairness Bias Evaluator.