Run an Evaluator on a Collection
Overview
Enterprise h2oGPTe offers several Evaluators to assess a Collection's performance, reliability, security, fairness, and effectiveness. The available Evaluators for a Collection are based on the Evaluators in H2O Eval Studio. To learn about the available Evaluators for a Collection, see Collection Evaluators.
How models are selected for evaluation
When you evaluate a collection, the system automatically identifies which models to evaluate based on the chat history in that collection. The evaluation process works as follows:
- Models are automatically selected from the LLMs (Large Language Models) that were used in existing chat sessions within the collection
- Each unique model that appears in the collection's chat message history will be included in the evaluation
- No new queries are made; the evaluation tests existing question-answer pairs from those chat sessions
- The evaluator assesses the existing LLM responses from your chat history
- Each chat message runs through the evaluator, and success or failure results are aggregated at a per-model level
- The UI displays an overview of evaluation metrics in your collection, aggregated by model. The specific metrics shown depend on the evaluator used
If your collection shows only one model in the evaluation results, it means that only one model was used in the chat sessions within that collection. To evaluate multiple models, you need to have chat sessions that used different models in the collection.
This is different from Eval Studio, where you can generate new test cases and compare them across many models. Collection evaluation assesses the work you have already done in your collection's chat sessions, rather than running new tests.
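The per-model roll-up described above can be pictured with a short, purely illustrative snippet. This is not the actual Eval Studio implementation; the field names (`llm`, `passed`) and model names are placeholders:

```python
from collections import defaultdict

# Hypothetical per-message evaluator verdicts pulled from a collection's chat history.
# Field names and model names are placeholders, not the real h2oGPTe data model.
results = [
    {"llm": "model-a", "passed": True},
    {"llm": "model-a", "passed": False},
    {"llm": "model-b", "passed": True},
]

# Aggregate pass/fail counts per model, mirroring how the UI rolls up metrics by LLM.
per_model = defaultdict(lambda: {"passed": 0, "failed": 0})
for r in results:
    key = "passed" if r["passed"] else "failed"
    per_model[r["llm"]][key] += 1

for model, counts in per_model.items():
    total = counts["passed"] + counts["failed"]
    print(f"{model}: {counts['passed']}/{total} messages passed")
```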
If you want to explicitly test multiple models on the same set of questions, you can use Eval Studio, which allows you to configure that. To access Eval Studio:
- In the Enterprise h2oGPTe navigation menu, click Eval
- Click Open Eval Studio to open H2O Eval Studio in a new page

Alternatively, you can use a Python script that takes a list of models and a list of chat questions (potentially from your chat history) and submits them to all models. However, this approach is significantly slower than evaluating a collection.
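A rough sketch of such a script is shown below. It assumes the h2ogpte Python client; the calls used (`H2OGPTE`, `create_chat_session`, `connect`, `session.query`) follow recent versions of that client, but the server address, API key, collection ID, model names, and questions are placeholders you should replace with values from your environment, and exact signatures may differ between client versions:

```python
from h2ogpte import H2OGPTE

# Assumed connection details; replace with your environment's address and API key.
client = H2OGPTE(address="https://h2ogpte.example.com", api_key="sk-XXXX")

collection_id = "my-collection-id"   # the Collection to query
models = ["model-a", "model-b"]      # LLM names as listed in your environment
questions = [
    "What is the refund policy?",
    "Summarize the onboarding document.",
]

# Ask every question against every model in a chat session tied to the Collection.
chat_session_id = client.create_chat_session(collection_id)
with client.connect(chat_session_id) as session:
    for model in models:
        for question in questions:
            reply = session.query(question, llm=model)
            print(f"[{model}] {question}\n{reply.content}\n")
```

Because every question is sent to every model in turn, this runs much more slowly than evaluating the Collection's existing chat history.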
Permissions Required to Evaluate a Collection
If the evaluation option is disabled or the Evaluations section is not visible, your account does not have the required permissions.
- Collection owners can always evaluate their own collections
- Non-owners can only evaluate collections if they have been granted the Evaluate collection permission through collection sharing settings (user-specific, group-based, or public collection permissions)
To grant evaluation permissions, collection owners can share the collection and enable the Evaluate collection permission in the sharing settings. For more information about sharing collections, see Share a Collection. Contact your administrator or the collection owner to request evaluation permissions.


Run an Evaluator on a Collection
Instructions
To run an Evaluator on a Collection, follow these steps:
- In the Enterprise h2oGPTe navigation menu, click Collections.
- From one of the following tabs, locate and select the Collection you want to evaluate.
- All collections
- My collections
- Shared
- Click Evaluations. The Evaluation side panel appears.
- Click Run your first evaluation (or New evaluation if the Collection already has evaluations).
The Evaluation dialog box appears.

- In the Evaluator list, select an Evaluator. To learn about each available Evaluator for a Collection, see Collection Evaluators below.

- Click Evaluate in the dialog.
The evaluation job starts running. You can monitor its progress in the jobs tray. The evaluation may take some time depending on the number of chat messages in your collection.
Viewing Evaluation Results
After the evaluation finishes, you can view the results by clicking the evaluation in the Evaluations table. The table displays each evaluation's status and creation time and includes a link to download the results as a ZIP file.

To download the full evaluation results, click the download icon in the Evaluations table. The downloaded file includes an interpretation.html file that shows each test case (chat message) with detailed information on insights, problems, and more.
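If you prefer to inspect the archive programmatically, a minimal sketch using only the Python standard library is shown below. The archive path is an assumption, and the exact file layout inside the ZIP may vary by evaluator:

```python
import zipfile
from pathlib import Path

# Assumed path to the downloaded results archive; adjust to your download location.
archive = Path("evaluation-results.zip")
out_dir = Path("evaluation-results")

# Extract everything and locate the interpretation report mentioned above.
with zipfile.ZipFile(archive) as zf:
    zf.extractall(out_dir)

reports = list(out_dir.rglob("interpretation.html"))
if reports:
    print("Open this report in a browser:", reports[0])
else:
    print("No interpretation.html found; extracted files:")
    for p in sorted(out_dir.rglob("*")):
        print(" ", p)
```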
Click on the Evaluator name to go to the results page, which displays a comprehensive evaluation dashboard. The dashboard includes visualization options and detailed metrics that vary depending on the evaluator used. The results are aggregated by model, allowing you to compare performance across different models used in your collection.
For detailed information on how to analyze results using the evaluation dashboard, see the H2O Eval Studio documentation.

Evaluation Status
Evaluations can have the following statuses:
- Running: The evaluation is currently in progress
- Completed: The evaluation has finished successfully
- Failed: The evaluation encountered an error
- Canceled: The evaluation was canceled before completion
You can delete completed or failed evaluations by clicking the remove icon in the Evaluations table.

Collection evaluators
Not all evaluators are always available for every collection. Evaluator availability depends on the data in your collection's chat history:
- Evaluators requiring ground truth: Some evaluators require expected answers (ground truth) from users. If none of your chat messages include expected answers, these evaluators will not appear in the evaluator list.
- Evaluators requiring retrieved context: Some evaluators require retrieved context from documents. If there are no chat messages with retrieved context, these evaluators will be hidden.
For a complete list of all available evaluators and their specific requirements, see the H2O Eval Studio documentation. The following sections describe some of the commonly available Evaluators for a Collection.
Contact information leakage (requires judge)
This Evaluator detects leakage of privacy-sensitive information like personal contact details in the Collection's LLM responses. This Evaluator requires a judge model to assess the responses. To learn more about this Evaluator, see Contact Information Evaluator.
Encoding guardrail
This Evaluator detects sensitive data leakage by encoding questions and answers to identify potential privacy or security issues in the Collection's LLM responses. To learn more about this Evaluator, see Encoding Guardrail Evaluator.
Fairness bias
This Evaluator assesses whether the Collection's LLM responses exhibit bias or unfair treatment based on gender, race, ethnicity, or other demographic factors, ensuring that the model's output is impartial and equitable. To learn more about this Evaluator, see Fairness Bias Evaluator.
Groundedness (semantic similarity)
This Evaluator assesses actual answers for relevance to the retrieved context, ensuring that responses are well-grounded in the provided information. To learn more about this Evaluator, see Groundedness Evaluator.
Hallucination
This Evaluator identifies whether the Collection's LLM responses include fabricated or inaccurate information that doesn't align with the provided context or factual data. To learn more about this Evaluator, see Hallucination Evaluator.
Looping detection
This Evaluator detects looping in the generated answers, identifying when the LLM produces repetitive or circular responses that may indicate a problem with the model's output generation. To learn more about this Evaluator, see Looping Detection Evaluator.
Perplexity
This Evaluator measures the coherence, fluency, and certainty of actual answers, indicating how well-formed and confident the LLM responses are. To learn more about this Evaluator, see Perplexity Evaluator.
Personally Identifiable Information (PII) leakage
This Evaluator checks if the Collection's LLM responses inadvertently reveal sensitive personal data, such as names, addresses, phone numbers, or other details that could be used to identify an individual. To learn more about this Evaluator, see PII Leakage Evaluator.
Sensitive data leakage
This Evaluator detects if the Collection's LLM discloses confidential or protected information, such as proprietary business data, medical records, or classified content, which could result in security or privacy breaches. To learn more about this Evaluator, see Sensitive Data Leakage Evaluator.
Sexism (requires judge)
This Evaluator assesses the answers for instances of gender stereotyping and sexist content in the Collection's LLM responses. This Evaluator requires a judge model to assess the responses. To learn more about this Evaluator, see Sexism Evaluator.
Step alignment and completeness
This Evaluator checks the steps in the answer for alignment and completeness, ensuring that multi-step responses follow a logical progression and include all necessary components. To learn more about this Evaluator, see Step Alignment and Completeness Evaluator.
Toxicity
This Evaluator determines whether the Collection's LLM responses contain harmful, offensive, or abusive language that could negatively impact users or violate platform guidelines. To learn more about this Evaluator, see Toxicity Evaluator.