View an evaluation dataset

Overview

After creating your own evaluation dataset through the H2O LLM DataStudio user interface, you can view a summary table with the following information about your current evaluation datasets:

  • ID: The unique ID of the custom Eval project.
  • Name: The name of the custom Eval project.
  • Description: A detailed summary highlighting the custom Eval project’s purpose and objectives.
  • Filename: The name of the uploaded document file.
  • Eval dataset type: The type of the evaluation dataset (question-answer, multi-choice, or token presence).
  • Created: The date when the project was initially created.
  • Number of entries: The number of entries (rows) in the dataset.
  • Status: The status of each project, indicating its progress (for example, Running or Complete).
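
Conceptually, each row of this summary table can be pictured as a simple record. The sketch below is illustrative only: the field names mirror the columns above and are not an official H2O LLM DataStudio schema.

```python
# Hypothetical representation of one row in the evaluation dataset
# summary table. Field names mirror the columns described above;
# this is not an official H2O LLM DataStudio schema.
summary_row = {
    "id": 42,                                 # unique ID of the custom Eval project
    "name": "policy-docs-eval",
    "description": "Evaluate QA quality on internal policy PDFs",
    "filename": "policy_handbook.pdf",
    "eval_dataset_type": "question-answer",   # or "multi-choice", "token presence"
    "created": "2024-01-15",
    "number_of_entries": 120,                 # rows in the dataset
    "status": "Complete",                     # e.g., "Running", "Complete"
}
```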

View a specific Custom Eval project

The following steps describe how to view a specific Custom Eval project in H2O LLM DataStudio.

  1. On the H2O LLM DataStudio left navigation menu, click Custom Eval.
  2. To open a specific project, click the project name.

Inside the Custom Eval project you selected, you can view the following details.

  • Status: The status of the project, indicating its progress (for example, Running or Complete).
  • Pairs: The number of question-answer pairs generated in the evaluation dataset.
  • Reload/logs: Click to review the events and actions that occurred while the evaluation dataset was being created.
  • Label:
    • Select the question-answer pairs from the table and click to mark them as irrelevant.
    • Select the question-answer pairs from the table and click to mark them as relevant.
  • Edit Q:A pairs: Select a question-answer pair and click to edit its entries, then click Update records to save the changes.
  • Input:
    • View document: Click to view the uploaded PDF document.
  • Output:
    • Click to download the curated question-answer pairs in JSON or CSV format (see the example after this list).
  • Generate robust eval dataset: Select one or more rows from the generated question-answer pairs and click to generate a robust evaluation dataset. Each entry in the robust evaluation dataset contains a new question generated from an original question, along with the answer and the original question it was derived from.
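
As a rough illustration of what a JSON export and a robust-dataset entry might look like, here is a minimal Python sketch. The field names question, answer, and original_question are assumptions for this example, not a documented export schema; inspect a downloaded file for the actual structure.

```python
import json

# Hypothetical JSON export of curated question-answer pairs.
# Field names are assumptions for illustration; check your
# downloaded file for the actual schema.
raw = """
[
  {"question": "What is the refund window?",
   "answer": "30 days from the date of purchase."},
  {"question": "How long do customers have to request a refund?",
   "answer": "30 days from the date of purchase.",
   "original_question": "What is the refund window?"}
]
"""

pairs = json.loads(raw)
for pair in pairs:
    # Entries produced by "Generate robust eval dataset" also carry
    # the original question the new question was derived from.
    origin = pair.get("original_question", "n/a")
    print(pair["question"], "->", pair["answer"], f"(original: {origin})")
```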
