View an evaluation dataset
Overview
After creating your own evaluation dataset through the H2O LLM DataStudio user interface, you can view a summary table with the following information about your current evaluation datasets:
- ID: The unique ID of the custom Eval project.
- Name: The name of the custom Eval project.
- Description: A detailed summary highlighting the custom Eval project’s purpose and objectives.
- Filename: The name of the uploaded document file.
- Eval dataset type: The type of the evaluation dataset (question type, multi-choice, token presence).
- Created: The date when the project was created.
- Number of entries: The number of entries (rows) in the dataset.
- Status: The status of each project. It indicates the progress of the project (for example, Running or Complete).
View a specific Custom Eval project
The following steps describe how to view a specific Custom Eval project in H2O LLM DataStudio.
- In the H2O LLM DataStudio left navigation menu, click Custom Eval.
- To open a specific project, click the project name.
Inside the Custom Eval project you selected, you can view the following details.
- Status: The status of each project. It indicates the progress of the project (for example, Running or Complete).
- Pairs: The number of question-answer pairs generated in the evaluation dataset.
- Reload/logs: Click to review the events and actions that occurred during the process of creating an evaluation dataset.
- Label:
- Select the question-answer pairs from the table and click to mark them as irrelevant.
- Select the question-answer pairs from the table and click to mark them as relevant.
- Edit Q:A pairs: Select a question-answer pair and click to edit the dataset entries. Click Update records to update the dataset.
- Input:
- View document: Click to view the uploaded PDF documents.
- Output:
- Click to download the curated question-answer pairs in JSON or CSV file formats.
- Click
- Generate robust eval dataset: Select one or more rows from the generated question-answer pairs and click to generate a robust evaluation dataset. The robust evaluation dataset contains new questions generated from each original question, together with the original question and its answer.
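Once the curated question-answer pairs are downloaded as JSON or CSV, a quick local check of the file can be useful. The following is a minimal Python sketch, not part of H2O LLM DataStudio. The function names `load_pairs` and `summarize` are hypothetical, and the file layout is an assumption: a JSON export is taken to be a top-level array of objects, and a CSV export is taken to have a header row. Adjust it to the layout of your actual export.

```python
import csv
import json
from pathlib import Path


def load_pairs(path):
    """Load exported question-answer pairs from a .json or .csv file.

    Assumptions (not confirmed by the product documentation):
    - a JSON export is a top-level array of objects, one per pair;
    - a CSV export has a header row naming the columns.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        if not isinstance(data, list):
            raise ValueError("expected a top-level JSON array of records")
        return data
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as f:
            # DictReader maps each row to the column names in the header row.
            return list(csv.DictReader(f))
    raise ValueError(f"unsupported file type: {suffix}")


def summarize(pairs):
    """Return the number of records and the sorted field names found in them."""
    fields = sorted({key for record in pairs for key in record})
    return len(pairs), fields
```

Calling `load_pairs` with the path of a downloaded file and passing the result to `summarize` returns the record count and the field names. The count can be compared with the Pairs value shown in the project, and the field names reveal the actual column layout without assuming it in advance.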