View a Curate project
Overview
After creating Curate projects through the H2O LLM DataStudio user interface, you can view a summary table with the following information about your current projects:
- ID: The unique ID of the project.
- Name: The name of the project.
- Description: A detailed summary highlighting the project’s purpose and objectives.
- Filename: The name of the uploaded document file.
- Created: The date when the project was initially created.
- Q:A pairs: The number of question-answer pairs generated from the uploaded document.
- Number of documents: The number of documents uploaded to the project.
- Number of pages: The total number of pages across the documents in the project.
- Status: The status of each project, indicating its progress (for example, Running or Completed).
View a specific Curate project
The following steps describe how to view a specific Curate project in H2O LLM DataStudio.
- On the H2O LLM DataStudio left navigation menu, click Curate.
- To interact with a specific project, click on the project name.
Inside the Curate project you selected, you can view the following details.
Status: The status of the project, indicating its progress (for example, Running or Complete).
Project details: Click the corresponding icon to view project details.
- The Project details tab shows the project name, project ID, project description, and the date and time when the project was initially created.
- FastQA mode indicates whether smart chunking mode was activated. Chunk sampling shows the sampling ratio used to convert documents into question-answer pairs.
Pairs: The number of dataset entries generated from the uploaded document.
Reload/Logs: Click the corresponding icon to review the events and actions that occurred during the data curation process.
View reference: Select one or more entries from the table and click the corresponding icon to view all the references for the selected entries.
Label:
- Select the irrelevant entries from the table and click to mark them as irrelevant.
- Select the relevant entries from the table and click to mark them as relevant.
Edit Q:A pairs: Select a dataset entry and click the corresponding icon to edit the generated entries.
Input:
- View document: Click to view the uploaded PDF documents.
Output:
Select the desired file type (JSON or CSV) from the dropdown menu. Click Execute to generate the question-answer pairs in the selected format.
Note: You can download the generated curation pairs even if the project fails or terminates during the curation process.
Select Publish as Preparation Project from the dropdown menu to use the generated question-answer pairs in the Data preparation flow. Once you click Execute, the dataset generated from the Data curation process will be ingested into the Data preparation flow.
Select Publish as Custom Eval Project from the dropdown menu to use the generated question-answer pairs to create your own evaluation dataset. Once you click Execute, the dataset generated from the Data curation process will be ingested into the Custom Eval flow.
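Once the pairs are exported, they can be post-processed outside the tool. The following is a minimal sketch of filtering an exported CSV by relevance score. The column names (`prompt`, `answer`, `relevance`, `filename`), the sample rows, and the `filter_pairs` helper are assumptions made for illustration; the actual export schema may differ, so check the header of your own file first.

```python
import csv
import io

# Hypothetical export contents; a real export may use different column names.
exported_csv = """prompt,answer,relevance,filename
What is X?,X is a thing.,0.92,guide.pdf
What is Y?,Y is unrelated.,0.10,guide.pdf
"""


def filter_pairs(csv_text, min_relevance):
    """Return the rows whose relevance value is at least min_relevance."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if float(row["relevance"]) >= min_relevance]


kept = filter_pairs(exported_csv, 0.5)
print(len(kept))
```

To read from a downloaded file instead of an in-memory string, open the file and pass its contents to the same helper.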
Export to H2O Drive: Click the corresponding icon to export the project to H2O Drive.
Use the search bar to search for specific questions.
The table of question-answer pairs includes the following details:
- Prompt: The question
- Answer: The corresponding answer
- Relevance: A similarity score between the answer and the context it was generated from. Each question-answer pair is assigned a relevance score, calculated as the ratio of matching sequences between the generated answer and the original context. A relevance score of 1 indicates that the answer is directly quoted from the context. A relevance score of 0 means the answer has no overlapping words with the original context. The relevance score helps you filter for the most relevant question-answer pairs.
- Filename: The name of the document from which the specific question-answer pair was generated. You can filter the question-answer pairs based on the filename.
- Flag: Indicates the question-answer pairs that have been identified as irrelevant. You can filter the question-answer pairs based on the flag.
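To make the relevance score described above more concrete, here is a minimal sketch of one way such a ratio could be computed. It is not the formula H2O LLM DataStudio uses, which this page does not specify. The sketch assumes word-level matching with Python's standard `difflib.SequenceMatcher` and divides the number of matched answer words by the answer length, so that a direct quote scores 1 and an answer with no shared words scores 0. The function name `relevance_score` and the sample texts are hypothetical.

```python
from difflib import SequenceMatcher


def relevance_score(answer: str, context: str) -> float:
    """Fraction of answer words that fall inside word sequences shared with the context."""
    answer_words = answer.lower().split()
    context_words = context.lower().split()
    if not answer_words:
        return 0.0
    # autojunk=False keeps frequent words from being ignored in long contexts.
    matcher = SequenceMatcher(None, answer_words, context_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(answer_words)


context = "The project was created in May and contains twelve documents."
print(relevance_score("contains twelve documents.", context))  # direct quote: 1.0
print(relevance_score("Nothing overlaps here", context))       # no shared words: 0.0
```

Normalizing by the answer length rather than the combined length is a design choice made here so that a short answer quoted from a long context still scores 1, matching the behavior described for the relevance column.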