Tutorials

Learn about H2O LLM DataStudio, a no-code application and toolkit designed to streamline data curation, preparation, and augmentation tasks for large language models (LLMs).

Learning path

H2O LLM DataStudio provides a tutorial for each supported workflow:

Question and Answer

This tutorial describes the process of preparing a dataset that consists of contextual information, questions, and corresponding answers.

Text Summarization

This tutorial describes the process of preparing a dataset that consists of articles and their associated summaries.

Instruct Tuning

This tutorial describes the process of preparing a dataset that consists of prompts and their respective responses.

Human - Bot Conversations

This tutorial describes the process of preparing a dataset comprising multiple dialogues between human users and chatbots.

Continued PreTraining

This tutorial describes the process of preparing datasets with extensive texts for further pretraining of language models.
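
To make the dataset shapes described above more concrete, here is a minimal Python sketch of what one record per workflow might contain. The field names (`context`, `question`, `answer`, `article`, `summary`, `prompt`, `response`, `dialogue`, `text`), the workflow keys, and the `missing_fields` helper are all assumptions chosen for illustration. They are not the column names or API that H2O LLM DataStudio requires, so refer to the individual tutorials for the actual expected format.

```python
from typing import Dict, List

# Illustrative field names only (assumption); H2O LLM DataStudio may
# expect different column names for each workflow.
WORKFLOW_FIELDS: Dict[str, List[str]] = {
    "question_and_answer": ["context", "question", "answer"],
    "text_summarization": ["article", "summary"],
    "instruct_tuning": ["prompt", "response"],
    "human_bot_conversations": ["dialogue"],
    "continued_pretraining": ["text"],
}


def missing_fields(workflow: str, record: dict) -> List[str]:
    """Return the expected fields that are absent or empty in a record."""
    return [field for field in WORKFLOW_FIELDS[workflow] if not record.get(field)]


# A hypothetical Question and Answer record: context, question, and answer.
qa_record = {
    "context": "The Eiffel Tower is in Paris.",
    "question": "Where is the Eiffel Tower?",
    "answer": "Paris",
}

# A hypothetical Human - Bot Conversations record: a list of dialogue turns.
conversation_record = {
    "dialogue": [
        {"speaker": "human", "text": "Hi"},
        {"speaker": "bot", "text": "Hello! How can I help?"},
    ]
}

print(missing_fields("question_and_answer", qa_record))               # []
print(missing_fields("text_summarization", {"article": "Long text"}))  # ['summary']
```

A check like this can be run over a raw dataset before import to spot incomplete records early, whatever the real column names turn out to be.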

