Experiments
Overview​
The Experiments tab shows all fine-tuning jobs in your current project. You can view the status of each experiment, monitor training metrics, and configure new experiments from scratch or based on previous runs.
You can access the Experiments page from:
- The homepage card
- The top navigation bar
Experiment List View​
This page includes a table of all experiments in the current project. Each row includes:
- Name
- Experiment ID
- Associated Dataset
- Created Date
- Status (Queued, Starting, Training, Completed)
- Validation Loss (Min, Max, or Last)
- Validation Perplexity (Min, Max, or Last)
Most columns are optional and can be shown or hidden using the column toggle button in the top-right.
You can:
- Search by name, dataset, or ID
- Sort by status
- Enter edit mode to select and delete multiple experiments
Create a New Experiment​
Click the New Experiment button on the Experiments page to launch a new training run. You can also create an experiment from a Dataset page.
The form is divided into four sections:
Experiment Details​
- Experiment Name — Optional; autogenerated if left blank
- Problem Type
  - Causal LM — Output is free-form text
  - Classification LM — Output is a label (e.g., 0 or 1)
Dataset Selection​
Select your training dataset and define input/output columns.
- Train Dataset — Choose from available datasets
- Input Column — The prompt or input for the LLM
- Output Column — The expected output (label or response)
- Max Token Length — Maximum number of tokens for the output
- Data Sample — Fraction of the data used for training (default is 0.01)
  - Increase this value if your dataset is small
- Validation Strategy
  - Automatic Split — Data is split internally
  - Custom Validation — Upload a separate validation dataset
- Validation Size — Proportion of data used for validation (if using automatic split)
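For orientation, the sketch below shows how these dataset selections might appear in an experiment's YAML configuration (the raw configuration is visible on the experiment detail page). The key names and values are illustrative assumptions, not the product's exact schema:

```yaml
# Illustrative sketch only: key names are assumptions, not the exact schema.
dataset:
  train_dataset: support_tickets        # hypothetical Train Dataset
  input_column: question                # Input Column: the prompt or input for the LLM
  output_column: answer                 # Output Column: the expected label or response
  max_token_length: 256                 # Max Token Length for the output
  data_sample: 0.01                     # fraction of the data used for training; increase for small datasets
  validation_strategy: automatic_split  # or custom_validation with a separate validation dataset
  validation_size: 0.1                  # proportion held out when splitting automatically
```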
Training Configuration​
Control the core training parameters:
- Model — Select a base model from the list of available models in your system. Only installed or registered models will appear in this dropdown.
- Training Mode — Choose between Full Training, LoRA (default), or QLoRA for parameter-efficient fine-tuning.
- Batch Size — The number of samples processed together in one iteration of training. A larger batch size can allow faster training with sufficient GPU memory, while a smaller batch size may improve generalization on very small datasets.
- Learning Rate — The step size for model parameter updates during training. A higher learning rate makes changes to model weights more aggressive each iteration, while a lower rate makes training more gradual. Not sure what value to pick? Use AutoML to let an AI agent find the best learning rate and related parameters for your task.
- Epochs — The number of complete passes made by the training process through the entire training dataset. Increasing epochs lets the model see the data more times, but too many can lead to overfitting.
- Metrics — For classification, you can select from Accuracy, AUC, and LogLoss. For causal language modeling, you can select Perplexity and BLEU. You may track multiple metrics per experiment.
These settings may be auto-tuned via Ask KGM or AutoML.
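As a rough guide, these training controls might map to configuration values along the following lines; the keys and example values are illustrative assumptions, not the exact fields used by H2O Enterprise LLM Studio:

```yaml
# Illustrative sketch only: key names and values are assumptions.
training:
  model: h2oai/h2o-danube3-4b-base   # hypothetical backbone; only installed or registered models appear
  training_mode: lora                # full_training | lora (default) | qlora
  batch_size: 8
  learning_rate: 0.0001
  epochs: 3
  metrics: [perplexity, bleu]        # causal LM; classification offers accuracy, auc, logloss
```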
Advanced Configuration​
This section accepts optional YAML to override or extend training behavior. You can copy/paste existing YAML templates or manually configure settings not available through the UI.
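For example, an override might pin settings that are not exposed as form fields, such as a LoRA rank or a random seed. The keys below are hypothetical; copy exact names from the Configuration section of an existing experiment or from a known-good template rather than from this sketch:

```yaml
# Hypothetical override keys: verify names against an existing experiment's YAML.
training:
  lora_r: 8
  lora_alpha: 16
  gradient_accumulation_steps: 4
environment:
  seed: 42
```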
Controls​
- Reset Settings — Clear the current configuration
- AutoML Toggle — Automatically run a series of experiments to find the best config
- Start Training — Begin the experiment with the current settings
View an Experiment​
Click on any experiment row to open the experiment detail view.
Status and Resources​
The status bar updates live as the experiment runs. The stages include:
- Queued
- Starting
- Training / Validation
- Completed
Open the Resources section to view:
- Number of GPUs used
- Price per GPU-hour
- Total cost
- Runtime duration
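These figures are typically related multiplicatively: for example, assuming 2 GPUs at $3.00 per GPU-hour for a 1.5-hour run, the total cost would be 2 × 3.00 × 1.5 = $9.00.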
Charts​
The chart visualizes training and validation progress:
- Training Loss
- Validation Loss
- Validation Perplexity
Loss is shown on the left axis, and perplexity is shown on the right axis.
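Perplexity is generally computed as the exponential of the cross-entropy loss, so the two curves move together; for example, a validation loss of 2.0 corresponds to a perplexity of about exp(2.0) ≈ 7.4.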
Training vs. Validation:
Training metrics reflect how well the model is learning from the training dataset. Validation metrics measure generalization — how well the model performs on unseen examples. A good model typically has low validation loss and low perplexity without overfitting.
Configuration​
This section displays the raw YAML configuration used to run the experiment.
Connected Experiments​
This section shows experiments related to AutoML workflows or created via Ask KGM. If you ran a single experiment, this will show only your current experiment.
Training Logs and System Logs​
Training Logs include:
- Epoch progress
- Metric values
- Fine-tuning metadata (e.g., LoRA updates)
System Logs include runtime system activity, including any warnings or errors that occurred during training.
Experiment Actions​
At the top-right of the experiment page, you can:
- Deploy the trained model (see Deployments)
- Open the action menu (...) to:
  - Push to Hugging Face (requires your Hugging Face credentials)
  - Ask KGM — Use a fine-tuning agent to suggest the next best experiment
  - Rerun — Run the exact same experiment again
  - Copy — Open the setup page pre-filled with the settings from this experiment
  - Delete — Remove this experiment
Ask KGM​
Clicking "Ask KGM" opens a modal with a recommended next experiment. KGM stands for Kaggle Grandmaster — a reference to H2O.ai's expert data scientists.
The system may suggest:
- A different model backbone to fine-tune
- A different learning rate
- Adjustments based on your training metrics
You can review the explanation and click Proceed to launch the next suggested experiment.