Create a leaderboard

In H2O Eval Studio, leaderboards compare LLMs based on metrics calculated by the available evaluators. (For more information on evaluators in H2O Eval Studio, see Evaluators.) This page describes how to create a leaderboard in H2O Eval Studio.

  1. In the main navigation, click Leaderboards.

  2. Click the New Leaderboard button.

  3. Enter a name for the leaderboard.

  4. Enter a description of the leaderboard.

  5. Select a connection to the host of the LLMs you want to evaluate. Note that when creating leaderboards, there are two types of connections: LLM and RAG. The list of available evaluators and tests depends on the type of connection you select. (For example, the RAGAs evaluator is not applicable to a pure LLM.) For more information on adding a connection, see Add connection.

  6. Select the evaluator you want to use. For more information on the available evaluators, see Evaluators.

  7. Select the tests that you want to use. For more information on tests in H2O Eval Studio, see Tests.

  8. (Optional) Select the LLMs you want to use for the evaluation. If you don’t select any, the evaluation runs on all available LLMs.

  9. Click the Create button.
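The choices made in the steps above can be sketched as a small data model. This is only an illustration of how the form fields relate to each other, not the H2O Eval Studio API: the class, the function names, the compatibility table, and the placeholder evaluator name are all hypothetical. The sketch also assumes that at least one test must be selected, which the steps imply but do not state.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical compatibility table. Only the fact that RAGAs does not apply to a
# pure LLM connection comes from the steps above; "example-llm-evaluator" is a
# placeholder name used for illustration.
COMPATIBLE_EVALUATORS = {
    "LLM": {"example-llm-evaluator"},
    "RAG": {"example-llm-evaluator", "RAGAs"},
}


@dataclass
class LeaderboardRequest:
    """Hypothetical container for the values entered in the New Leaderboard form."""
    name: str
    description: str
    connection_type: str  # "LLM" or "RAG"
    evaluator: str
    tests: List[str]
    models: List[str] = field(default_factory=list)  # empty means "all models"


def validate(request: LeaderboardRequest) -> List[str]:
    """Return a list of problems; an empty list means the request is consistent."""
    problems = []
    if not request.name.strip():
        problems.append("name is required")
    allowed = COMPATIBLE_EVALUATORS.get(request.connection_type)
    if allowed is None:
        problems.append(f"unknown connection type: {request.connection_type}")
    elif request.evaluator not in allowed:
        problems.append(
            f"evaluator {request.evaluator} is not applicable to a "
            f"{request.connection_type} connection"
        )
    if not request.tests:  # assumption: at least one test is needed
        problems.append("select at least one test")
    return problems


def models_to_evaluate(request: LeaderboardRequest, available: List[str]) -> List[str]:
    """Mirror step 8: with no explicit selection, every available model is evaluated."""
    return list(request.models) if request.models else list(available)
```

In this sketch, a request that pairs the RAGAs evaluator with an LLM connection yields a validation problem, while the same evaluator on a RAG connection passes.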

View a leaderboard

The table on the Leaderboards page lists all of the leaderboards that you have created. To view a leaderboard, click the name of the leaderboard you want to view.

When viewing a specific leaderboard, you can view a visualization of the leaderboard, obtain an HTML report, and download a zip archive with the evaluation results.

View a visualization of a leaderboard

The leaderboard page features a visualization of evaluator result metrics as a radar plot (when the evaluator produces more than three metrics) or as a bar chart (when the evaluator produces three or fewer metrics). Leaderboard visualizations can help you understand the evaluation results for a given metric and the LLMs being compared.
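The chart selection rule can be expressed as a short function. This is a minimal sketch, not the product's implementation: the function name and the `bar_chart_max` parameter are hypothetical, and the threshold of three metrics is taken from the bar chart condition stated above.

```python
def choose_chart(metric_count: int, bar_chart_max: int = 3) -> str:
    """Pick a chart type from the number of metrics an evaluator produces.

    Assumption: a bar chart is used for up to `bar_chart_max` metrics and a
    radar plot for anything above that.
    """
    if metric_count < 1:
        raise ValueError("an evaluator must produce at least one metric")
    return "bar" if metric_count <= bar_chart_max else "radar"
```

Under this assumption, an evaluator with two metrics maps to a bar chart and an evaluator with five metrics maps to a radar plot.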

Obtain an HTML report of a leaderboard

To view an HTML report of a leaderboard, click the Show Report button. This report provides in-depth information about potential problems with the model, evaluation parameters, the evaluated models, and more.

Download a zip archive with evaluation results

To download a zip archive with evaluation results, click the Download Report button.
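Once the archive is downloaded, it can be inspected with Python's standard `zipfile` module. The sketch below makes no assumption about what the archive contains; the helper names are hypothetical, and the file path is whatever location your browser saved the download to.

```python
import zipfile
from pathlib import Path


def list_archive(archive_path):
    """Return the sorted names of all files (not directories) in the archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(
            info.filename for info in archive.infolist() if not info.is_dir()
        )


def extract_archive(archive_path, target_dir):
    """Extract the archive into target_dir, creating the directory if needed."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    return target
```

Listing the contents first is a cheap way to check what the evaluation produced before extracting everything to disk.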

Delete a leaderboard

To delete a leaderboard from the Leaderboards page, select the checkbox next to the leaderboard you want to delete, and then click the Delete button.