
Dataset format: Text metric learning

The data for a text metric learning experiment must follow the format described below.

A CSV file.

csv_name.csv (1)(2)
  1. The available dataset connectors require the data for a text metric learning experiment to be in a zip or CSV file.
    Note

    To learn how to upload your zip or CSV file as your dataset in H2O Hydrogen Torch, see Dataset connectors.

  2. A CSV file containing the following columns:
    • A text column containing the input texts
    • A label column containing the class names
      Note

      Texts that are similar should have the same class name.

    • An optional fold column containing cross-validation fold indexes
      Note

      The fold column can contain integers (0, 1, 2, …, N-1 or 1, 2, 3, …, N) or categorical values.
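
The column layout above can be sketched with a short Python script. This is a minimal illustration, not an official template: the header names `text`, `label`, and `fold` and the sample rows are assumptions chosen for the example, since the format only requires a text column, a label column, and an optional fold column.

```python
import csv
import io

# Hypothetical column names; the format does not prescribe specific headers.
TEXT_COL, LABEL_COL, FOLD_COL = "text", "label", "fold"

# Made-up sample data: similar texts share the same class name.
rows = [
    ("How do I reset my password?", "password_reset"),
    ("I forgot my password, what now?", "password_reset"),
    ("Where can I download my invoice?", "billing"),
    ("I need a copy of last month's bill.", "billing"),
]


def build_csv(rows, n_folds=2):
    """Return CSV text with a text, a label, and an optional fold column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([TEXT_COL, LABEL_COL, FOLD_COL])
    for index, (text, label) in enumerate(rows):
        # Fold indexes run from 0 to n_folds - 1 (the 0-based integer scheme).
        writer.writerow([text, label, index % n_folds])
    return buffer.getvalue()


csv_text = build_csv(rows)
print(csv_text)
```

Writing `csv_text` to a file such as `csv_name.csv` would give a dataset in the shape described here. The round-robin fold assignment is only one simple choice; any scheme that yields integer or categorical fold values fits the description.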
