Version: Next

Dataset format: Text sequence to sequence

The data for a text sequence to sequence experiment can be formatted following format 1 or 2.

A CSV file.

csv_name.csv (1)(2)
  1. The available dataset connectors require the data for a text sequence to sequence experiment to be in a zip or CSV file.
    Note

    To learn how to upload your zip or CSV file as your dataset in H2O Hydrogen Torch, see Dataset connectors.

  2. A CSV file containing the following columns:
    • An input-text column containing the input texts
    • An output-text column containing the output texts
    • An optional fold column containing cross-validation fold indexes
      Note

      The fold column can include integers (0, 1, 2, …, N-1 or 1, 2, 3, …, N) or categorical values.
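
As an illustration of the column layout described above, the following minimal sketch builds such a CSV with Python's standard `csv` module. The column names (`input_text`, `output_text`, `fold`), the sample text pairs, and the round-robin fold assignment are all assumptions made for this example; the documentation does not prescribe specific column names or a fold-assignment strategy.

```python
import csv
import io

# Hypothetical column names: the documentation only requires an input-text
# column, an output-text column, and an optional fold column.
INPUT_COL = "input_text"
OUTPUT_COL = "output_text"
FOLD_COL = "fold"


def build_rows(pairs, n_folds=None):
    """Turn (input, output) text pairs into CSV rows.

    If n_folds is given, add a fold column with integer indexes
    0 .. n_folds-1, assigned round-robin (an assumption for this sketch).
    """
    rows = []
    for i, (source_text, target_text) in enumerate(pairs):
        row = {INPUT_COL: source_text, OUTPUT_COL: target_text}
        if n_folds is not None:
            row[FOLD_COL] = i % n_folds
        rows.append(row)
    return rows


def write_csv(rows, handle):
    """Write the rows to an open text handle, with a header line."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample data, for illustration only.
pairs = [
    ("Translate to French: Hello", "Bonjour"),
    ("Translate to French: Thank you", "Merci"),
    ("Translate to French: Good night", "Bonne nuit"),
    ("Translate to French: See you soon", "À bientôt"),
]

rows = build_rows(pairs, n_folds=2)

# Written to an in-memory buffer here; to produce a file instead, use
# open("csv_name.csv", "w", newline="", encoding="utf-8") as the handle.
buffer = io.StringIO()
write_csv(rows, buffer)
csv_text = buffer.getvalue()
```

Omitting `n_folds` produces a two-column file, which matches the case where the optional fold column is left out.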

