Steps

Steps are individual actions that run within a job. They execute sequentially in the order you define them.

Note

Each step must specify exactly one action: run, upload, or download.
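
As an illustration of this rule, the following sketch shows three steps that each declare a single action. The step names, file names, and bucket name are placeholders chosen for the example.

```yaml
steps:
  # This step only downloads.
  - name: Fetch dataset
    download:
      source: drive://my-bucket/datasets/train.csv
      path: ./data/train.csv

  # This step only runs a command.
  - name: Train model
    run: python train.py

  # This step only uploads.
  - name: Publish model
    upload:
      path: models/
      destination: drive://my-bucket/models/
```

A step that needs two actions, for example run followed by upload, should be split into two consecutive steps.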

Fields

`name` (optional)

Display name of the step. Use this field to describe what action the step performs.

Type: string

Example:

steps:
  - name: Load training data
    run: python load_data.py

`working_dir` (optional)

Working directory for this step. When set, it overrides the job-level working_dir.

Type: string

Behavior:

Sets the current working directory for this step
Affects where shell commands in run execute
Affects how relative paths in upload.path and download.path resolve
Absolute paths are not affected by working_dir

Inheritance:

If the job sets working_dir, every step inherits it
A step can override the inherited value with its own working_dir
The step-level value takes precedence over the job-level value

Example:

steps:
  - name: Train baseline model
    working_dir: ./experiments/baseline
    run: python train.py

  - name: Train advanced model
    working_dir: ./experiments/advanced
    run: python train.py

With upload and download:

steps:
  - name: Upload model from experiment
    working_dir: ./experiments
    upload:
      path: models/            # Relative to ./experiments
      destination: drive://bucket/models/

Default: The job's working_dir if one is set; otherwise the home directory.
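
The inheritance rules can be sketched with a job-level working_dir and one step that overrides it. The job ID train, the script prepare.py, and the /workspace paths are hypothetical. Absolute paths are used here so the example does not depend on how a relative step-level working_dir is resolved.

```yaml
jobs:
  train:
    working_dir: /workspace/experiments            # Job-level value, inherited by every step
    steps:
      - name: Prepare data
        run: python prepare.py                     # Runs in /workspace/experiments (inherited)

      - name: Train baseline model
        working_dir: /workspace/experiments/baseline   # Step-level value takes precedence
        run: python train.py                       # Runs in /workspace/experiments/baseline
```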

`run` (optional)

Shell commands to execute in this step. Each step's run invocation executes in a new shell session.

Type: string

Format: A single-line string, or a multi-line string using YAML's | (literal) or > (folded) block syntax.

Shell session behavior:

Each step with run starts a new shell session
Environment variables from env are available in the shell
The shell exits after command execution completes
Shell state, such as variables set during the session, is not preserved between steps

Examples:

Single-line command:

steps:
  - name: Install dependencies
    run: pip install -r requirements.txt

Multi-line script:

steps:
  - name: Train and validate
    run: |
      python preprocess.py
      python train.py
      echo "Training complete!"
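
For comparison, here is a sketch of the folded > syntax. In YAML, a folded block joins consecutive lines of equal indentation with single spaces, so a long command can be wrapped across lines and still reach the shell as one line. The --epochs flag is a hypothetical argument of train.py.

```yaml
steps:
  - name: Train with a long command
    # The four lines below are folded into one command:
    # python train.py --data ./data/processed.csv --model xgboost --epochs 20
    run: >
      python train.py
      --data ./data/processed.csv
      --model xgboost
      --epochs 20
```

Keep the continuation lines at the same indentation. YAML preserves the line breaks around more-indented lines inside a folded block.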

With environment variables:

steps:
  - name: Train model
    env:
      MODEL_TYPE: xgboost
    run: |
      echo "Training ${MODEL_TYPE} model"
      python train.py --model ${MODEL_TYPE}
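
Because every run starts a new shell session, a variable exported in one step is not visible in the next. The following sketch illustrates this. RUN_ID and its value are hypothetical, and the example assumes a POSIX-compatible shell.

```yaml
steps:
  - name: Set a variable
    run: |
      export RUN_ID=abc123
      echo "RUN_ID is ${RUN_ID}"          # RUN_ID is set within this step's shell

  - name: Read the variable
    run: |
      # A new shell session starts here, so the export from the previous step is gone.
      echo "RUN_ID is ${RUN_ID:-unset}"   # Falls back to the text "unset"
```

A value that several steps need can instead be declared with env.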

`upload` (optional)

Upload files or folders to H2O Drive. Supports single files, folders, and glob patterns.

Type: Upload object with path and destination fields

Example:

steps:
  - name: Upload trained model
    upload:
      path: models/
      destination: drive://my-bucket/models/trained/

For path behavior, glob patterns, and Drive URL format, see Storage.
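
A glob pattern in path might look like the following sketch. The reports/ folder and bucket name are placeholders, and the pattern assumes conventional * wildcard matching. The Storage page remains the authority on which pattern syntax is supported.

```yaml
steps:
  - name: Upload CSV reports
    upload:
      path: reports/*.csv              # Assumed to match every .csv file directly under reports/
      destination: drive://my-bucket/reports/
```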

`download` (optional)

Download files or folders from H2O Drive. Supports single files and folders.

Type: Download object with source and path fields

Example:

steps:
  - name: Download dataset
    download:
      source: drive://my-bucket/datasets/train.csv
      path: ./data/train.csv

For download behavior and path handling, see Storage.
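
Because working_dir affects how a relative download.path resolves, a download can be directed into a step's working directory. This is a minimal sketch, and the directory and bucket names are placeholders.

```yaml
steps:
  - name: Download dataset into the experiment folder
    working_dir: ./experiments
    download:
      source: drive://my-bucket/datasets/train.csv
      path: data/train.csv             # Relative to ./experiments
```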

`timeout` (optional)

Maximum execution time for this step. The system terminates the step if execution exceeds this duration.

Type: Duration string (for example, "10m", "2h")

For duration format, scope levels, and precedence rules, see Timeouts.
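
A minimal sketch of a step-level timeout, using one of the duration strings shown above. The script name smoke_test.py is hypothetical.

```yaml
steps:
  - name: Run quick smoke test
    timeout: "10m"                     # The step is terminated if it runs longer than 10 minutes
    run: python smoke_test.py
```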

`continue_on_error` (optional)

Whether job execution continues if this step fails.

Type: bool

Default: false (step failure causes job failure)

For detailed behavior and interaction with fail-fast, see Failure handling.
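
As a sketch, a non-critical step can be marked so that later steps still run when it fails. The script check_style.py is a hypothetical placeholder.

```yaml
steps:
  - name: Check code style
    continue_on_error: true            # Job execution continues even if this step fails
    run: python check_style.py

  - name: Train model
    run: python train.py               # Still runs if the style check failed
```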

`env` (optional)

Environment variables for this step.

Type: Map of string to string

For scope, inheritance, and precedence rules, see Environment variables.
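
Because env maps strings to strings, it can help to quote values that YAML would otherwise read as numbers or booleans. The sketch below assumes this. The variable names EPOCHS and USE_GPU and the --epochs flag are hypothetical, and this page does not say whether unquoted values are rejected or converted.

```yaml
steps:
  - name: Train model
    env:
      MODEL_TYPE: xgboost
      EPOCHS: "20"                     # Quoted so YAML reads it as a string, not an integer
      USE_GPU: "true"                  # Quoted so YAML reads it as a string, not a boolean
    run: python train.py --model ${MODEL_TYPE} --epochs ${EPOCHS}
```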

Step execution

Steps within a job run sequentially in the order you define them
Each step runs after the previous step completes
If a step fails and continue_on_error is not true, subsequent steps in that job do not run
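
The default failure behavior can be sketched as follows. The example assumes that a command exiting with a non-zero status counts as a step failure, which this page does not state explicitly.

```yaml
steps:
  - name: Validate input
    run: python validate.py --strict   # Suppose this command exits with a non-zero status

  - name: Train model
    run: python train.py               # Not run, because the previous step failed
```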

Complete example

jobs:
  process:
    name: Process and Upload Data
    working_dir: ./project
    timeout: "2h"
    env:
      LOG_LEVEL: info
    steps:
      - name: Download raw data
        download:
          source: drive://data-bucket/raw/dataset.csv
          path: ./data/raw.csv

      - name: Preprocess data
        timeout: "30m"
        run: |
          python preprocess.py --input ./data/raw.csv --output ./data/processed.csv
          echo "Preprocessing complete"

      - name: Train model
        timeout: "1h"
        env:
          MODEL_TYPE: xgboost
        run: python train.py --data ./data/processed.csv --model $MODEL_TYPE

      - name: Run optional validation
        continue_on_error: true
        run: python validate.py --strict

      - name: Upload results
        upload:
          path: ./output/
          destination: drive://results-bucket/output/
