Step structure

Steps are individual actions that run within a job. They execute sequentially in the order defined.

Schema

See Schema Reference for the complete #Step definition.

Note: Steps must specify exactly one action: run, upload, or download.

Fields

name (optional)

The display name of the step. Describes what action the step performs.

Type: string

Example:

name: Load training data

working_dir (optional)

Working directory for this step. Overrides the job-level working_dir if set.

Type: string

Behavior:

  • Sets the current working directory for this step.
  • Affects where shell commands in run execute.
  • Affects how relative paths in upload.path and download.path are resolved.
  • Absolute paths are not affected by working_dir.

Inheritance:

  • If the job sets working_dir, every step inherits it.
  • A step can override the inherited value with its own working_dir.
  • The step-level value takes precedence over the job-level value.
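
A minimal sketch of how the override plays out, assuming the job-level working_dir sits alongside steps in the job definition. The job layout, the script prepare.py, and the paths are illustrative and not taken from the schema:

# Job-level default (placement assumed for illustration)
working_dir: ./experiments
steps:
  # No step-level working_dir: inherits ./experiments
  - name: Prepare data
    run: python prepare.py

  # Step-level working_dir: overrides the job-level value
  - name: Train baseline model
    working_dir: ./experiments/baseline
    run: python train.py

Only the second step changes directory; the first keeps the job-level value.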

Examples:

steps:
  - name: Train baseline model
    working_dir: ./experiments/baseline
    run: python train.py

  - name: Train advanced model
    working_dir: ./experiments/advanced
    run: python train.py

With upload/download:

steps:
  - name: Upload model from experiment
    working_dir: ./experiments
    upload:
      path: models/  # Relative to ./experiments
      destination: drive://bucket/models/

Default: Inherits the job's working_dir. If the job does not set one, the home directory is used.

run (optional)

Shell command(s) to execute in this step. Each step's run invocation executes in a new shell session.

Type: string

Format: A single-line string, or a multi-line string using YAML's block scalar syntax (| keeps line breaks, > folds them into spaces)

Shell Session Behavior:

  • Each step with run starts a new shell session.
  • Environment variables from env are available in the shell.
  • The shell exits after command execution completes.
  • State is not preserved between steps.
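
The practical consequence of the points above can be sketched as follows, with hypothetical step names and paths. A variable exported in one step is not expected to be visible in the next, because each run gets a fresh shell:

steps:
  # The exported variable lives only in this step's shell session
  - name: Set a shell variable
    run: export DATA_DIR=/tmp/data

  # New shell session: DATA_DIR from the previous step is not carried over
  - name: Read it back
    run: echo "${DATA_DIR:-not set}"

A value that a step needs is more reliably declared through env than exported inside an earlier run (see Environment Variables for scope).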

Examples:

Single-line command:

steps:
  - name: Install dependencies
    run: pip install -r requirements.txt

Multi-line script:

steps:
  - name: Train and validate
    run: |
      python preprocess.py
      python train.py
      echo "Training complete!"

With environment variables:

steps:
  - name: Train model
    env:
      MODEL_TYPE: xgboost
    run: |
      echo "Training ${MODEL_TYPE} model"
      python train.py --model ${MODEL_TYPE}

Default: None (step performs no shell execution if omitted)

upload (optional)

Upload files or folders to H2O Drive. Supports single files, folders, and glob patterns.

Fields: path (local path/glob), destination (Drive URL: drive://bucket/path)

Example:

upload:
  path: models/
  destination: drive://my-bucket/models/trained/

See Storage for path behavior, glob patterns, and Drive URL format.

download (optional)

Download files or folders from H2O Drive. Supports single files and folders.

Fields: source (Drive URL: drive://bucket/path), path (local destination)

Example:

download:
  source: drive://my-bucket/datasets/train.csv
  path: ./data/train.csv

See Storage for download behavior and path handling.
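
Because each step carries exactly one action, a job that needs all three would chain download, run, and upload as separate steps. The following is a sketch that reuses the fields above. The bucket name, paths, and the --data argument are placeholders, and it assumes that files written by one step remain on disk for the next:

steps:
  - name: Download training data
    download:
      source: drive://my-bucket/datasets/train.csv
      path: ./data/train.csv

  # --data is a hypothetical argument of a hypothetical script
  - name: Train model
    run: python train.py --data ./data/train.csv

  - name: Upload trained model
    upload:
      path: models/
      destination: drive://my-bucket/models/trained/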

timeout (optional)

Maximum execution time for this step (e.g., "10m", "2h"). The step is terminated if it runs longer than this limit.

See Timeouts for duration format, scope levels, and precedence rules.
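
For instance, a step-level limit could be written as follows. This is a sketch: the step name and command are placeholders, and the duration string follows the "2h" form shown above.

steps:
  - name: Train model
    timeout: "2h"
    run: python train.py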

continue_on_error (optional)

When set to true, the job continues even if this step fails. Defaults to false, in which case a step failure causes the job to fail.

See Failure Handling for detailed behavior and interaction with fail-fast.

env (optional)

Environment variables for this step.

See Environment Variables for scope, inheritance, and precedence rules.

Step Execution

  • Steps within a job run sequentially in the order they are defined.
  • Each step runs after the previous step completes.
  • If a step fails and continue_on_error is not set to true, subsequent steps in that job do not run.
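
A sketch of the failure rule, using placeholder step names and scripts: the first step may fail without stopping the job, while a failure in the second would prevent the third from running.

steps:
  # Failure here is tolerated; the job moves on to the next step
  - name: Run optional checks
    continue_on_error: true
    run: python check_data.py

  # continue_on_error is not set, so a failure here stops the job's remaining steps
  - name: Train model
    run: python train.py

  - name: Evaluate model
    run: python evaluate.py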
