Steps
Steps are individual actions that run within a job. They execute sequentially in the order you define them.
Each step must specify exactly one action: run, upload, or download.
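As a minimal sketch of the one-action rule, a task that runs a command and then uploads its output is written as two steps rather than one. The step names, script, and Drive URL below are illustrative placeholders.

```yaml
steps:
  # Each step specifies exactly one of run, upload, or download
  - name: Train model
    run: python train.py
  - name: Upload trained model
    upload:
      path: models/
      destination: drive://my-bucket/models/trained/
```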
Fields
name (optional)
Display name of the step. Use this field to describe what action the step performs.
Type: string
Example:
steps:
  - name: Load training data
    run: python load_data.py
working_dir (optional)
Working directory for this step. This setting overrides the job-level working_dir if set.
Type: string
Behavior:
- Sets the current working directory for this step
- Affects where shell commands in run execute
- Affects how relative paths in upload.path and download.path resolve
- Absolute paths are not affected by working_dir
Inheritance:
- If the job has working_dir, all steps inherit it
- A step can override with its own working_dir
- Step override takes precedence
Example:
steps:
  - name: Train baseline model
    working_dir: ./experiments/baseline
    run: python train.py
  - name: Train advanced model
    working_dir: ./experiments/advanced
    run: python train.py
With upload and download:
steps:
  - name: Upload model from experiment
    working_dir: ./experiments
    upload:
      path: models/  # Relative to ./experiments
      destination: drive://bucket/models/
Default: Inherits from the job's working_dir, or uses the home directory if not set.
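The inheritance rules above can be sketched with a job-level working_dir and one step that overrides it. The job ID and all paths are hypothetical placeholders, and this page does not state how a relative step-level path relates to the job-level one, so the example avoids depending on that.

```yaml
jobs:
  train:                          # hypothetical job ID
    working_dir: ./project        # job-level default for every step
    steps:
      - name: Install dependencies
        # No step-level working_dir, so this step inherits ./project
        run: pip install -r requirements.txt
      - name: Train baseline model
        # The step-level value takes precedence over the job-level one
        working_dir: ./experiments/baseline
        run: python train.py
```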
run (optional)
Shell commands to execute in this step. Each step's run invocation executes in a new shell session.
Type: string
Format: A single-line string, or a multi-line string written with YAML's block scalar indicators | (literal) or > (folded).
Shell session behavior:
- Each step with run starts a new shell session
- Environment variables from env are available in the shell
- The shell exits after command execution completes
- State is not preserved between steps
Examples:
Single-line command:
steps:
  - name: Install dependencies
    run: pip install -r requirements.txt
Multi-line script:
steps:
  - name: Train and validate
    run: |
      python preprocess.py
      python train.py
      echo "Training complete!"
With environment variables:
steps:
  - name: Train model
    env:
      MODEL_TYPE: xgboost
    run: |
      echo "Training ${MODEL_TYPE} model"
      python train.py --model ${MODEL_TYPE}
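Because each run step starts a new shell session, a variable exported in one step should not be visible in the next. The sketch below illustrates that expectation; RUN_ID and its value are hypothetical and not part of the product.

```yaml
steps:
  - name: Set a shell variable
    run: |
      export RUN_ID=demo-123
      echo "RUN_ID in this step: ${RUN_ID}"
  - name: Read it in a later step
    run: |
      # A new shell session starts here, so RUN_ID is expected to be unset
      echo "RUN_ID in the next step: ${RUN_ID:-not set}"
```

To share a value across steps, one option is to declare it under env at a level that both steps can see. The complete example on this page instead passes data between steps through files.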
upload (optional)
Upload files or folders to H2O Drive. Supports single files, folders, and glob patterns.
Type: Upload object with path and destination fields
Example:
steps:
  - name: Upload trained model
    upload:
      path: models/
      destination: drive://my-bucket/models/trained/
For path behavior, glob patterns, and Drive URL format, see Storage.
download (optional)
Download files or folders from H2O Drive. Supports single files and folders.
Type: Download object with source and path fields
Example:
steps:
  - name: Download dataset
    download:
      source: drive://my-bucket/datasets/train.csv
      path: ./data/train.csv
For download behavior and path handling, see Storage.
timeout (optional)
Maximum execution time for this step. The system terminates the step if execution exceeds this duration.
Type: Duration string (for example, "10m", "2h")
For duration format, scope levels, and precedence rules, see Timeouts.
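A minimal sketch of a step-level timeout, using one of the duration forms shown above. The step name and script are placeholders.

```yaml
steps:
  - name: Train model
    timeout: "1h"   # the step is terminated if it runs longer than one hour
    run: python train.py
```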
continue_on_error (optional)
Whether the job continues executing when this step fails.
Type: bool
Default: false (step failure causes job failure)
For detailed behavior and interaction with fail-fast, see Failure handling.
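As a sketch, continue_on_error suits a step whose failure should not block later work, such as an optional check. Script names and the Drive URL are placeholders, and any interaction with fail-fast is outside what this example shows.

```yaml
steps:
  - name: Run optional validation
    continue_on_error: true   # a failure here does not stop the job
    run: python validate.py --strict
  - name: Upload results
    # Still runs if the validation step above fails
    upload:
      path: ./output/
      destination: drive://results-bucket/output/
```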
env (optional)
Environment variables for this step.
Type: Map of string to string
For scope, inheritance, and precedence rules, see Environment variables.
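A minimal sketch of step-level env with more than one variable. Since the documented type is a map of string to string, the numeric value is quoted so that YAML keeps it as a string. N_ESTIMATORS and the --n-estimators flag are hypothetical, chosen only for illustration.

```yaml
steps:
  - name: Train model
    env:
      MODEL_TYPE: xgboost
      N_ESTIMATORS: "200"   # quoted so the value stays a string
    run: python train.py --model ${MODEL_TYPE} --n-estimators ${N_ESTIMATORS}
```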
Step execution
- Steps within a job run sequentially in the order you define them
- Each step runs after the previous step completes
- If a step fails and continue_on_error is not true, subsequent steps in that job do not run
Complete example
jobs:
  process:
    name: Process and Upload Data
    working_dir: ./project
    timeout: "2h"
    env:
      LOG_LEVEL: info
    steps:
      - name: Download raw data
        download:
          source: drive://data-bucket/raw/dataset.csv
          path: ./data/raw.csv
      - name: Preprocess data
        timeout: "30m"
        run: |
          python preprocess.py --input ./data/raw.csv --output ./data/processed.csv
          echo "Preprocessing complete"
      - name: Train model
        timeout: "1h"
        env:
          MODEL_TYPE: xgboost
        run: python train.py --data ./data/processed.csv --model $MODEL_TYPE
      - name: Run optional validation
        continue_on_error: true
        run: python validate.py --strict
      - name: Upload results
        upload:
          path: ./output/
          destination: drive://results-bucket/output/