Workflow syntax
This section is the reference for H2O Workflows YAML syntax. Each workflow you define is a YAML file that describes jobs, steps, inputs, triggers, and other configuration that the H2O Workflows platform compiles and executes. Each sub-page documents a specific aspect of the workflow definition language.
What you will find
- Workflow — Top-level workflow structure and fields.
- Jobs — Independent units of work, runners, and dependencies.
- Steps — Sequential operations within a job (run, upload, download, workflow_call).
- Inputs — Typed parameters for workflows (string, boolean, integer).
- Triggers — Schedule workflows with cron expressions.
- Expressions — Go template syntax for dynamic values.
- Concurrency — Prevent concurrent executions of the same workflow.
- Matrix Jobs — Run jobs across multiple configurations.
- Reusable Workflows — Call workflows from other workflows.
- Storage — Upload and download files to H2O Drive.
- Environment Variables — Static key-value configuration.
- Secrets — Reference secrets from H2O Secure Store.
- Failure Handling — Control behavior when jobs or steps fail.
- Timeouts — Maximum execution time for jobs and steps.
- Schema Reference — Complete CUE schema definitions.
Quick start
id: data-pipeline
name: Data Pipeline
jobs:
  process:
    name: Process Data
    # Steps run sequentially within the job.
    steps:
      - name: Clone repository
        run: git clone https://github.com/org/data-processing.git
      - name: Download dataset
        # Fetch the input file from H2O Drive.
        download:
          source: drive://bucket/datasets/raw-data.csv
          path: ./data.csv
      - name: Process data
        run: python data-processing/process.py --input data.csv --output results.json
      - name: Upload results
        # Store the output file in H2O Drive.
        upload:
          path: results.json
          destination: drive://bucket/processed/results.json
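For comparison, the following is a smaller sketch that reuses only fields already shown in the quick start (id, name, jobs, steps, run). The identifiers hello-workflow and greet and the echo command are hypothetical, and this page does not state which fields are required, so check the Workflow and Schema Reference pages before relying on this shape.

```yaml
# Minimal sketch: uses only fields that appear in the quick start.
# The id, job key, names, and command below are hypothetical values.
id: hello-workflow
name: Hello Workflow
jobs:
  greet:
    name: Greet
    steps:
      - name: Print a message
        run: echo "Hello from H2O Workflows"
```

A single job with a single run step is a convenient starting point; download and upload steps can then be added in the order the job needs them, as in the quick start.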