Workflow syntax overview
H2O Orchestrator workflows define automated pipelines using YAML syntax. You can create workflows that orchestrate data processing, model training, deployment, and other ML operations.
Key concepts
Workflows are organized into three nested levels:
- Workflow: The top-level container that defines the pipeline name, inputs, and jobs.
- Jobs: Independent units of work that run in parallel or sequentially based on dependencies.
- Steps: Sequential actions within a job, such as shell commands or file transfers.
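To make "in parallel or sequentially based on dependencies" concrete, the sketch below groups jobs into waves: jobs in the same wave have no unmet dependencies and could run at the same time, while later waves wait for earlier ones. This is a conceptual illustration only, not H2O Orchestrator's scheduler. The job names and the `execution_waves` helper are hypothetical, and the sketch assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies):
    """Group jobs into waves; jobs within one wave have all dependencies met."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


# Hypothetical jobs: each job name maps to the set of jobs it depends on.
jobs = {
    "ingest": set(),
    "validate": set(),
    "train": {"ingest", "validate"},
    "deploy": {"train"},
}

waves = execution_waves(jobs)
print(waves)
```

Here `ingest` and `validate` share a wave because neither depends on the other, whereas `train` and `deploy` each wait for the wave before them.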
Quick start
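The following example shows a basic data pipeline workflow with a single job that clones a repository, downloads a dataset, processes it, and uploads the results: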
    id: data-pipeline
    name: Data Pipeline
    jobs:
      process:
        name: Process Data
        steps:
          - name: Clone repository
            run: git clone https://github.com/org/data-processing.git
          - name: Download dataset
            download:
              source: drive://bucket/datasets/raw-data.csv
              path: ./data.csv
          - name: Process data
            run: python data-processing/process.py --input data.csv --output results.json
          - name: Upload results
            upload:
              path: results.json
              destination: drive://bucket/processed/results.json
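One way to read the example above is that each step carries a `name` plus one action key: `run`, `download`, or `upload`. The sketch below transcribes the workflow as a plain Python dict and labels every step by its action key. It is an illustration of the structure only. The `step_type` and `summarize` helpers are hypothetical, and the "exactly one action key per step" check is an assumption drawn from this example rather than from the schema reference.

```python
# The quick-start workflow, transcribed by hand as a Python dict.
workflow = {
    "id": "data-pipeline",
    "name": "Data Pipeline",
    "jobs": {
        "process": {
            "name": "Process Data",
            "steps": [
                {"name": "Clone repository",
                 "run": "git clone https://github.com/org/data-processing.git"},
                {"name": "Download dataset",
                 "download": {"source": "drive://bucket/datasets/raw-data.csv",
                              "path": "./data.csv"}},
                {"name": "Process data",
                 "run": "python data-processing/process.py "
                        "--input data.csv --output results.json"},
                {"name": "Upload results",
                 "upload": {"path": "results.json",
                            "destination": "drive://bucket/processed/results.json"}},
            ],
        }
    },
}

# Action keys seen in the example; assumed here to be mutually exclusive.
STEP_TYPES = ("run", "download", "upload")


def step_type(step):
    """Return the single action key a step uses, or raise if ambiguous."""
    found = [key for key in STEP_TYPES if key in step]
    if len(found) != 1:
        raise ValueError(f"step {step.get('name')!r} has action keys {found}")
    return found[0]


def summarize(wf):
    """List (job id, step name, step type) in the order steps are declared."""
    return [
        (job_id, step["name"], step_type(step))
        for job_id, job in wf["jobs"].items()
        for step in job["steps"]
    ]


summary = summarize(workflow)
for row in summary:
    print(row)
```

The summary lists the four steps of the `process` job in declaration order, which is also the order in which steps within a job run.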
Documentation guide
Core concepts
| Topic | Description |
|---|---|
| Workflow structure | Top-level workflow configuration and fields |
| Jobs | Job configuration, runners, and dependencies |
| Steps | Step types: shell commands, uploads, and downloads |
Features
| Topic | Description |
|---|---|
| Inputs | Define typed parameters for workflows |
| Triggers | Schedule workflows with cron expressions |
| Expressions | Use dynamic values with ${{ }} syntax |
| Concurrency | Prevent simultaneous workflow executions |
| Reusable workflows | Call workflows from other workflows |
| Matrix jobs | Run jobs across multiple configurations |
| Storage | Upload and download files with H2O Drive |
| Failure handling | Control behavior when jobs or steps fail |
| Timeouts | Set maximum execution time for jobs and steps |
| Environment variables | Configure key-value pairs available at runtime |
| Secrets | Access sensitive data from H2O Secure Store |
Reference
| Topic | Description |
|---|---|
| Schema reference | Complete field definitions and types |