Editor tour
This tour is meant to be read alongside the YAML editor. You already have a hello-world workflow open on the left — each section below introduces one concept and gives you a complete workflow you can paste-replace into the editor to see it in action.
Every code block in this tour is a complete workflow. Select all in the editor, paste the snippet, and run. No indentation gymnastics.
The starter workflow
id: hello-world
name: Hello World
jobs:
  greet:
    name: Say hello
    steps:
      - name: Print greeting
        run: echo "Hello from H2O Workflows!"
Five keys carry the whole model.
- id — a unique identifier for this workflow within the workspace. Other workflows reference it by this id.
- name — a human-readable label. Optional; shown in the UI and logs.
- jobs — a map of jobs, keyed by id. Each job is an independent unit of work and, by default, runs in parallel with sibling jobs.
- steps — an ordered list of actions inside a job. Steps run sequentially; each one starts a fresh shell.
- run — the shell command for this step. A step must specify exactly one of run, upload, or download.
That's the entire mental model. Everything else is configuration on top of these.
1. Change the greeting
Find the echo line in the editor and edit it. The output appears in the run logs panel.
id: hello-world
name: Hello World
jobs:
  greet:
    name: Say hello
    steps:
      - name: Print greeting
        run: echo "Hello, $(whoami)!"
2. Add a second step
Steps in a job execute top-to-bottom. State (working directory, env vars, processes) is not carried between steps — every step starts a fresh shell.
id: hello-world
name: Hello World
jobs:
  greet:
    name: Say hello
    steps:
      - name: Print greeting
        run: echo "Hello from H2O Workflows!"
      - name: Show the date
        run: date -u
If a step fails, the rest of the job stops by default. Set continue_on_error: true on a step to keep going. See Failure handling.
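As a minimal sketch of that behavior, the workflow below marks a deliberately failing step with continue_on_error: true, so the final step should still run. The step names and the failing command are illustrative, and placing the key at the top level of the step is inferred from the sentence above rather than taken from the schema.

```yaml
id: hello-world
name: Hello World
jobs:
  greet:
    name: Say hello
    steps:
      - name: Print greeting
        run: echo "Hello from H2O Workflows!"
      - name: Fail on purpose
        # exit 1 returns a non-zero status, so this step fails.
        run: exit 1
        # Assumed placement: a step-level key, as described above.
        continue_on_error: true
      - name: Show the date
        run: date -u
```

Remove the continue_on_error line and run again to compare: the job should stop at the failing step and never reach "Show the date".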
3. Add a second job
Jobs run in parallel unless you declare a dependency. depends_on makes the second job wait until the first succeeds — remove it and they run concurrently.
id: hello-world
name: Hello World
jobs:
  greet:
    name: Say hello
    steps:
      - name: Print greeting
        run: echo "Hello!"
  farewell:
    name: Say goodbye
    depends_on: [greet]
    steps:
      - name: Print farewell
        run: echo "Goodbye!"
See Jobs for runners, timeouts, and dependency graphs.
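A sketch of a small fan-in graph, assuming depends_on accepts more than one job id in its list (the tour itself only shows a single entry). The job ids branch-a, branch-b, and combine are made up for illustration.

```yaml
id: hello-world
name: Hello World
jobs:
  branch-a:
    name: First branch
    steps:
      - name: Work on A
        run: echo "A done"
  branch-b:
    name: Second branch
    steps:
      - name: Work on B
        run: echo "B done"
  combine:
    name: Join the branches
    # Assumption: listing two ids makes this job wait for both to succeed.
    depends_on: [branch-a, branch-b]
    steps:
      - name: Report
        run: echo "Both branches finished"
```

Under that assumption, branch-a and branch-b have no dependency on each other and can run concurrently, while combine starts only after both succeed.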
4. Take an input
Inputs are typed parameters supplied at trigger time. Reference them in YAML with ${{ .inputs.<name> }} (Go template syntax). Supported types are string, bool, and int.
id: hello-world
name: Hello World
inputs:
  who:
    type: string
    required: true
    description: "Who to greet"
jobs:
  greet:
    name: Say hello
    steps:
      - name: Print greeting
        run: echo "Hello, ${{ .inputs.who }}!"
See Inputs and Expressions.
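The other two types can be sketched the same way. This example assumes an int input renders as plain digits and a bool input renders as the text true or false when substituted into a command; the input names times and shout are hypothetical.

```yaml
id: hello-world
name: Hello World
inputs:
  who:
    type: string
    required: true
    description: "Who to greet"
  times:
    type: int
    required: true
    description: "How many times to greet"
  shout:
    type: bool
    required: true
    description: "Print the greeting in upper case"
jobs:
  greet:
    name: Say hello
    steps:
      - name: Print greeting
        run: |
          # Assumption: ints render as digits, bools as true/false.
          for i in $(seq 1 ${{ .inputs.times }}); do
            if [ "${{ .inputs.shout }}" = "true" ]; then
              echo "Hello, ${{ .inputs.who }}!" | tr '[:lower:]' '[:upper:]'
            else
              echo "Hello, ${{ .inputs.who }}!"
            fi
          done
```

The template values are substituted before the shell runs, so the script only ever sees literal text in place of each ${{ }} expression.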
5. Move files in and out with Drive
upload and download steps move files between the runner and H2O Drive. Drive paths use the form drive://<bucket-uuid>/<path>.
Use this to share artifacts between jobs (the runner filesystem does not persist across jobs) and to keep results after the run completes.
id: hello-world
name: Hello World
jobs:
  process:
    name: Process a dataset
    steps:
      - name: Download dataset
        download:
          source: drive://my-bucket/datasets/raw.csv
          path: ./data.csv
      - name: Process it
        run: |
          wc -l data.csv > result.txt
          cat result.txt
      - name: Upload result
        upload:
          path: result.txt
          destination: drive://my-bucket/outputs/result.txt
Replace my-bucket with one of your H2O Drive bucket UUIDs. See Storage.
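Because the runner filesystem does not persist across jobs, handing a file from one job to the next goes through Drive. A sketch using only the upload, download, and depends_on keys shown in this tour; the job ids and the handoff/ path are illustrative, and my-bucket is again a placeholder for a bucket UUID.

```yaml
id: hello-world
name: Hello World
jobs:
  produce:
    name: Produce a file
    steps:
      - name: Write a report
        run: date -u > report.txt
      - name: Upload report
        upload:
          path: report.txt
          destination: drive://my-bucket/handoff/report.txt
  consume:
    name: Consume the file
    # Wait for the upload to finish before downloading.
    depends_on: [produce]
    steps:
      - name: Download report
        download:
          source: drive://my-bucket/handoff/report.txt
          path: ./report.txt
      - name: Read it
        run: cat report.txt
```

Without depends_on the two jobs would run in parallel, and consume could try to download the report before produce has uploaded it.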
6. Run it on a schedule
Add a trigger.schedule with one or more cron expressions. You can also pass inputs per schedule entry.
id: hello-world
name: Hello World
trigger:
  schedule:
    - cron: "0 6 * * *" # every day at 06:00 UTC
jobs:
  greet:
    name: Say hello
    steps:
      - name: Print greeting
        run: echo "Hello from H2O Workflows!"
See Triggers.
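Since schedule is a list, more than one cron expression can be given. A sketch with two entries, assuming standard five-field cron semantics (day-of-week 0 is Sunday) and the same UTC interpretation as the example above.

```yaml
id: hello-world
name: Hello World
trigger:
  schedule:
    - cron: "0 6 * * 1-5" # Monday to Friday at 06:00 UTC
    - cron: "0 18 * * 0"  # Sunday at 18:00 UTC
jobs:
  greet:
    name: Say hello
    steps:
      - name: Print greeting
        run: echo "Hello from H2O Workflows!"
```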
7. Use a secret
Secrets come from H2O Secure Store and are masked in logs. Declare them at the workflow level and reference them in steps with ${{ .secrets.<name> }}.
id: hello-world
name: Hello World
secrets:
  api_token:
    resource: secrets/my-team/api-token
jobs:
  call-api:
    name: Call an API
    steps:
      - name: Fetch data
        run: |
          curl -sS \
            -H "Authorization: Bearer ${{ .secrets.api_token }}" \
            https://api.example.com/data
See Secrets.
8. Talk to an H2O product
The runner injects a platform access token into every step as H2O_CLOUD_CLIENT_PLATFORM_TOKEN. You don't declare it — it's already there, scoped to the workspace running the workflow. Pass it to the login() helper of any H2O Python client.
id: hello-world
name: Hello World
jobs:
  list-engines:
    name: List DAI engines
    steps:
      - name: Install the AI Engine Manager client
        run: sudo uv pip install --system h2o-engine-manager
      - name: List my engines
        run: |
          python3 -c "
          import os, h2o_engine_manager
          clients = h2o_engine_manager.login(
              platform_token=os.environ['H2O_CLOUD_CLIENT_PLATFORM_TOKEN'],
          )
          print(clients.dai_engine_client.list_engines())
          "
Two things worth noticing.
- sudo uv pip install --system is the install pattern on this runner. pip is not pre-installed; the system Python is owned by root; uv is the supported installer.
- The install in step 1 carries into step 2. Steps share the runner's filesystem, so --system packages survive even though every step starts a fresh shell. Shell state does not carry over: a cd, an export, or a change to $PATH lasts only until the step ends.
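The difference between filesystem state and shell state can be checked with a small workflow. This is a sketch using only run steps; the job id, the variable GREETING, and the file note.txt are made up for illustration.

```yaml
id: hello-world
name: Hello World
jobs:
  state-demo:
    name: What survives between steps
    steps:
      - name: Change state
        run: |
          echo "hello" > note.txt   # written to the shared filesystem
          export GREETING="hello"   # lives only in this step's shell
          cd /tmp                   # also limited to this step's shell
      - name: Check what survived
        run: |
          # Falls back to "unset" if the export did not carry over.
          echo "GREETING is ${GREETING:-unset}"
          pwd
          cat note.txt
```

Given the behavior described above, the second step should report GREETING as unset and start from the default working directory rather than /tmp, while note.txt is still readable.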
See Runner environment for the full list of pre-installed tools and per-product client install commands (MLOps, H2O-3, Feature Store, Driverless AI).
Where to go next
Once you're past hello-world, the topics below cover everything else.
Core syntax
- Workflow structure — every top-level field.
- Jobs — runners, depends_on, timeouts.
- Steps — run, upload, download, working_dir, env.
- Runner environment — auto-injected token, pre-installed tools, installing Python clients.
- Expressions — the ${{ }} template syntax.
Parameterization
- Inputs — typed parameters.
- Environment variables — static config.
- Secrets — sensitive values.
Scaling out
- Matrix jobs — fan out a job over parameter combinations.
- Reusable workflows — call one workflow from another.
- Concurrency — prevent overlapping runs.
Operating workflows
- Triggers — cron schedules.
- Storage — H2O Drive paths and glob patterns.
- Timeouts — bound runtime at step, job, and workflow levels.
- Failure handling — continue_on_error and cancel_on_failure.
End-to-end examples
- Simple pipeline — three-job pipeline with inputs and triggers.
- Matrix data processing — fan-out processing.
- Reusable workflow — workflow-call patterns.
Reference
- Schema reference — complete CUE schema.
- API reference — Python SDK and REST API.