Reusable workflow

This guide walks you through a reusable workflow that other workflows can call as a step. You will learn how to make a workflow callable, pass secret inputs, reference workspace secrets, and handle errors gracefully with continue_on_error.

Step 1: Understand the workflow

Three jobs run in sequence: download-and-validate fetches and validates input data, transform applies transformations using an external API (with a secret-authenticated service call that tolerates failures), and finalize publishes the output. The workflow is callable: it cannot run on its own and is only invoked by other workflows via workflow_call steps.

Step 2: Walk through the YAML

Callable trigger

Setting callable: true makes this workflow a reusable building block:

trigger:
  callable: true

Other workflows invoke it using a workflow_call step and pass inputs at call time. This workflow has no cron schedule — it only runs when called. See Reusable Workflows and Triggers.
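
As a sketch, a calling workflow might invoke this one from a workflow_call step and supply the inputs at call time. The job name, input values, and field layout below are hypothetical; consult Reusable Workflows for the exact call-site schema:

```yaml
# Hypothetical caller workflow -- names and values are illustrative only
jobs:
  nightly-pipeline:
    steps:
      - name: Run reusable data pipeline
        workflow_call:
          workflow: reusable-workflow
          with:
            source_bucket: raw-data
            source_path: daily/
            destination_bucket: processed-data
            api_key: ${{ .secrets.external_api_key }}
```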

Secret inputs

The api_key input uses secret: true to tell H2O Workflows to mask this value in logs and the UI:

api_key:
  type: string
  required: true
  secret: true
  description: "API key for external service"

The secret: true flag does not change how the value is passed; it only controls visibility. See Inputs.

Workspace secrets

The secrets block fetches a secret from H2O Secure Store and makes it available inside job steps:

secrets:
  - name: workspaces/019a55f6-2c62-746f-a49b-4e42f470f26c/secrets/service-token
    as: service_token

Once declared, the secret is accessible via ${{ .secrets.service_token }} in any job step. See Secrets.

Using secrets in steps

The transform job exposes both the secret input and the workspace secret as environment variables:

env:
  API_KEY: ${{ .inputs.api_key }}
  SERVICE_TOKEN: ${{ .secrets.service_token }}

Injecting secrets as environment variables keeps them out of the YAML source and out of command-line arguments. See Expressions.
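
Inside a step script, the injected values are read like ordinary environment variables. A minimal sketch in Python (the variable names match the env block above; the helper function itself is hypothetical):

```python
import os


def load_credentials() -> tuple[str, str]:
    """Read secrets injected by the workflow runner via the job's env block.

    Indexing os.environ directly raises KeyError if a variable is missing,
    which fails fast for required credentials instead of proceeding with
    an empty value.
    """
    return os.environ["API_KEY"], os.environ["SERVICE_TOKEN"]
```

Avoid printing these values from your scripts; masking applies to workflow logs and the UI, not to arbitrary output your own code produces elsewhere.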

Error handling with continue_on_error

The external service call is marked continue_on_error: true so that a failure does not stop the job:

- name: Call external service
  run: python scripts/call_service.py --data data/output/ --token "$SERVICE_TOKEN"
  continue_on_error: true

If this step fails, the job continues to the next step instead of failing immediately. Use this pattern for non-critical operations where partial failure is acceptable. See Failure Handling.
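
A step is considered failed when its command exits with a non-zero status; with continue_on_error: true the job records that failure and moves on to the next step. A hypothetical sketch of what scripts/call_service.py might look like (the function and its arguments are illustrative, not the actual script):

```python
def call_service(data_dir: str, token: str) -> int:
    """Sketch of a non-critical service call.

    Returns the process exit code: 0 on success, non-zero on failure.
    With continue_on_error: true, a non-zero exit marks this step as
    failed but lets the remaining steps (e.g. the upload) still run.
    """
    if not token:
        # Non-zero exit -> the step fails, but the job continues.
        return 1
    # ... send the contents of data_dir to the external service here ...
    return 0
```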

Step 3: Deploy with the Python SDK

import h2o_workflows
from h2o_workflows.workflow.workflow import Workflow

clients = h2o_workflows.login()

with open("examples/reusable-workflow.yaml") as f:
    source = f.read()

workflow = clients.workflow.create_workflow(
    parent="workspaces/my-workspace",
    workflow=Workflow(source_contents=source),
)
print(f"Created: {workflow.name}")

# Activate so other workflows can call it
clients.workflow.activate_workflow(name=workflow.name)
print("Reusable workflow activated -- ready to be called")

Since this workflow uses trigger.callable: true, it will not run on its own. Other workflows invoke it using a workflow_call step. For the full client API, see the Python SDK Reference.

Complete YAML

id: reusable-workflow
name: Reusable Workflow

trigger:
  callable: true

inputs:
  source_bucket:
    type: string
    required: true
    description: "Source Drive Workspace bucket"

  source_path:
    type: string
    required: true
    description: "Path within source bucket"

  destination_bucket:
    type: string
    required: true
    description: "Destination Drive Workspace bucket"

  api_key:
    type: string
    required: true
    secret: true
    description: "API key for external service"

secrets:
  - name: workspaces/019a55f6-2c62-746f-a49b-4e42f470f26c/secrets/service-token
    as: service_token

env:
  SCRIPTS_REPO: "https://github.com/h2oai/project-scripts.git"

jobs:
  download-and-validate:
    name: Download and validate input
    runner: cpu-medium
    timeout: "20m"

    steps:
      - name: Download source data
        download:
          source: drive://${{ .inputs.source_bucket }}/${{ .inputs.source_path }}
          path: ./data/input/

      - name: Clone scripts
        run: git clone --depth 1 $SCRIPTS_REPO scripts

      - name: Install dependencies
        run: pip install -r scripts/requirements.txt

      - name: Validate input
        run: python scripts/validate.py --input data/input/

      - name: Upload validated data
        upload:
          path: data/input/
          destination: drive://${{ .inputs.destination_bucket }}/validated/

  transform:
    name: Transform data
    depends_on: [download-and-validate]
    runner: cpu-large
    timeout: "30m"

    env:
      API_KEY: ${{ .inputs.api_key }}
      SERVICE_TOKEN: ${{ .secrets.service_token }}

    steps:
      - name: Download validated data
        download:
          source: drive://${{ .inputs.destination_bucket }}/validated/
          path: ./data/input/

      - name: Clone scripts
        run: git clone --depth 1 $SCRIPTS_REPO scripts

      - name: Install dependencies
        run: pip install -r scripts/requirements.txt

      - name: Run transformation
        run: python scripts/transform.py --input data/input/ --output data/output/ --api-key "$API_KEY"

      - name: Call external service
        run: python scripts/call_service.py --data data/output/ --token "$SERVICE_TOKEN"
        continue_on_error: true

      - name: Upload transformed data
        upload:
          path: data/output/
          destination: drive://${{ .inputs.destination_bucket }}/transformed/

  finalize:
    name: Finalize and publish
    depends_on: [transform]
    runner: cpu-small
    timeout: "10m"

    steps:
      - name: Download transformed data
        download:
          source: drive://${{ .inputs.destination_bucket }}/transformed/
          path: ./data/final/

      - name: Clone scripts
        run: git clone --depth 1 $SCRIPTS_REPO scripts

      - name: Install dependencies
        run: pip install -r scripts/requirements.txt

      - name: Finalize output
        run: python scripts/finalize.py --input data/final/ --output data/published/

      - name: Upload final output
        upload:
          path: data/published/
          destination: drive://${{ .inputs.destination_bucket }}/published/
