Scheduled notebook

A two-job workflow that runs every hour. The first job (publish) builds a hello-world notebook in-process with nbformat and uploads it to H2O Drive. The second job (execute) downloads that notebook from Drive in a separate runner and executes it headlessly with papermill.

The Drive round-trip is deliberate. Steps inside a single job share a filesystem; different jobs do not. The pattern shown here (write the artifact to Drive in job A, read it back in job B) is how you share data between jobs. See Storage.
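The isolation described above can be imitated locally. The sketch below is only an analogy, not the H2O Drive API: a temporary directory stands in for Drive, each "job" gets its own throwaway workspace, and the function names (run_publish, run_execute, demo) are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_publish(drive: Path) -> None:
    """Job A: create a file in a private workspace, then copy it to the shared store."""
    with tempfile.TemporaryDirectory() as workspace:
        artifact = Path(workspace) / "hello.ipynb"
        artifact.write_text('{"cells": []}')  # placeholder content, not a real notebook
        target = drive / "notebooks" / "hello.ipynb"
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(artifact, target)
    # The workspace is gone at this point, much like a finished job's filesystem.


def run_execute(drive: Path) -> str:
    """Job B: start from an empty workspace and fetch the file from the shared store."""
    with tempfile.TemporaryDirectory() as workspace:
        local = Path(workspace) / "hello.ipynb"
        assert not local.exists()  # nothing carried over from job A
        shutil.copy(drive / "notebooks" / "hello.ipynb", local)
        return local.read_text()


def demo() -> str:
    """Run both jobs against one shared directory that plays the role of Drive."""
    with tempfile.TemporaryDirectory() as drive_dir:
        drive = Path(drive_dir)
        run_publish(drive)
        return run_execute(drive)


if __name__ == "__main__":
    print(demo())
```

The second job only sees the file because it was copied to the shared location; anything left in the first job's workspace is lost when that workspace is removed.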

This example is scheduled, not callable: it runs itself on a cron trigger rather than being invoked from another workflow.

Workflow

id: scheduled-hello-notebook
name: "Demo: Notebook publish + execute via Drive"

trigger:
  schedule:
    - cron: "0 * * * *" # top of every hour, UTC

jobs:
  publish:
    name: "Build notebook and upload to Drive"
    timeout: "5m"
    steps:
      - name: "Install nbformat"
        run: sudo uv pip install --system nbformat

      - name: "Create notebook"
        run: |
          python3 -c "
          import nbformat as nbf
          nb = nbf.v4.new_notebook()
          nb.cells = [nbf.v4.new_code_cell(\"print('hello world')\")]
          nbf.write(nb, 'hello.ipynb')
          "

      - name: "Upload notebook source to Drive"
        upload:
          path: hello.ipynb
          destination: drive://default/notebooks/hello.ipynb

  execute:
    name: "Download notebook from Drive and execute"
    depends_on: [publish]
    timeout: "10m"
    steps:
      - name: "Install papermill + kernel"
        run: sudo uv pip install --system papermill ipykernel

      - name: "Download notebook source from Drive"
        download:
          source: drive://default/notebooks/hello.ipynb
          path: ./hello.ipynb

      - name: "Execute notebook"
        run: papermill ./hello.ipynb ./out.ipynb --kernel python3 --log-output
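To make the artifact that travels through Drive less opaque, here is a rough sketch of what the "Create notebook" step produces, written with the standard-library json module instead of nbformat. The field layout follows the nbformat v4 schema as commonly documented; nbformat_minor is set to 4 here on the assumption that cell ids are only required from 4.5 onward, so the file that nbformat itself writes may differ in detail. The function names are hypothetical, and in the workflow itself nbformat remains the safer route because it validates the result.

```python
import json
from pathlib import Path


def build_hello_notebook() -> dict:
    """Return a minimal v4 notebook with a single code cell, as a plain dict."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": None,  # the cell has not been run yet
                "metadata": {},
                "outputs": [],            # filled in when the notebook is executed
                "source": "print('hello world')",
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,  # assumption: avoids the cell "id" field used from 4.5
    }


def write_notebook(path: str) -> dict:
    """Serialize the notebook dict to `path` as JSON and return the dict."""
    nb = build_hello_notebook()
    Path(path).write_text(json.dumps(nb, indent=1), encoding="utf-8")
    return nb


if __name__ == "__main__":
    write_notebook("hello.ipynb")
```

Because this notebook carries no kernelspec metadata, the --kernel python3 flag on the papermill command is what selects the kernel.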

What to change

  • The cron expression in trigger.schedule[0].cron — adjust to your cadence.
  • drive://default/notebooks/hello.ipynb — point at your own Drive bucket / path.
  • Replace the inline nbformat notebook-builder with a download step that pulls your real .ipynb from Drive or a Git repository.
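When adjusting the cron expression from the first bullet, it can help to check when a schedule would fire. The sketch below covers only the hourly expression used in this example and lists a few other standard five-field cron expressions for comparison; whether the scheduler accepts every form shown (for instance the */15 step syntax) is an assumption to verify. The names next_hourly_run and COMMON_SCHEDULES are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Standard five-field cron: minute, hour, day of month, month, day of week.
COMMON_SCHEDULES = {
    "0 * * * *": "top of every hour (the value used in this example)",
    "*/15 * * * *": "every 15 minutes",
    "0 6 * * *": "daily at 06:00",
    "0 6 * * 1": "Mondays at 06:00",
}


def next_hourly_run(now: datetime) -> datetime:
    """Return the next time "0 * * * *" fires strictly after `now`, in UTC.

    `now` must be timezone-aware; naive datetimes would be read as local time.
    """
    now_utc = now.astimezone(timezone.utc)
    top_of_hour = now_utc.replace(minute=0, second=0, microsecond=0)
    return top_of_hour + timedelta(hours=1)


if __name__ == "__main__":
    print(next_hourly_run(datetime.now(timezone.utc)).isoformat())
```

Since the schedule is evaluated in UTC, convert to local time before comparing against business hours.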
