Storage
Upload and download steps transfer files and folders to and from H2O Drive during workflow execution.
Overview
Steps can upload and download files using H2O Drive:
- Upload to H2O Drive: Transfer local files/folders to H2O Drive.
- Download from H2O Drive: Transfer files/folders from H2O Drive to the local filesystem.
These are explicit step types, distinct from shell command execution. Currently, only H2O Drive is supported as a storage backend.
Schema
See Schema Reference for the complete #UploadH2ODrive and #DownloadH2ODrive definitions.
Upload
Upload local files or folders to H2O Drive storage.
Fields
path (required)
Local filesystem path or glob pattern to upload.
Supports:
- Single files: model.pkl.
- Folders: models/ (uploads recursively).
- Glob patterns: *.log, models/**/*.pkl.
Examples:
path: model.pkl # Single trained model
path: models/ # Entire models folder
path: "*.json" # All metric files in current dir
path: "checkpoints/**" # All checkpoints recursively
destination (required)
Destination in H2O Drive.
Format: drive://bucket-name/path/to/destination
Examples:
destination: drive://bucket/models/trained_model.pkl
destination: drive://bucket/experiments/exp-123/models/
Path Behavior
Single File:
upload:
  path: training.log
  destination: drive://bucket/logs/training.log
# Local: training.log → Drive: drive://bucket/logs/training.log
Folder (Recursive):
upload:
  path: models/
  destination: drive://bucket/trained-models/models/
# Local: models/model.pkl → Drive: drive://bucket/trained-models/models/model.pkl
# Local: models/checkpoints/epoch_10.h5 → Drive: drive://bucket/trained-models/models/checkpoints/epoch_10.h5
# Structure under models/ is preserved
Standard Drive Behavior:
- The base path (models/) is stripped.
- Structure under it is preserved at destination.
- Trailing / on destination indicates a directory/prefix.
Glob Pattern:
upload:
  path: "*.log"
  destination: drive://bucket/logs/
# Local: training.log → Drive: drive://bucket/logs/training.log
# Local: evaluation.log → Drive: drive://bucket/logs/evaluation.log

upload:
  path: "models/**/*.pkl"
  destination: drive://bucket/model-artifacts/
# Local: models/classifier.pkl → Drive: drive://bucket/model-artifacts/classifier.pkl
# Local: models/ensemble/voting.pkl → Drive: drive://bucket/model-artifacts/ensemble/voting.pkl
# Base (models/) is stripped, structure under it preserved
Download
Download files or folders from H2O Drive storage to the local filesystem.
Fields
source (required)
Source URL in H2O Drive.
Format: drive://bucket-name/path/to/source
Examples:
source: drive://bucket/models/trained_model.pkl
source: drive://bucket/datasets/processed-data/
path (required)
Local filesystem destination path.
Examples:
path: ./model.pkl # Download to specific file
path: ./data/ # Download to folder
Path Behavior
Single File:
download:
  source: drive://bucket/models/trained_model.pkl
  path: ./model.pkl
# Drive: drive://bucket/models/trained_model.pkl → Local: ./model.pkl
Folder (Recursive):
download:
  source: drive://bucket/training-artifacts/models/
  path: ./models/
# Drive: drive://bucket/training-artifacts/models/model.pkl → Local: ./models/model.pkl
# Drive: drive://bucket/training-artifacts/models/features/engineered.parquet → Local: ./models/features/engineered.parquet
# Structure is preserved
Download to Different Path:
download:
  source: drive://bucket/datasets/training-data/
  path: ./data/
# Drive: drive://bucket/datasets/training-data/features.csv → Local: ./data/features.csv
Drive URL Format
H2O Drive URLs follow the standard format:
drive://bucket-name/path/to/object
Components:
- drive:// is the protocol prefix (required).
- bucket-name is the name of the H2O Drive bucket.
- /path/to/object is the object key/path within the bucket.
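As an illustration, the fragment below annotates one of the URLs used earlier on this page with the three components. The bucket name `bucket` and the file paths are placeholders carried over from the examples above, not real resources.

```yaml
# Breakdown of drive://bucket/models/trained_model.pkl
#   drive://                    protocol prefix
#   bucket                      bucket name
#   /models/trained_model.pkl   object key/path within the bucket
download:
  source: drive://bucket/models/trained_model.pkl
  path: ./model.pkl
```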
Common Patterns
Share Artifacts Between Jobs
jobs:
  train:
    steps:
      - name: Upload model
        upload:
          path: models/
          destination: drive://ml-artifacts/models/
  evaluate:
    depends_on: [train]
    steps:
      - name: Download model
        download:
          source: drive://ml-artifacts/models/
          path: ./models/
Download Dataset and Upload Results
steps:
  - name: Download training data
    download:
      source: drive://datasets/features.parquet
      path: ./data/features.parquet
  - name: Train and upload metrics
    run: python train.py
  - name: Upload results
    upload:
      path: "*.json"
      destination: drive://results/
Glob Patterns
Supported glob patterns:
| Pattern | Description | Example |
|---|---|---|
| * | Match any characters (except /) | *.json matches metrics.json, config.json |
| ** | Match any characters (including /) | models/**/*.pkl matches all .pkl files recursively |
| ? | Match single character | model?.pkl matches model1.pkl, modelA.pkl |
| [abc] | Match character set | checkpoint[123].pkl matches checkpoint1.pkl, checkpoint2.pkl |
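The `?` and `[abc]` patterns do not appear in the upload examples above, so here is a minimal sketch of how they might be used in upload steps. The step names, the bucket name `bucket`, and the destination prefixes are illustrative assumptions. The resulting object keys are inferred from the `*.log` example in the Upload section, not confirmed separately.

```yaml
steps:
  - name: Upload numbered models            # hypothetical step name
    upload:
      path: "model?.pkl"                    # e.g. model1.pkl, modelA.pkl
      destination: drive://bucket/models/   # placeholder bucket and prefix
  - name: Upload first three checkpoints    # hypothetical step name
    upload:
      path: "checkpoint[123].pkl"           # checkpoint1.pkl, checkpoint2.pkl, checkpoint3.pkl
      destination: drive://bucket/checkpoints/
```

Quoting every glob pattern is a safe habit in YAML: an unquoted value starting with `*` is read as an alias, and one starting with `[` is read as the start of a flow sequence.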