Storage
Upload and download steps enable transferring files and folders to and from H2O Drive during workflow execution.
Overview
Steps can upload and download files using H2O Drive:
- Upload to H2O Drive: Transfer local files and folders to H2O Drive
- Download from H2O Drive: Transfer files and folders from H2O Drive to the local filesystem
These are explicit step types, distinct from shell command execution. Currently, only H2O Drive is supported as a storage backend.
Upload
Upload local files or folders to H2O Drive storage.
Fields
path (required)
Local filesystem path or glob pattern to upload.
Supports:
- Single files: model.pkl
- Folders: models/ (uploads recursively)
- Glob patterns: *.log, models/**/*.pkl
Examples:
path: model.pkl # Single trained model
path: models/ # Entire models folder
path: "*.json" # All JSON files (for example, metrics) in the current directory
path: "checkpoints/**" # All checkpoints recursively
destination (required)
Destination in H2O Drive.
Format: drive://bucket-name/path/to/destination
Examples:
destination: drive://bucket/models/trained_model.pkl
destination: drive://bucket/experiments/exp-123/models/
Path behavior
Single file:
upload:
  path: training.log
  destination: drive://bucket/logs/training.log
# Local: training.log → Drive: drive://bucket/logs/training.log
Folder (recursive):
upload:
  path: models/
  destination: drive://bucket/trained-models/models/
# Local: models/model.pkl → Drive: drive://bucket/trained-models/models/model.pkl
# Local: models/checkpoints/epoch_10.h5 → Drive: drive://bucket/trained-models/models/checkpoints/epoch_10.h5
# Structure under models/ is preserved
Standard Drive behavior:
- The base path (models/) is stripped
- Structure under it is preserved at the destination
- Trailing / on destination indicates a directory or prefix
Glob pattern:
upload:
  path: "*.log"
  destination: drive://bucket/logs/
# Local: training.log → Drive: drive://bucket/logs/training.log
# Local: evaluation.log → Drive: drive://bucket/logs/evaluation.log

upload:
  path: "models/**/*.pkl"
  destination: drive://bucket/model-artifacts/
# Local: models/classifier.pkl → Drive: drive://bucket/model-artifacts/classifier.pkl
# Local: models/ensemble/voting.pkl → Drive: drive://bucket/model-artifacts/ensemble/voting.pkl
# Base (models/) is stripped, structure under it preserved
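The mapping shown in the upload examples above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not H2O Drive's implementation. The helper names `base_parts` and `destination_url` are hypothetical. The handling of a single file sent to a destination ending in / is an assumption, because the examples above do not cover that case.

```python
from pathlib import PurePosixPath

GLOB_CHARS = set("*?[")


def base_parts(pattern: str) -> tuple:
    """Leading path components of `pattern` that contain no glob characters."""
    parts = []
    for part in PurePosixPath(pattern).parts:
        if GLOB_CHARS & set(part):
            break
        parts.append(part)
    return tuple(parts)


def destination_url(local_file: str, pattern: str, destination: str) -> str:
    """Hypothetical helper: the Drive URL a matched local file would land at."""
    # No trailing "/": the destination names the object itself (single file).
    if not destination.endswith("/"):
        return destination
    base = base_parts(pattern)
    parts = PurePosixPath(local_file).parts
    if parts[:len(base)] != base:
        raise ValueError(f"{local_file!r} is not under the base of {pattern!r}")
    # Strip the base path and keep the structure under it.
    # Assumption: a single file sent to a directory destination keeps its name.
    relative = parts[len(base):] or parts[-1:]
    return destination + "/".join(relative)


print(destination_url("models/ensemble/voting.pkl",
                      "models/**/*.pkl",
                      "drive://bucket/model-artifacts/"))
# drive://bucket/model-artifacts/ensemble/voting.pkl
```

The base is taken to be the part of the pattern before the first glob character, which reproduces each of the mappings listed in the comments above.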
Download
Download files or folders from H2O Drive storage to the local filesystem.
Fields
source (required)
Source location in H2O Drive, given as a Drive URL.
Format: drive://bucket-name/path/to/source
Examples:
source: drive://bucket/models/trained_model.pkl
source: drive://bucket/datasets/processed-data/
path (required)
Local filesystem destination path.
Examples:
path: ./model.pkl # Download to specific file
path: ./data/ # Download to folder
Path behavior
Single file:
download:
  source: drive://bucket/models/trained_model.pkl
  path: ./model.pkl
# Drive: drive://bucket/models/trained_model.pkl → Local: ./model.pkl
Folder (recursive):
download:
  source: drive://bucket/training-artifacts/models/
  path: ./models/
# Drive: drive://bucket/training-artifacts/models/model.pkl → Local: ./models/model.pkl
# Drive: drive://bucket/training-artifacts/models/features/engineered.parquet → Local: ./models/features/engineered.parquet
# Structure is preserved
Download to a different path:
download:
  source: drive://bucket/datasets/training-data/
  path: ./data/
# Drive: drive://bucket/datasets/training-data/features.csv → Local: ./data/features.csv
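The download examples above follow the mirror-image rule: the source prefix is stripped and the remainder is placed under the local path. A minimal Python sketch of that mapping follows. The function name `local_path` is hypothetical and the code is not taken from H2O Drive.

```python
def local_path(object_url: str, source: str, path: str) -> str:
    """Hypothetical helper: local path a downloaded Drive object is written to."""
    # No trailing "/": the source names a single object, so `path` is the file.
    if not source.endswith("/"):
        return path
    if not object_url.startswith(source):
        raise ValueError(f"{object_url!r} is not under {source!r}")
    # Strip the source prefix and keep the structure under it.
    relative = object_url[len(source):]
    return path.rstrip("/") + "/" + relative


print(local_path(
    "drive://bucket/training-artifacts/models/features/engineered.parquet",
    "drive://bucket/training-artifacts/models/",
    "./models/",
))
# ./models/features/engineered.parquet
```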
Drive URL format
H2O Drive URLs follow the standard format:
drive://bucket-name/path/to/object
Components:
| Component | Description |
|---|---|
| drive:// | Protocol prefix (required) |
| bucket-name | Name of the H2O Drive bucket |
| /path/to/object | Object key or path within the bucket |
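To make the components concrete, here is a small Python sketch that splits a Drive URL into its bucket and object key. The names `DriveURL` and `parse_drive_url` are hypothetical and exist only for this illustration. Returning the key without its leading / is also a choice made here, not something the format above prescribes.

```python
from typing import NamedTuple

PREFIX = "drive://"


class DriveURL(NamedTuple):
    bucket: str
    key: str  # object key without the leading "/" (a choice made in this sketch)


def parse_drive_url(url: str) -> DriveURL:
    """Split drive://bucket-name/path/to/object into bucket and key."""
    if not url.startswith(PREFIX):
        raise ValueError(f"expected a URL starting with {PREFIX!r}: {url!r}")
    # Everything up to the first "/" after the prefix is the bucket name.
    bucket, _, key = url[len(PREFIX):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {url!r}")
    return DriveURL(bucket, key)


print(parse_drive_url("drive://bucket/models/trained_model.pkl"))
# DriveURL(bucket='bucket', key='models/trained_model.pkl')
```

A key that is empty or ends in / corresponds to a directory or prefix rather than a single object.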
Common patterns
Share artifacts between jobs
jobs:
  train:
    steps:
      - name: Upload model
        upload:
          path: models/
          destination: drive://ml-artifacts/models/
  evaluate:
    depends_on: [train]
    steps:
      - name: Download model
        download:
          source: drive://ml-artifacts/models/
          path: ./models/
Download dataset and upload results
steps:
  - name: Download training data
    download:
      source: drive://datasets/features.parquet
      path: ./data/features.parquet
  - name: Train and upload metrics
    run: python train.py
  - name: Upload results
    upload:
      path: "*.json"
      destination: drive://results/
Glob patterns
Supported glob patterns:
| Pattern | Description | Example |
|---|---|---|
| * | Match any characters (except /) | *.json matches metrics.json, config.json |
| ** | Match any characters (including /) | models/**/*.pkl matches all .pkl files recursively |
| ? | Match single character | model?.pkl matches model1.pkl, modelA.pkl |
| [abc] | Match character set | checkpoint[123].pkl matches checkpoint1.pkl, checkpoint2.pkl |
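The semantics in the table can be approximated by translating a pattern into a regular expression. The sketch below is an illustration only, not the matcher H2O Drive uses. The function names `glob_to_regex` and `matches` are hypothetical. It treats [abc] as a literal character set (ranges and negation are out of scope because the table does not list them) and assumes ? does not match /. It lets **/ match zero or more directories, which is what the models/**/*.pkl upload example implies.

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a glob pattern into a regex following the table's semantics."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")   # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")         # any characters, including "/"
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")      # any characters except "/"
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")       # a single character (assumed not "/")
            i += 1
        elif pattern[i] == "[" and pattern.find("]", i + 1) != -1:
            end = pattern.find("]", i + 1)
            # Literal character set only; ranges and negation are not handled.
            out.append("[" + re.escape(pattern[i + 1:end]) + "]")
            i = end + 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def matches(pattern: str, path: str) -> bool:
    """True if the whole relative path matches the glob pattern."""
    return glob_to_regex(pattern).fullmatch(path) is not None


print(matches("*.json", "metrics.json"))                       # True
print(matches("*.json", "reports/metrics.json"))               # False
print(matches("models/**/*.pkl", "models/ensemble/voting.pkl"))  # True
```

Python's `fnmatch` module is deliberately not used here, because its * also matches /, which differs from the first row of the table.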