Collection lifecycle
Overview
Enterprise h2oGPTe provides automated collection lifecycle management to help administrators control data retention and storage usage. Collections follow a three-state lifecycle managed by automated background processes, with administrator controls for expiration policies, inactivity thresholds, and recovery.
Collection states
Collections move through three states (active, expiring, and archived) before permanent deletion:
| From | To | Trigger |
|---|---|---|
| active | expiring | The expiry date passes or the inactivity interval elapses. |
| expiring | archived | The expiration window (expiration_limit_days) has elapsed. |
| archived | deleted | The automatic cleanup process removes the collection data. |
| expiring or archived | active | An administrator recovers the collection. |
Automatic cleanup runs periodically as a background process. Once the cleanup process deletes a collection, you can't recover it.
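The transitions in the table can be read as a small state machine. The following is a minimal sketch for reasoning about them, not the product's implementation: the event names (`expired`, `window_elapsed`, `cleanup`, `recover`) and the `next_state` helper are hypothetical labels for the triggers listed above.

```python
# Hypothetical model of the documented transitions; event names are illustrative labels.
TRANSITIONS = {
    ("active", "expired"): "expiring",           # expiry date passed or inactivity interval elapsed
    ("expiring", "window_elapsed"): "archived",  # expiration_limit_days elapsed
    ("archived", "cleanup"): "deleted",          # automatic cleanup removed the data
    ("expiring", "recover"): "active",           # administrator recovery
    ("archived", "recover"): "active",           # administrator recovery
}


def next_state(state: str, event: str) -> str:
    """Return the state that follows `event`, or raise if no such transition exists."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None
```

In this model, `next_state("deleted", "recover")` raises `ValueError`, which mirrors the rule that a deleted collection can't be recovered.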
Lifecycle settings
The following settings control system-wide collection lifecycle behavior.
| Setting | Description |
|---|---|
| expiration_limit_days | Number of days before expiring collections are archived. Changing this setting triggers re-evaluation of all collection statuses. |
| default_collection_inactivity_days | Days of inactivity before a collection begins the expiration process. Set to -1 to turn off inactivity-based expiration. |
| default_collection_size_limit | Default maximum storage per collection (in bytes). Range: 1 MB to 10 GB. |
| enable_adhoc_collection_expiration | Turn on automatic expiration for ad-hoc (agent-created) collections. |
| adhoc_collection_expiration_days | Number of days before agent-created collections expire. Range: 1 to 30. |
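The documented ranges can be checked on the client side before a value is submitted. The sketch below is illustrative only: `check_setting` is a hypothetical helper, and it assumes decimal units (1 MB = 1,000,000 bytes), which the table does not state explicitly.

```python
MB = 1_000_000   # assumption: decimal megabytes
GB = 1_000 * MB  # assumption: decimal gigabytes


def check_setting(name: str, value: int) -> None:
    """Raise ValueError if `value` is outside the range documented for `name`."""
    if name == "default_collection_size_limit":
        if not 1 * MB <= value <= 10 * GB:
            raise ValueError("size limit must be between 1 MB and 10 GB (in bytes)")
    elif name == "adhoc_collection_expiration_days":
        if not 1 <= value <= 30:
            raise ValueError("ad-hoc expiration must be between 1 and 30 days")
    # Other settings have no documented range in the table, so they pass unchecked.
```

For example, `check_setting("adhoc_collection_expiration_days", 45)` raises `ValueError`, while a size limit of `500_000_000` passes.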
Configure lifecycle settings
# Set collection expiration window to 90 days
curl -X PUT "https://<YOUR_DOMAIN>/api/v1/configurations/expiration_limit_days" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"string_value": "90"}'
# Enable inactivity-based cleanup at 90 days
curl -X PUT "https://<YOUR_DOMAIN>/api/v1/configurations/default_collection_inactivity_days" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"string_value": "90"}'
# Set collection size limit to 500 MB
curl -X PUT "https://<YOUR_DOMAIN>/api/v1/configurations/default_collection_size_limit" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"string_value": "500000000"}'
# Enable automatic cleanup for agent-created collections
curl -X PUT "https://<YOUR_DOMAIN>/api/v1/configurations/enable_adhoc_collection_expiration" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"string_value": "true"}'
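The same request can be assembled from Python's standard library. This sketch mirrors the URL, headers, and body of the curl commands above; `build_config_request` is a hypothetical helper, `example.invalid` is a placeholder host, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_config_request(domain: str, api_key: str, name: str, value: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request for one configuration key."""
    return urllib.request.Request(
        url=f"https://{domain}/api/v1/configurations/{name}",
        data=json.dumps({"string_value": value}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )


# Placeholder host and key; substitute real values before sending.
req = build_config_request("example.invalid", "<API_KEY>", "expiration_limit_days", "90")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To send the request, pass it to `urllib.request.urlopen(req)`.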
Per-collection controls
Individual collections can have their own lifecycle settings that override system defaults.
Set collection expiry date
Set an explicit expiry date on a collection, overriding the system default:
curl -X PUT "https://<YOUR_DOMAIN>/api/v1/collections/{collection_id}/expiry_date" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"expiry_date": "2026-12-31"}'
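When the expiry date is computed rather than fixed, the request body can be generated from a start date and a number of days. This is a minimal sketch that assumes the endpoint expects the `YYYY-MM-DD` format shown in the curl example; `expiry_payload` is a hypothetical helper.

```python
import json
from datetime import date, timedelta


def expiry_payload(start: date, days: int) -> str:
    """Return the JSON body for an expiry date `days` after `start`."""
    expiry = start + timedelta(days=days)
    return json.dumps({"expiry_date": expiry.isoformat()})


# 30 days after 1 December 2026 is 31 December 2026.
body = expiry_payload(date(2026, 12, 1), 30)
```

The resulting string can be passed as the `-d` argument of the request above.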
Set collection inactivity interval
Override the system-wide inactivity threshold for a specific collection:
curl -X PUT "https://<YOUR_DOMAIN>/api/v1/collections/{collection_id}/inactivity_interval" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"inactivity_interval": 30}'
Set collection size limit
Set a maximum storage limit for a specific collection (in bytes):
curl -X PUT "https://<YOUR_DOMAIN>/api/v1/collections/{collection_id}/size_limit" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"size_limit": "500000000"}'
Remove collection size limit
Remove the size limit from a collection, reverting to the system default:
curl -X DELETE "https://<YOUR_DOMAIN>/api/v1/collections/{collection_id}/size_limit" \
  -H "Authorization: Bearer <API_KEY>"
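The override behavior described in this section (a per-collection value takes precedence, and removing it falls back to the system default) can be modeled in a few lines. This is an illustrative sketch rather than the server's logic; `effective_size_limit` is a hypothetical name, and `None` stands in for "no per-collection limit set".

```python
from typing import Optional


def effective_size_limit(collection_limit: Optional[int], system_default: int) -> int:
    """Return the per-collection limit if one is set, otherwise the system default."""
    return system_default if collection_limit is None else collection_limit
```

With a system default of 500,000,000 bytes, a collection limited to 100,000,000 bytes resolves to 100,000,000; after its limit is removed, it resolves to 500,000,000 again.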
Recover a collection
Administrators can recover collections from the expiring or archived state, restoring them to active. Use this when the system expired a collection unintentionally or when you still need the data.
You can only recover collections before the automatic cleanup process permanently deletes them. Once the process deletes a collection, you can't recover it.
Recover with the REST API
Restore a collection to active status by providing its collection ID:
curl -X POST "https://<YOUR_DOMAIN>/api/v1/collections/{collection_id}/unarchive" \
  -H "Authorization: Bearer <API_KEY>"
Recover with the Python SDK
Use the Python SDK to recover a collection programmatically:
from h2ogpte import H2OGPTE
admin = H2OGPTE(address="https://<YOUR_DOMAIN>", api_key="<API_KEY>")
# Recover an expiring or archived collection
admin.unarchive_collection(collection_id="<COLLECTION_ID>")
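To recover several collections in one pass, the SDK call shown above can be wrapped in a loop that records which IDs failed. This is a sketch under assumptions: `recover_collections` is a hypothetical helper, and because the exception type the SDK raises for a missing or deleted collection isn't documented here, the code catches `Exception` broadly.

```python
from typing import Iterable


def recover_collections(client, collection_ids: Iterable[str]) -> dict:
    """Call unarchive_collection for each ID and report which calls succeeded or failed."""
    result = {"recovered": [], "failed": {}}
    for collection_id in collection_ids:
        try:
            client.unarchive_collection(collection_id=collection_id)
            result["recovered"].append(collection_id)
        except Exception as exc:  # assumption: any error means this ID was not recovered
            result["failed"][collection_id] = str(exc)
    return result
```

Pass the `admin` client from the example above as `client`. IDs that the cleanup process has already deleted would be expected to appear under `failed`.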
Archive a collection
Administrators can manually archive a collection, bypassing the automatic expiration process.
Archive with the REST API
curl -X POST "https://<YOUR_DOMAIN>/api/v1/collections/{collection_id}/archive" \
  -H "Authorization: Bearer <API_KEY>"
Archive with the Python SDK
from h2ogpte import H2OGPTE
admin = H2OGPTE(address="https://<YOUR_DOMAIN>", api_key="<API_KEY>")
admin.archive_collection(collection_id="<COLLECTION_ID>")
Manage collections as an administrator
Administrators can list and manage all collections in the system regardless of ownership.
List all collections
Retrieve all collections with pagination:
curl -s "https://<YOUR_DOMAIN>/api/v1/collections/all?offset=0&limit=100" \
  -H "Authorization: Bearer <API_KEY>"
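Walking through every page means advancing `offset` by `limit` until the results run out. The sketch below shows that loop in isolation. It assumes each page is a list of collection records and that a short or empty page marks the end; `fetch_page` is a hypothetical callable that would wrap the GET request above.

```python
from typing import Callable, Iterator, List


def iter_all_collections(
    fetch_page: Callable[[int, int], List[dict]], limit: int = 100
) -> Iterator[dict]:
    """Yield every collection by advancing `offset` until a page comes back short."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:  # assumption: a short or empty page marks the end
            break
        offset += limit
```

Keeping the HTTP call behind `fetch_page` lets the paging logic be tested with an in-memory list before it is pointed at a live deployment.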
Delete a collection
Permanently remove a collection and its associated data:
curl -X DELETE "https://<YOUR_DOMAIN>/api/v1/collections/{collection_id}" \
  -H "Authorization: Bearer <API_KEY>"
Configure lifecycle with the Python SDK
The following example configures system-wide lifecycle settings using the Python SDK:
from h2ogpte import H2OGPTE
admin = H2OGPTE(address="https://<YOUR_DOMAIN>", api_key="<API_KEY>")
# Set collection expiration window
admin.set_global_configuration(
    "expiration_limit_days", "90", can_overwrite=False, is_public=True
)
# Enable inactivity-based cleanup at 90 days
admin.set_global_configuration(
    "default_collection_inactivity_days", "90", can_overwrite=False, is_public=True
)
# Set collection size limit (500 MB)
admin.set_global_configuration(
    "default_collection_size_limit", "500000000", can_overwrite=False, is_public=True
)
# Enable automatic cleanup for agent-created collections
admin.set_global_configuration(
    "enable_adhoc_collection_expiration", "true", can_overwrite=False, is_public=True
)
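Because the four calls differ only in key and value, they can also be driven from a dictionary. This is one possible arrangement, not a recommended pattern from the SDK: `LIFECYCLE_SETTINGS` and `apply_settings` are hypothetical names, and the flags are copied from the example above.

```python
LIFECYCLE_SETTINGS = {
    "expiration_limit_days": "90",
    "default_collection_inactivity_days": "90",
    "default_collection_size_limit": "500000000",
    "enable_adhoc_collection_expiration": "true",
}


def apply_settings(client, settings: dict) -> None:
    """Apply each key/value pair with the same flags used in the example above."""
    for name, value in settings.items():
        client.set_global_configuration(
            name, value, can_overwrite=False, is_public=True
        )
```

Calling `apply_settings(admin, LIFECYCLE_SETTINGS)` issues the same four calls as the example above, in dictionary order.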
Audit and telemetry events
The system tracks collection operations through two mechanisms: the audit trail for security-relevant operations and telemetry for lifecycle monitoring.
Audit trail events
The following operations produce formal audit trail records that administrators can query:
| Event | Trigger |
|---|---|
| create_collection | A user creates a new collection. |
| update_collection | A user updates collection metadata. |
| update_collection_settings | A user updates collection settings. |
| make_collection_public | A user makes a collection accessible to all users. |
| make_collection_private | A user removes public access from a collection. |
| share_collection | A user shares a collection with another user. |
| share_collection_with_group | A user shares a collection with a group. |
| unshare_collection | A user removes another user's access to a collection. |
| unshare_collection_from_group | A user removes a group's access to a collection. |
| unshare_collection_for_all | A user removes all shared access from a collection. |
Telemetry events
The following lifecycle operations emit telemetry events for monitoring but do not produce audit trail records:
| Event | Trigger |
|---|---|
| CollectionArchived | A collection transitions to the archived state. |
| CollectionRecovered | An administrator recovers a collection to active status. |
| ArchivedCollectionDeleted | The cleanup process permanently deletes an archived collection. |
Related topics
- System Settings - Configure global lifecycle settings
- Manage Collections - System Dashboard collection management
- Rate Limits and Fairness - Collection and document limits
- Delete a Collection - Delete collections through the UI
- User Data Deletion - Delete all user data including collections