Platform Services

This page lists the platform services used in H2O AI Hybrid Cloud.

Apache Druid

Apache Druid is a high-performance analytics database for analyzing large datasets in real time. It works best with event-oriented data. In H2O AI Hybrid Cloud, Druid serves as the Telemetry backend in place of PostgreSQL.

Kafka

Kafka is an open-source distributed event streaming platform. H2O AI Hybrid Cloud uses it for telemetry and event streaming: H2O Document AI sends scoring requests through Kafka, and Kafka also passes messages between the Telemetry Server and the Apache Druid component.

AuthZ

H2O AuthZ (also known as "Workspaces") is a central workspace and access control service for H2O AI Cloud. It can be used as a mechanism for resource-sharing and collaboration within the H2O AI Cloud platform.

About workspaces
  • Personal Workspaces: Each user in the platform starts with access to their own private workspace. Users cannot add other users to their personal workspaces. A personal workspace can be considered the catch-all for general, unorganized work.

  • Roles in Workspaces: Within a specific workspace, each user has permissions that allow them to perform different actions in that workspace. A workspace inherits any platform roles that the user is assigned to, which may restrict certain actions (e.g., if a user's platform role does not allow them to deploy models, the user cannot deploy models in a particular workspace even if the workspace admin wants to grant them this permission).

  • Authorization is Workspace-related: There is no resource-specific authorization. For example, if a user can view models in a workspace, they can view all models in that workspace; and if a user can edit feature sets in the workspace, they can edit all of its feature sets.
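
The rules above can be sketched in a few lines of Python. This is only an illustrative model, not the H2O AuthZ API: the User and Workspace classes, the is_allowed function, and the action names such as "deploy_models" are all hypothetical. It assumes an action is permitted only when both the workspace grant and the user's platform roles allow it, and that the check never looks at an individual resource.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    """Hypothetical user; platform_actions is what their platform roles allow."""
    name: str
    platform_actions: frozenset


@dataclass
class Workspace:
    """Hypothetical workspace holding per-user action grants."""
    name: str
    grants: dict = field(default_factory=dict)  # user name -> set of actions

    def grant(self, user: User, actions: set) -> None:
        # A workspace admin grants actions to a user in this workspace.
        self.grants.setdefault(user.name, set()).update(actions)


def is_allowed(user: User, workspace: Workspace, action: str) -> bool:
    """Allow an action only if the workspace AND the platform role permit it.

    There is deliberately no resource argument: the decision applies to
    every resource of that kind in the workspace.
    """
    workspace_ok = action in workspace.grants.get(user.name, set())
    platform_ok = action in user.platform_actions
    return workspace_ok and platform_ok


# Assumed example data: the platform role does not include "deploy_models".
alice = User("alice", frozenset({"view_models", "edit_feature_sets"}))
team = Workspace("fraud-team")
team.grant(alice, {"view_models", "deploy_models"})

can_view = is_allowed(alice, team, "view_models")        # granted by both
can_deploy = is_allowed(alice, team, "deploy_models")    # blocked by platform role
can_edit = is_allowed(alice, team, "edit_feature_sets")  # not granted in workspace
```

In this sketch, the workspace admin's grant of "deploy_models" has no effect because the platform role does not allow it, which mirrors the deployment example in the list above.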

H2O Drive

H2O Drive is used as your personal object storage for H2O AI Hybrid Cloud. You can use H2O Drive to store your datasets and import them seamlessly to AI engines such as H2O Driverless AI or H2O-3 when building models.

