This section provides an overview of each algorithm available in H2O. For detailed information about the parameters that can be used for building models, refer to Appendix A - Parameters.
In supervised learning, the dataset is labeled with the answer that the algorithm should come up with. Supervised learning takes input variables (x) along with an output variable (y). The output variable represents the column that you want to predict. The algorithm then uses these variables to learn and approximate the mapping function from the input to the output. Supervised learning algorithms support classification and regression problems.
H2O supports the following supervised algorithms:
- H2O AutoML: Automatic Machine Learning
- Cox Proportional Hazards (CoxPH)
- Deep Learning (Neural Networks)
- Distributed Random Forest (DRF)
- Generalized Linear Model (GLM)
- Isotonic Regression
- Generalized Additive Models (GAM)
- ANOVA GLM
- Gradient Boosting Machine (GBM)
- Naïve Bayes Classifier
- Decision Tree
- Stacked Ensembles
- Support Vector Machine (SVM)
- Distributed Uplift Random Forest (Uplift DRF)
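To make the idea of learning a mapping from input variables (x) to an output variable (y) concrete, the following is a minimal sketch of a regression problem in plain Python. It does not use H2O's API and is not how any H2O algorithm is implemented; the function names `fit_line` and `predict` and the toy data are assumptions made purely for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y ~ slope * x + intercept (illustrative only)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Spread of the inputs, and how inputs and outputs vary together.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply the learned mapping to a new input value."""
    slope, intercept = model
    return slope * x + intercept


# Labeled training data: every input x comes with its known answer y.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

model = fit_line(xs, ys)
prediction = predict(model, 5.0)  # an input that was not in the training data
```

Here the labels in `ys` play the role of the output column, and the fitted slope and intercept are the learned approximation of the mapping. A classification problem follows the same pattern, except that the output is a category rather than a number.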
In unsupervised learning, the model is provided with a dataset that isn’t labeled, i.e., one without an explicit outcome that the algorithm should return. In this case, the algorithm attempts to find patterns and structure in the data by extracting useful features. The model organizes the data in different ways, depending on the algorithm (clustering, anomaly detection, autoencoders, etc.).
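As an illustration of finding structure without labels, the following is a minimal sketch of clustering on one-dimensional data in plain Python. It is a simplified k-means-style loop written for this example, not H2O's implementation; the function name `kmeans_1d`, the starting centers, and the toy data are all assumptions.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group 1-D points around the given starting centers (illustrative only)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its points.
        # A center with no points keeps its previous position.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Unlabeled data: there is no output column, only the observations.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
```

No answer is supplied for any point; the two groups emerge only from how close the values are to one another.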
H2O supports the following unsupervised algorithms: