Concepts
Encoders
One-hot encoder
One-hot encoding is a process that converts a categorical variable into a set of new binary columns, one per category. Each row has a value of 1 in the column that matches its category and 0 in all other columns.
Before one-hot encode > After one-hot encode
Color | > | Yellow | Green | Red |
---|---|---|---|---|
Yellow | > | 1 | 0 | 0 |
Green | > | 0 | 1 | 0 |
Red | > | 0 | 0 | 1 |
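The table above can be reproduced with a minimal Python sketch. The function name `one_hot_encode` is hypothetical and is not part of any library. The sketch assumes that the new columns follow the order in which categories first appear.

```python
def one_hot_encode(values):
    """Return (categories, rows), where each row is a list of 0/1 flags."""
    # Assumption: column order follows the first appearance of each category.
    categories = list(dict.fromkeys(values))
    # Each row holds 1 in the column of its own category and 0 elsewhere.
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


categories, rows = one_hot_encode(["Yellow", "Green", "Red"])
print(categories)
for row in rows:
    print(row)
```

Each row contains exactly one 1, so the original category can always be recovered from the position of that 1.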
Label encoder
Label encoding converts the labels in a column into numeric form so that they are machine-readable. The label encoder can normalize labels. It can also transform non-numerical labels into numerical labels, as long as the non-numerical labels are hashable and comparable.
Before label encoder > After label encoder
Color | > | Color |
---|---|---|
Yellow | > | 1 |
Green | > | 2 |
Red | > | 3 |
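A minimal Python sketch of the mapping in the table above follows. The function `label_encode` and its `start` parameter are hypothetical. To match the table, the sketch assumes that codes are assigned in order of first appearance, starting from 1. Other encoders may number labels differently; for example, scikit-learn's `LabelEncoder` sorts the labels and starts from 0.

```python
def label_encode(values, start=1):
    """Return (mapping, codes) for a list of hashable labels."""
    mapping = {}
    for value in values:
        # Assumption: a label gets the next free code the first time it appears.
        if value not in mapping:
            mapping[value] = start + len(mapping)
    codes = [mapping[value] for value in values]
    return mapping, codes


mapping, codes = label_encode(["Yellow", "Green", "Red", "Green"])
print(mapping)
print(codes)
```

Keeping the mapping alongside the codes makes it possible to translate numeric predictions back into the original labels.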
Run-length encoder
Run-length encoding (RLE) is a form of data compression that replaces each run of identical consecutive values with a code giving the value and the number of times it repeats. RLE is lossless: decoding recovers all of the original data. For example: FFFQQQC -> 3F3Q1C.
- For more information, see Run-Length Encoding (RLE).
- To learn how to decode RLE, see Run Length Decoding - Quick Start.
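The FFFQQQC example can be sketched in Python as shown below. The functions `rle_encode` and `rle_decode` are hypothetical names. The sketch assumes the count-then-value format used in the example and assumes that the input text contains no digit characters; it is not necessarily the RLE format that a given dataset or tool uses.

```python
from itertools import groupby


def rle_encode(text):
    """Encode each run of identical characters as <count><character>."""
    # groupby yields one group per run of consecutive equal characters.
    return "".join(f"{len(list(group))}{value}" for value, group in groupby(text))


def rle_decode(encoded):
    """Invert rle_encode; assumes the original text contained no digits."""
    decoded = []
    count = ""
    for char in encoded:
        if char.isdigit():
            # Collect the digits of a count, which may be longer than one digit.
            count += char
        else:
            decoded.append(char * int(count))
            count = ""
    return "".join(decoded)


encoded = rle_encode("FFFQQQC")
print(encoded)
print(rle_decode(encoded))
```

Decoding the encoded string returns the original input, which illustrates the lossless property.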
Classification tasks
Supported classification tasks are as follows: binary, multi-class, and multi-label.
To learn which problem types support one, two, or all of the supported classification tasks, see Supported problem types.
Binary
Binary classification refers to a task that has two class labels. A single class label is predicted for each example in a binary classification task. In other words, the label is a single column with 0/1 values.
Multi-class
Multi-class classification refers to a task that has more than two class labels. A single class label is predicted for each example in a multi-class classification task. In other words, the labels are multiple columns where exactly one column is 1 for each example.
Multi-label
Multi-label classification refers to a task with two or more class labels, where you may predict one or more class labels for each example. In other words, the labels are multiple columns where any column can be 0 or 1.
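The three label layouts can be contrasted with a small Python sketch. The function `describe_task` is hypothetical and is only a heuristic for illustration; it does not represent how H2O Hydrogen Torch determines the task. In particular, a multi-label dataset in which every row happens to contain exactly one 1 would be reported as multi-class.

```python
def describe_task(label_rows):
    """Guess the task type from rows of 0/1 label columns (heuristic)."""
    n_columns = len(label_rows[0])
    if n_columns == 1:
        # A single 0/1 column.
        return "binary"
    if all(sum(row) == 1 for row in label_rows):
        # Several columns, exactly one of which is 1 in every row.
        return "multi-class"
    # Several columns, any of which can be 0 or 1.
    return "multi-label"


binary_labels = [[0], [1], [1]]
multi_class_labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
multi_label_labels = [[1, 1, 0], [0, 0, 0], [0, 1, 1]]

print(describe_task(binary_labels))
print(describe_task(multi_class_labels))
print(describe_task(multi_label_labels))
```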