Pivot tables¶
Use this function to pivot tables. This is performed by designating three columns: index, column, and value.
Index is the column where pivoted rows should be aligned on.
Column represents the column to pivot.
Value specifies the values of the pivoted table.
For cases with multiple indexes for a column label, the aggregation method is to pick the first occurrence in the data frame.
Note
All rows of a single index value must fit on one node.
The maximum rows for a single index value and column label is
Chunk size * Chunk size
.
import h2o
h2o.init()
# Create a simple data frame by inputting values
df = h2o.H2OFrame({'colorID': ['1','2','3','3','1','4'],
'value': ['red','orange','yellow','yellow','red','blue'],
'amount': ['4','2','4','3','6','3']})
# View the dataset
df
colorID amount value
--------- -------- -------
1 4 red
2 2 orange
3 4 yellow
3 3 yellow
1 6 red
4 3 blue
[6 rows x 3 columns]
# Pivot the table on the colorID column and aligned on the amount column
df2 = df.pivot(index="amount",column="colorID",value="value")
df2
amount 1 2 3 4
-------- --- --- --- ---
2 nan 1 nan nan
3 nan nan 3 0
4 2 nan 3 nan
6 2 nan nan nan
[4 rows x 5 columns]
library(h2o)
h2o.init()
# Create a simple data frame by inputting values
data <- data.frame(colorID = c('1', '2', '3', '3', '1', '4'),
value = c('red', 'orange', 'yellow', 'yellow', 'red', 'blue'),
amount = c('4', '2', '4', '3', '6', '3'))
df <- as.h2o(data)
# View the dataset
df
colorID value amount
1 1 red 4
2 2 orange 2
3 3 yellow 4
4 3 yellow 3
5 1 red 6
6 4 blue 3
[6 rows x 3 columns]
# Pivot the table on the colorID column and aligned on the amount column
df2 <- h2o.pivot(df, index = "amount", column = "colorID", value = "value")
df2
amount 1 2 3 4
1 2 NaN 1 NaN NaN
2 3 NaN NaN 3 0
3 4 2 NaN 3 NaN
4 6 2 NaN NaN NaN
[4 rows x 5 columns]